
International Symposium on Machine Translation and
Computer Language Information Processing, 26-28 June 1999, Beijing, China

The development and use of machine translation systems and
computer-based translation tools

John Hutchins
University of East Anglia, Norwich NR4 7TJ, England


This survey of the present demand for and use of computer-based translation software concentrates on systems designed for the production of translations of publishable quality, including developments in controlled language systems, translator workstations, and localisation. It also covers the development of software for non-translators, in particular for use with Web pages and other Internet applications, and it looks at future needs and systems under development. The final section compares the types of translation demand that can be met most appropriately by human translation and by machine (and computer-aided) translation respectively.

Keywords: machine translation, computer-aided translation, translator workstations, multilingual systems


Types of translation demand

When giving any general overview of the development and use of machine translation (MT) systems and translation tools, it is important to distinguish four basic types of translation demand. The first, and the traditional one, is the demand for translations of a quality normally expected from human translators, i.e. translations of publishable quality, whether actually printed and sold or distributed internally within a company or organisation. The second basic demand is for translations at a somewhat lower level of quality (particularly in style), which are intended for users who want to find out the essential content of a particular document, generally as quickly as possible. The third type of demand is for translation between participants in one-to-one communication (telephone or written correspondence) or of an unscripted presentation (e.g. diplomatic exchanges). The fourth area of application is translation within multilingual systems of information retrieval, information extraction, database access, etc.

The first type of demand illustrates the use of MT for dissemination. It has been satisfied, to some extent, by machine translation systems ever since they were first developed in the 1960s. However, MT systems produce output which must invariably be revised or ‘post-edited’ by human translators if it is to reach the quality required. Sometimes such revision may be substantial, so that in effect the MT system is producing a ‘draft’ translation. As an alternative, the input text may be regularised (or ‘controlled’ in vocabulary and sentence structure) so that the MT system produces fewer errors that have to be corrected. Some MT systems have, however, been developed to deal with a very narrow range of text content and language style, and these may require little or no preparation or revision of texts.

In recent years, the use of MT systems for dissemination purposes has been augmented by developments in translation tools (e.g. terminology databases and translation memories), integrated in authoring and publishing processes. These ‘translation workstations’ are more attractive to human translators. With MT systems, translators see themselves as subordinate to the machine, in so far as they edit, correct or re-translate the output from a computer. With translation workstations (or workbenches), by contrast, the translators are in control of computer-based facilities, which they can accept or reject as they wish.

The second type of demand – the use of MT for assimilation – has been met in the past as, in effect, a by-product of systems designed originally for the dissemination application. Although MT systems did not (and still cannot) produce high-quality translations, some users have found that they can extract what they need to know from the unedited output. They would rather have some translation, however poor, than no translation at all. With the coming of cheaper PC-based systems on the market, this type of use has grown rapidly and substantially.

With the third type – MT for interchange – the situation is changing quickly. The demand for translations of electronic texts on the Internet, such as Web pages, electronic mail and even electronic ‘chat’ lists, is developing rapidly. In this context, human translation is out of the question. The need is for immediate translation in order to convey the basic content of messages, however poor the input. MT systems are finding a ‘natural’ role here, since they can operate on-line and in real time (or very nearly so), and there has been little objection to the inevitably poor quality. Another context for MT in personal interchange is the focus of much research: the development of systems for spoken language translation, e.g. in telephone conversations and in business negotiations. The problems of integrating speech recognition and automatic translation are obviously formidable, but progress is nevertheless being made. In the future – still distant, perhaps – we may expect on-line MT systems for the translation of speech in highly restricted domains.

The fourth type of MT application – as components of information access systems – is the integration of translation software into: (i) systems for the search and retrieval of full texts of documents from databases (generally electronic versions of journal articles in science, medicine and technology), or for the retrieval of bibliographic information; (ii) systems for extracting information (e.g. product details) from texts, in particular from newspaper reports; (iii) systems for summarising texts; and (iv) systems for interrogating non-textual databases. This field is currently the focus of a number of projects in Europe, which aim to widen access for all members of the European Union to sources of data and information, whatever the source language.