A. Original Questions
Here are the original questions for this panel as submitted to
the speakers:
1. At
the last MT Summit, Martin Kay stated that there should be "greater
attention to empirical studies of translation so that computational
linguists will have a better idea of what really goes on in
translation and develop tools that will be more useful for the end
user." Does this mean that there has been insufficient input into MT
processes by translators interested in MT? Does it mean that MT
developers have failed to study what translating actually entails
and how translators go about their task? If either of these is true,
then to what extent and why? New answers and insights for the MT
profession could arise from hearing what human translators with an
interest in the development of MT have to say about these matters.
It may well turn out that translators are the very people best
qualified to determine what form their tools should take, since they
are the end users.
2. Is
there a specifically "human" component in the translation process
which MT experts have overlooked? Is it reasonable for theoreticians
to envision setting up predictable and generic vocabularies of
clearly defined terms, or could they be overlooking a deep-seated
human tendency towards some degree of ambiguity (indeed, in those
many cases where not all the facts are known, an inescapably human
reliance on it)? Are there any viable MT approaches to duplicate what
human translators can provide in such cases, namely the ability to
bridge this ambiguity gap and improvise personalized, customized
case-specific subtleties of vocabulary, depending on client or
purpose? Could this in fact be a major element of the entire
translation process? Alternately, are there some more boring
"machine-like" aspects of translation where the computer can help
the translator, such as style and consistency
checking?
3. How
can the knowledge of practicing translators best be integrated into
current MT research and working systems? Is it to be assumed that
they are best employed as prospective end-users working out the bugs
in the system, or is there also a place for them during the initial
planning phases of such systems? Can they perhaps as users be the
primary developers of the system?
4. Many
human translators, when told of the quest to have machines take over
all aspects of translation, immediately reply that this is
impossible and start providing specific instances which they claim a
machine system could never handle. Are such reactions merely the
final nerve spasms of a doomed class of technicians awaiting
superannuation, or are these translators in fact enunciating
specific instances of a general law as yet not fully
articulated?
Since we
now hear claims suggesting that FAHQT is creeping in again through
the back door, it seems important to ask whether there has in fact
ever been sufficient basic mathematical research, much less
algorithmic underpinnings, by the MT Community to determine whether
FAHQT, or anything close to it, can be achieved by any combination
of electronic stratagems (transfer, AI, neural nets, Markov models,
etc.).
Must
translators forever stand exposed on the firing line and present
their minds and bodies to a broadside of claims that the next round
of computer advances will annihilate them as a profession? Is this
problem truly solvable in logical terms, or is it in fact an
intractable, undecidable, or provably unsolvable question in terms
of "Computable Numbers" as set out by Turing, based on the work of
Hilbert and Goedel? A reasonable answer to this question could save
boards of directors and/or government agencies a great deal of time
and money.
B. Supplemental Questions
It was also envisioned that a list of
Supplemental Questions would be prepared and distributed not only to
the speakers but everyone attending our panel, even though not all
of these questions could be raised during the session, so as to
deepen our discussion and provide a lasting record of these
issues.
FAHQT: Pro and Con
Consider the following observation on
FAHQT: "The ideal notion of fully automatic high quality translation
(FAHQT) is still lurking behind the machine translation paradigm: it
is something that MT projects want to reach." (1) Is this a true or
a false observation?
Is FAHQT merely a matter of time and
continued research, a direct and inevitable result of a perfectly
asymptotic process?
Will FAHQT ever be available on a hand-held, calculator-sized
computer? If not, then why not?
To what extent is the belief in the feasibility of FAHQT a form
of religion or perhaps akin to a belief that a perpetual motion
device can be invented?
Technical Linguistic Questions
Let us suppose a writer has chosen to use Word C in a source text
because s/he did not wish to use Word A or Word B, even though all
three are shown as "synonyms." It turns out that all three of these
words overlap and semantically interrelate quite differently in the
target language. How can MT handle such an instance, fairly
frequently found in legal and diplomatic usage?
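One way to picture the difficulty is as a many-to-many mapping between source and target vocabularies. The following sketch (in Python, purely for concreteness; the words and their renderings are invented for illustration and come from no actual MT lexicon) shows why choosing the commonest rendering for each word in isolation destroys exactly the distinction the writer intended:

    # Hypothetical illustration: three source-language "synonyms" whose
    # acceptable target-language renderings overlap only partially.
    source_synonyms = {
        "word_A": {"accord", "agreement"},
        "word_B": {"agreement", "understanding"},
        "word_C": {"understanding", "arrangement"},
    }

    def safe_renderings(chosen, rejected):
        """Target words that convey the chosen term without also
        suggesting the terms the writer deliberately avoided."""
        avoided = set().union(*(source_synonyms[r] for r in rejected))
        return source_synonyms[chosen] - avoided

    # The writer chose word_C precisely because word_A and word_B would
    # not do; only one rendering preserves that contrast.
    print(safe_renderings("word_C", ["word_A", "word_B"]))   # {'arrangement'}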
Virtually all research in both conventional and computational
linguistics has proceeded from the premise that language can be
represented and mapped as a linear entity and is therefore eminently
computable. What if it turns out that language in fact occupies a
virtual space as a multi-dimensional construct, including several
fractal dimensions, involving all manner of non-linear turbulence,
chaos, and Butterfly Effects?
Post-Editors and Puppeteers
Let's assume you saw an ad for an Automatic Electronic Puppeteer
that guaranteed to create and produce endless puppet plays in your
own living room. There would be no need for a puppeteer to run the
puppets and no need for you even to script the plays, though you
would have the freedom to intervene in the action and change the
plot as you wished. Since the price was acceptable, you ordered this
system, but when it arrived, you found that it required endless
installation work and calls to the manufacturers to get it working.
But even then, you discovered that the number of plays provided was
in fact quite limited, your plot change options even more so, and
that the movements of the puppets were jerky and unnatural. When you
complained, you were referred to fine print in the docs telling you
that to make the program work better, you would have to do one of
two things: 1) master an extremely complex programming language or
2) hire a specially trained puppeteer to help you out with your
special needs and to be on hand during your productions to make the
puppets move more naturally. Does this description bear any
resemblance to the way MT has functioned and been promoted in recent
years?
A Practical Example
Despite many presentations on linguistic, electronic and
philosophical aspects of MT at this conference, one side of
translation has nonetheless gone unexplored. It has to do with how
larger translation projects actually arise and are handled by the
profession. The following story shows the world of human translation
at close to its worst, and it might be imagined at first glance that
MT could easily do a much better job and simply take over in such
situations, which are far from atypical in the world of translation.
But, as we shall see, such appearances may be deceptive. To our
story:
A French electrical firm was recently involved in a hostile
takeover bid and lawsuit with its American counterpart. Large
numbers of boxes and drawers full of documents all had to be
translated into English by an almost impossible deadline.
Supervision of this work was entrusted to a paralegal assistant in
the French company's New York law firm. This person had no previous
knowledge of translation. The documents ran the gamut: highly
technical electrical texts and patents, records of previous
lawsuits, company correspondence, advertisements, product
documentation, speeches by the Company's directors, etc.
Almost every French-to-English translator in the NYC area was
asked to take part. All translators were required to work at the law
firm's offices so as to preserve confidentiality. Mere translation
students worked side by side with newly accredited professionals and
journeymen with long years of experience. The more able quickly
became aware that much of the material was far too difficult for
their less experienced colleagues. No consistent attempt was made to
create or distribute glossaries. Wildly differing wages were paid to
translators, with little connection to their ability. Several
translation agencies were caught up in a feverish battle to handle
most of the work and desperately competed to find translators.
No one knows the quality of the final product, but it cannot have
been routinely high. Some translators and agencies have still not
been fully paid. As the deadline drew closer, more and more boxes of
documents appeared. And as the final blow, the opposing company's
law firm also came onto the scene with boxes of its own documents
that needed translation. But these newcomers imposed one nearly
impossible condition, also for reasons of confidentiality: no one
who had translated for the first law firm would be permitted to
translate for them.
Now let us consider this true-life tale, which occurred just
three months ago, and see how, or whether, MT could have handled
things better, as is sometimes claimed. Let's be generous and remove
one enormous obstacle at the start by assuming that all these cases
of documents were in fact in machine-readable form (which, of
course, they weren't). Even if we accord MT this ample handicap,
there are still a number of problems it would have had trouble
coping with:
1. How could a sufficient number of competent post-editors be
found or trained before the deadline?
2. How could a sufficiently large and accurate MT dictionary be
compiled before the deadline? Doesn't creating such a dictionary
require finishing the job first and then saving it for the next job,
in the hope that it will be similar?
3. The simpler "Mom & Pop" store and small-agency structure
of the human translation world was nonetheless able to field at
least some response to this challenge because of its large slack
capacity. Would an enormously powerful and expensive mainframe
computer have the same slack capacity, i.e., could it be kept
inactive for long periods of time until such emergencies occurred?
If so, how would this be reflected in the prices charged for its
services?
4. How would MT companies have dealt with the secrecy
requirement, that translation must be done in the law firm's
office?
5. How would an MT Company comply with the demand of the second
law firm, that the same post-editors not be used, and still land the
job?
6. Supposing the job proved so enormous that two MT firms had to
be hired (assuming they used different systems, different glossaries,
and different post-editors), how could they have collaborated without
creating even more work and confusion?
Larger Philosophical Questions
Is it in any final sense a reasonable assumption, as many
believe, that progress in MT can be gradual and cumulative in scope
until it finally comes to a complete mastery of the problem? In
other words, is there a numerical process by which one first masters
3% of all knowledge and vocabulary building processes with 85%
accuracy, then 5% with 90% accuracy, and so on until one reaches 99%
with 99% accuracy? Is this the whole story of the relationship
between knowledge and language, or are there possibly other factors
involved, making it possible for reality to manifest itself from
several unexpected angles at once? In other words, are we dealing
with language as a linear entity when it is in fact a
multi-dimensional one?
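To make the arithmetic of this picture explicit (a minimal sketch only; the figures are the hypothetical ones from the question above, plus one invented intermediate stage), note that even the final stage leaves a residue of error:

    # Minimal sketch of the "gradual and cumulative" picture described
    # above.  Coverage is the share of knowledge and vocabulary handled;
    # accuracy is how often that share is handled correctly.  The middle
    # stage is an invented interpolation.
    stages = [(0.03, 0.85), (0.05, 0.90), (0.50, 0.95), (0.99, 0.99)]
    for coverage, accuracy in stages:
        correct = coverage * accuracy
        print("coverage {:.0%}, accuracy {:.0%} -> {:.1%} handled correctly"
              .format(coverage, accuracy, correct))
    # Even 99% coverage at 99% accuracy leaves roughly 2% of the material
    # missed or mistranslated, assuming language behaves linearly enough
    # for such a curve to exist at all, which is precisely the question.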
Einstein maintained that he didn't believe God was playing dice
with the universe. Is it possible that by using AI rule-firing
techniques with their built-in certainty and confidence values,
computational linguists are playing dice with the meaning of that
universe?
It would be possible to design a set of "Turing Tests" to gauge
the performance of various MT systems as compared with human
translation skills. The point of such a process, as with all Turing
Tests, would be to determine if human referees could tell the
difference between human and machine output. All necessary
safeguards, handicaps, alternate referees, and double blind
procedures could be devised, provided the will to take part in such
tests actually existed. True definitions for cost, speed, accuracy,
and post-editing needs might all have at least a chance of being
estimated as a result of such tests. What are the chances of their
taking place some time in the near future?
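For concreteness, here is a minimal sketch (in Python; the names and the scoring rule are my own assumptions, not an established test protocol) of the core of such a blind comparison, in which a referee sees translations in random order with no indication of origin:

    import random

    def blind_trial(human_versions, machine_versions, referee_guess):
        """Referee sees each text without its label and guesses its origin.
        Returns the fraction of correct guesses; a score near 0.5 means
        the referee cannot tell human from machine output."""
        samples = ([(t, "human") for t in human_versions] +
                   [(t, "machine") for t in machine_versions])
        random.shuffle(samples)                  # labels are never shown
        hits = sum(referee_guess(text) == origin for text, origin in samples)
        return hits / len(samples)

    # Example with a referee who guesses at random; a real test would use
    # qualified human referees, many texts, and double-blind procedures.
    coin_flip = lambda text: random.choice(["human", "machine"])
    print(blind_trial(["sample human rendering"],
                      ["sample machine rendering"], coin_flip))

Cost, speed, and post-editing effort would of course need separate measures; this sketch addresses only the question of whether the referee can tell the difference.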
"Computerization is the first stage of the industrial revolution
that hasn't made work simpler." Does this statement, paraphrased
from a book by a Harvard Business School professor, (2) have any
relevance for MT? Is it correct to state that several current MT
systems actually add one or more levels of difficulty to the
translation process before making it any easier?
While translators may not be able to articulate precisely what
kind of interface for translation they most desire, they can state
with great certainty what they do NOT want. What
they do not want is an interface that is any of the following:
harder to learn and use than conventional
translation;
more likely to make mistakes than the above;
lending less prestige than the above;
less well paid
than the above.
Are these also concerns for MT developers?
What real work has been done in the AI field in terms of treating
translation as a Knowledge Domain and translators as Domain Experts
and pairing them off with Knowledge Engineers? What qualifications
were sought in either the DE's or the KE's?
Are MT developers using the words "asymptote" and "asymptotic" in
their correct mathematical sense, or are they rather using them as
buzzwords to impart a false air of mathematical precision to their
work? Is the curve of their would-be asymptote steadily approaching a
representation of FAHQT or something reasonably similar, or could it
just turn out to be the edge of a semanto-linguistic Butterfly
Effect drawing them inexorably into what Shannon and Weaver
recognized as entropy, perhaps even into true Chaos?
Must not all translation, including MT, be recognized as a subset
of two far larger sets, namely writing and human mediation? In the
first case, does it not therefore become pointless to maintain that
there are no accepted standards for what constitutes a "good
translation," when of course there are also no accepted standards
for what constitutes "good writing?" Or for that matter, no accepted
standards for what constitutes "correct writing practices," since
all major publications and publishing houses have their own in-house
style manuals, with no two in total agreement, either here or in
England. And is not translation also a specialized subset of a more
generalized form of "mediation," merely employing two natural
languages instead of one? In which case, may it belong to the same
superset which includes "explaining company rules to new employees,"
public relations and advertising, or choosing exactly the right time
to tell Uncle Louis you're marrying someone he disapproves of?
Are not the only real differences between foreign language
translation and such upscale mediation that two languages are
involved and the context is usually more limited? In either case (or
in both together), what happens if all the complexities that can
arise from superset activities descend into the subset and also
become "translation problems?" at any time? How does MT deal with
either of these cases?
Does the following reflection by Wittgenstein apply to MT: "A
sentence is given me in code together with the key. Then of course
in one way everything required for understanding the sentence has
been given me. And yet I should answer the question 'Do you
understand this sentence?': No, not yet; I must first decode it. And
only when e.g. I had translated it into English would I say 'Now I
understand it.'
"If now we raise the question 'At what moment of translating do I
understand the sentence?' we shall get a glimpse into the nature of
what is called 'understanding.'" To take Wittgenstein's example one
step further, if MT is used, at what moment of translation does what
person or entity understand the sentence? When does the system
understand it? How about the hasty post-editor? And what about the
translation's target audience, the client? Can we be sure that
understanding has taken place at any of these moments? And if
understanding has not taken place, has translation?
C. Practical Suggestions for the Future
1. The process of consultation and cooperation between working
translators and MT specialists which has begun here today should be
extended into the future through the appointment of Translators in
Residence in university and corporate settings, continued lectures
and workshops dealing with these themes on a national and
international basis, and greater consultation between them in all
matters of mutual concern.
2. In the past, many legislative titles for training and
coordinating workers have gone unused during each Congressional
session in the Departments of Labor, HEW, and Commerce. If there
truly is a need for retraining translators to use MT and CAT
products, it behooves system developers (and might even benefit them
financially) to find out if such funding titles can be used to help
train translators in the use of truly viable MT systems.
3. It should be the role of an organization such as MT Summit III
to launch a campaign aimed at helping people everywhere to
understand what human translation and machine translation can and
cannot do, so as to counter a growing trend towards fast-word
language consumption and use.
4. Concomitantly, those present at this Conference should make it
known, on an international scale, that there is no place in
the MT Community for those who falsify the facts about the
capabilities of either MT or human translators. The fact that
foreign language courses, both live and recorded, have been
deceitfully marketed for decades should not be used as an excuse to
do the same with MT. I have appended a brief Code of Ethics document
for discussion of this matter.
5. Since AI and expert systems are on the lips of many as the
next direction for MT, a useful first step in this direction might
be the creation of a simple expert system which prospective clients
might use to determine if their translation needs are best met by
MT, human translation, or some combination of both. I would be
pleased to take part in the design of such a program.
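Purely as a sketch of what such a tool might look like (the questions, weights, and thresholds below are invented for illustration and would need to be worked out jointly by translators and MT developers), the expert system might begin as nothing more than a weighted checklist:

    # Illustrative sketch of a client-facing triage tool.  All questions,
    # weights, and thresholds are hypothetical placeholders.
    QUESTIONS = [
        ("Is the text highly repetitive or formulaic?",          +2),
        ("Is it confined to one narrow technical domain?",       +2),
        ("Is it already in machine-readable form?",              +1),
        ("Is publication-quality style required?",               -2),
        ("Does it depend on nuance, humor, or legal ambiguity?", -3),
        ("Is the deadline too short for unaided human work?",    +1),
    ]

    def recommend(answers):
        """answers: one True/False per question above."""
        score = sum(w for (_, w), yes in zip(QUESTIONS, answers) if yes)
        if score >= 3:
            return "machine translation with human post-editing looks worth trying"
        if score <= -2:
            return "human translation is the safer choice"
        return "consider a combination, e.g. MT drafts revised by translators"

    # Example: repetitive, single-domain, machine-readable text on a
    # tight deadline, with no special style or nuance requirements.
    print(recommend([True, True, True, False, False, True]))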
D. Draft Code of Ethics
1. No claims about existing or pending MT products should be made
which indicate that MT can reduce the number of human translators or
the total cost of translation work unless all costs for the MT
project have been scrupulously revealed, including the total price
for the system, fees or salaries for those running it, training
costs for such workers, training costs for additional pre-editors or
post-editors including those who fail at this task, and total costs
of amortization over the full period of introducing such a
system.
2. No claims should be made for any MT system in terms of
"percentage of accuracy," unless this figure is also spelled out in
terms of number of errors per page. Any unwillingness to recognize
errors as errors shall be considered a violation of this condition,
except in those cases where totally error-free work is not required
or requested.
3. No claim should be made that any MT system produces
"better-quality output" than human translators unless such a claim
has been thoroughly quantified to the satisfaction of all parties.
Any such claim should be regarded as merely anecdotal until proved
otherwise.
4. Researchers and developers should devote serious study to the
issue of whether their products might generate less sales
resistance, public confusion, and resentment from translators if the
name of the entire field were to be changed from "machine
translation" or "computer translation" to "computer assisted
language conversion."
5. The computer translation industry should bear the cost
of setting up an equitably balanced committee of MT workers and
translators to oversee the functioning of this Code of Ethics.
6. Since translation is an intrinsically
international industry, this Code of Ethics must also be
international in its scope, and any company violating its tenets on
the premise that they are not valid in its country shall be
considered in violation of this Code. Measures shall be taken to
expose and punish habitual offenders.
Respectfully Submitted by Alex Gross, Co-Director, Cross-Cultural
Research Projects, alexilen@sprynet.com
NOTES
(1) Kimmo Kettunen, in a letter to Computational Linguistics,
vol. 12, No. 1, January-March, 1986
(2) Shoshana Zuboff: In the Age of the Smart Machine: The Future
of Work and Power, Basic Books, 1991.