Published by the Sci-Tech Translation Journal, October 1993,
at the request of editor Gabe Bokor
MT and Language: Conflicting Technologies?
Ariadne's Endless Thread
Alexander Gross (language@sprynet.com)
In a previous piece (Where
Do Translators Fit Into Machine Translation?), I sought to direct a variety of philosophical,
linguistic, and practical questions to members of the MT community
during one of their major international conferences. Since response
to these questions has been less than deafening, I would now like to
suggest a few possible answers and speculations of my own concerning
these matters. Some bitterness has crept into MT discussions of
late, and so I would like to emphasize once again that no reasonable
person is opposed to MT where it works. The question is a more
theoretical one, though rich in practical applications, and concerns
how far MT is truly capable of improvement and why it has taken so
long to reach its present condition. In this discussion I propose to
deal with both MT and human language as specific "technologies," an
approach as obvious for the former as it may seem surprising for the
latter.
It is not at all hard to show that MT
comprises some sort of technology. The reduction of knowledge to
bits and bytes, the building of algorithms, the construction of
programs are all processes familiar to us from other branches of
computer technology. And indeed MT was foreseen from the beginning
by such computer pioneers as Turing, Shannon, and Weaver as a rich
potential application. Even in commercial and practical terms, MT
would appear at first glance to have passed through all the usual
stages common to technologies:
1. Need (or perceived need).
2. Determination of technological feasibility.
3. Successful financing.
4. Basic research and development.
5. Preparation and testing of prototypes.
6. Further improvements and developments.
7. Launching of commercial products.
8. Publicity and marketing.
9. Operator or consumer training in their use.
Nonetheless, a closer examination of these stages reveals several
points at which MT may have already fallen short. It can be argued,
for instance, that the "need or perceived need" for MT was never
sufficiently demonstrated, as no trustworthy figures have ever
existed concerning the actual or potential total world volume of
materials needing translation nor of the number or capabilities of
human translators ready to translate them, nor, finally, of the real
or potential economic benefits to be reaped from introducing this
new method.
Further reservations may be expressed concerning the basic
"research and development" process out of which MT has grown.
Essentially all "computational linguistics" has been based in or
grown out of the prior theorizing of conventional linguistics. But
for some decades the study of linguistics, never a rigorous science
to begin with (despite some efforts to make it one), has been
subject to a process of growing decadence and obfuscation. This
process has gone so far that departments of Linguistics have
recently been disbanded at two major universities, and many scholars
now regard the field as even less respectable than sociology.
Further discussion of the linguistic side will be postponed until
we have had a chance to consider whether and, if so, how language
itself may be considered to be a technology. Further objections as
to how well MT has lived up to three other stages in our
profile (namely, launching of commercial products, publicity and
marketing, and operator or consumer training) can also be voiced, but
this matter will also be set aside for the time being.
There are of course other computer-specific steps in developing a
technology, such as reverse engineering pre-existing programs or the
use of orphan code, which have helped to speed up the development of
applications in the past, and in most fields we have also witnessed
the effects of economies of scale. It is partly due to these last
that we have seen calculators shrink from desktop giants to the size
of visiting cards within our own lifetimes. Comparable developments
in other fields have led many to suppose that virtually anything is
possible.
At this point it is also important to note that MT is most
definitely, and perhaps most self-definingly, a component part of AI,
or Artificial Intelligence. Certainly the AI Community has done all
within its power to encourage funding sources and the general public
to believe that computers can do almost anything. While MT advocates
now concede, at least among translators, that FAHQT (Fully Automatic
High Quality Translation) may never happen, the AI Community at
large has never made any such concession. On the contrary, at a
recent conference its so-called HAL wing proclaimed its allegiance
to recreating full human intelligence, including language
comprehension, within a computer. This is not surprising news to
those who have lurked on Internet's comp.ai newsgroup. FAHQT would
of course be a relatively simple task for such a computer, assuming
it could be built.
Now that we have seen how MT conforms, with some apparent
exceptions, to the overall pattern of a technology, let us next
examine the qualifications of human language in this regard. It is
obvious from the beginning that any such claims will have to be
expressed in biological and physiological terms, since human
language did not develop in the same way as technologies such as
metallurgy or computer science, even though the latter are arguably
its offshoots.
The long-debated origins of language, variously attributed to the
"Bow-Wow Theory," the "Yo-Heave-Ho Theory," or the "Pooh-Pooh
Theory," are so inauspicious and unpersuasive that readers may wonder
what point there can be, like so much else in linguistics, to any
further discussion at all. But once we turn our attention to
biological development, both of the species and of our related
animal cousins, a different perspective may unfold, and some
startling insights may just be within our view. As human beings we
frequently congratulate ourselves as the only species to have
evolved true language, leaving to one side the rudimentary sounds of
other creatures or the dance motions of bees. It may just be that we
have been missing something.
On countless occasions TV nature programs have treated us to the
sight of various sleek, furry, or spiny creatures busily spraying
the foliage or tree trunks around them with their own personal
scent. And we have also heard omniscient narrators inform us that
the purpose of this spray is to mark the creature's territory
against competitors, fend off predators, and/or attract mates. And
we have also seen the face-offs, battles, retreats, and matings that
these spray marks have incited.
In an evolutionary perspective covering all species and ranging
through millions of years, it has been abundantly shown time and
time again, as tails recede, stomachs develop second and third
chambers, and reproduction methods proliferate, that a function
working in one way for one species may come to work quite
differently in another. Is it really too absurd to suggest that over
a period of a few million years the spraying mechanism common to so
many mammals, employing relatively small posterior muscles and
little brain power, may have wandered off and found its place within
a single species, which chose to use larger muscles located in the
head and lungs, guiding them with a vast portion of its brain?
This is not to demean human speech to the level of mere animal
sprayings or to suggest that language does not also possess other
more abstract properties. But would not such an evolution explain
much about how human beings still use language today? Do we really
require "scientific" evidence for such an assertion, when so many
proofs lie self-evidently all around us? One proof is that human
beings do not normally use their nether glands to spray a fine scent
on their surroundings, assuming they could do so through their
clothing. They do, however, undeniably talk at and about everything,
real or imagined. It is also clear that speech bears a remarkable
resemblance to spray, so much so that it is sometimes necessary to
stand at a distance from some interlocutors.(1)
Would not such an evolution aptly explain the attitudes of many
"literal-minded" people, who insist on a single interpretation of
specific words, even when it is patiently explained to them that
their interpretation is case-dependent or simply invalid? Does it
not clarify why many misunderstandings fester into outright
conflicts, even physical confrontations? Assuming the roots of
language lie in territoriality, would this not also go some distance
towards clarifying some of the causes of border disputes, even of
wars? Perhaps most important of all, does such a development not
provide a physiological basis for some of the differences between
languages, which themselves have become secondary causes in
separating peoples? Would it not also permit us to see different
languages as exclusive and proprietary techniques of spraying,
according to different "nozzle apertures," "colors," or viscosity of
spray? Could it conceivably shed some light on the fanaticism of
various forms of religious, political, or social fundamentalisms?
Might it even explain the bitterness of some scholarly feuding?
Of course there is more to language than spray, as the species
has sought to demonstrate, at least in more recent times, by
attempting to preserve a record of their sprayings in other media,
such as stone carvings, clay imprints, knottings in beads, and of
course scratchings on tree barks, papyri, and different grades of
paper, using a variety of notations based on characters, syllabaries
or alphabets, the totality of this quest being known as "writing."
These strivings have in turn led to the development of a variety of
knowledge systems, almost bewildering in their number through
various eras and cultures in a multi-dimensional, quasi-fractal
continuum. Thus, language may turn out to be something we have
created not as a mere generation or nation, not even as a species,
but in Von Baer's sense as an entire evolutionary phylogeny. It is
this greater configuration which may transcend the more primitive
side of language and eventually provide a more complete image of its
nature, perhaps even shedding light as well on the nature of human
knowledge itself.
In the face of this imposing prospect, it is not surprising that
MT advocates almost invariably focus on that part of language
devoted to "verbal meaning." But I have listed elsewhere no fewer
than five other common functions of language, almost none of them
totally devoted to the communication of verbal meaning. They are as
follows:
1. Demonstrating one's class status to the person one is
speaking or writing to.
2. Simply venting one's emotions, with no real communication
intended.
3. Establishing non-hostile intent with strangers, or simply
passing time with them.
4. Telling jokes.
5. Engaging in non-communication by intentional or accidental
ambiguity, sometimes also called "telling lies."
6, 7, 8, etc. Two or more of the above (including
communication) at once. (2)
It should be obvious that most of the foregoing conform at least
as well to the model of "spraying one's surroundings" as they do to
communicating verbal meaning as such. It is hard to see how MT can
ever hope to cope with these larger problems, and it is not
surprising that we have recently seen various limitations arise
connected with launching, marketing and publicizing commercial MT
products as well as with training translators to deal with MT output
as post-editors.(3)
Under no circumstances is this "spraying" metaphor being
presented as a total account of language. This aspect is considered
quite briefly (among many other intellectually more respectable
analogies for language) in the forthcoming ATA Scholarly Volume on
Terminology, and the author hopes to provide an even more rounded
account in a work still being completed. It does seem important,
however, that some relatively primitivist footnote to the origins of
language should be introduced into discussions about linguistics and
its applications, MT among them. Much writing about language (since
it is scarcely uneducated people who write about this subject to
begin with) tends to luxuriate in self-importance and
self-congratulation about how important a development language has
been for humanity. But the rational and intellectual aspects of
language are in a sense only the most obvious ones, which may have
led MT advocates, perhaps following Chomsky, to suppose language
possesses a logical substructure it may in many cases actually
lack.
Contrasted with these more complex aspects of language, a good
computer program should be a model of simplicity. It should solve
its problem in the most elegant way and, as though following the
thread of Ariadne, it should go directly to its goal and craftily
find its way out of the labyrinth again, easily slaying or avoiding
all minotaurs and monsters along the way and using its thread as a
guide rather than tripping over it as an obstacle. If it must double
back occasionally in its path, there are good and cogent rules for
not letting this prove a distraction. It is thus not surprising that
the labyrinth or maze is an image that finds instinctive resonance
among hackers, nor that they take delight in playing games where
monsters must be slain.
But what computer rules will guide us through the labyrinth of
language? There is no one entrance or exit and no definable center.
We have all had to learn this labyrinth step by step simply to come
as far as we have. We have even learned about the computer, up to a
fairly advanced point, mainly by using language. When we try to solve
the problems of language, whether by building MT programs or
Voice-Writers or other Natural Language applications, we suddenly
find there are monsters everywhere, and it is they who slay us,
rather than the reverse. The technique for slaying one language
monster may allow another to triumph. And the thread itself no
longer traces a brief or elegant path; it has in fact become endless
in its back-trackings and recrossings, creating a whole new jungle
of Koenigsberg Bridges, Towers of Hanoi, Traveling Salesman's
Problems, and other computer math anomalies. Worst of all, the
labyrinth of language is not some separate location we can visit at
our convenience and slowly come to know. Rather, we have no choice
but to live in it constantly. We have never lived anywhere else.
Perhaps it is time to glance backwards from a systems perspective
and see how well language has conformed to our nine-point profile
for a technology. Clearly no survey of need or technological
feasibility can have taken place in the conventional sense. Nor was
financing or research and development a major factor, since a whole
succession of species was available as a free laboratory over
several million years. But at the right time, language came to be
installed in the entire human race, at first only spoken but finally
written as well. It was clearly a technological advance, since it
made it possible for humans, even in its oral form, to exchange more
complex observations and measurements than could be passed along
without it. Perhaps most impressive of all, language now has a total
installed base of over five billion living systems, something no
computer can remotely match, and is still expanding. Its one main
drawback as a technology may lie in the huge service and
administrative staff of teachers, writers, editors, and critics
needed to maintain it, though a comparable problem is not unknown
with computers.
At computer conferences one frequently hears programmers and
other specialists complaining about natural language and boasting
about how they live in a purer, more perfect sphere, in a truer
reality, whether virtual or otherwise. One day they will supplant
all the confusing skeins of messy reality and even messier language
with a finer, higher, texture of purest logic, and all the world
will instantly evolve to the next more transcendent stage. Those who
voice these boasts have but a single problem: for the time being at
least, they are forced to express their vision in precisely the
natural language they claim to despise. To perfect MT or any natural
language application, there is no escaping the fact that it will be
necessary to build a language both higher and lower, in human and
computer terms respectively, than the one we now use, a true
metalanguage. There is room for a great deal of skepticism as to
whether this is possible.
I am not so sanguine as to hope that the foregoing will have any
effect at all on MT zealots, HAL AI acolytes, or dedicated
programmers.(4) Like heroes of old intent on slaying the foe at any
cost, they pay heed only to news of the latest new weapon alleged to
have power against the minotaur. It may be called Corpus-Based MT,
or Neural Nets, or Hidden Markov Models, or Three-Dimensional Fuzzy
Logic, or perhaps it may hinge on creating a neurological interface
with the brain itself. Or it may simply be a matter of time: after
all, when computers become sufficiently large and inexpensive,
nothing will be beyond their power, or so goes the tale. But without
a complete algorithm for handling language and linguistic problems,
not all the power in the universe can withstand the might of the
great God GIGO: Garbage In, Garbage Out.
Some of these approaches may bring some advances to some aspects
of MT. But programmers, AI enthusiasts, and MT researchers alike
would do well to realize that they too live in the labyrinth of
language, a realm whose navigational problems have long been
underestimated.
NOTES:
(1) This resemblance extends even to the etymology of the two
words, speech and spray, which are closely related in the Indo-European
family, as are a variety of words beginning with "spr-" or
"sp-" related to spraying and spreading: English/German spread,
sprawl, spray, sprinkle, sp(r)eak, spit, spurt, spout, Spreu,
spritzen, Sprudel, Spucke, spruehen, sprechen; Dutch
spreken; Italian sprazzo, spruzzo; Latin
spargo; Ancient Greek spendo, speiro, etc. The
presence of the mouth radical in the Chinese characters for "spurt,"
"spit," "language," and "speak" also shows how related these
concepts are on a cross-cultural level.
(2) From the author's The Limitations of Computers As
Translation Tools, a chapter from Computers in Translation:
A Practical Appraisal, edited by John Newton, Routledge,
London, 1992.
(3) Peter Wheeler: On Using Professional Translators to
Post-Edit, pp. 353-59, Looking Ahead, Proceedings of the
31st Annual Conference of the American Translators Association,
edited by A. Leslie Willson, Learned Information, Inc., 1990.
(4) I wish there were some way both programmers and translators
could become aware of their many similarities. Both work at
extremely demanding intellectual tasks requiring a high level of
familiarity with specialized knowledge. Both tend to live somewhat
solitary lives, punctuated by moments of self-indulgence. Both are
beset by constant deadlines, and both are reputed to be something of
drones. While the programmer often purports to despise language and
sees himself as living in "Cyberspace," the translator may feel
hostile towards computer logic while setting up an almost mystical
relationship with his dictionaries and envisioning himself as
dwelling in a realm where reality and meaning meet. Perhaps both are
mistaken in somewhat similar ways.
COPYRIGHT STATEMENT: This article is Copyright ©
1993 by Alexander Gross. It may be reproduced for individuals and
for educational purposes only. It may not be used for any commercial
(i.e., money-making) purpose without written permission from the
author.
Reprinted with the permission of the
author.