The problem of human language

Possibly the most manifest failing of AI systems concerns language. Quite apart from the already formidable problem of segmenting continuous speech and turning it into words and phrases that a computer can analyze, the most challenging problem is to make the machine understand what is being said. An often-quoted example has a machine translating "The spirit is willing but the flesh is weak" into the Russian equivalent of "The vodka is strong but the meat is rotten". This example, dating back to the 1960s or 70s, is probably a myth invented by journalists[1], but the criticism unfortunately still applies to today's natural language processing technology, as you can easily confirm by trying any web-based translation tool.

Still, the very fact that such tools are nowadays readily available on the web is an indication that some progress is gradually being made.


The problem is much harder than you think: you are generally not aware of the complex processing your brain does to understand language. In particular, you are not aware that virtually every word you hear in a sentence has multiple meanings, and that the words can often be grouped together in different ways. The reason you don't generally notice this is that when you hear a sentence your brain uses implicit knowledge about the world to make sense of it, so you generally become conscious of only one reasonable interpretation out of many possible ones. Take for example the phrase[2] "Time flies like an arrow". You understand this to mean that time moves quickly, just as an arrow does. But you may never have realized that it can in fact also be interpreted to mean:

- you should measure the speed of flying insects like you would measure the speed of an arrow;

- you should measure the speed of flying insects in the way an arrow would time them;

- you should measure the speed of flying insects that are like arrows, i.e. you should time those flies that are like arrows;

- certain flying insects, namely "time-flies", enjoy arrows (as in "Fruit flies like a banana").
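To make the ambiguity concrete, the following minimal sketch counts how many parse trees a tiny hand-written grammar assigns to the sentence. The grammar, the lexicon and the function name `count_parses` are all illustrative assumptions, not a real grammar of English and not the method of any particular parser. The point is only that each word may carry several parts of speech and that the words may be grouped in several ways.

```python
from functools import lru_cache

# Toy phrase-structure rules (an assumption made for illustration only).
GRAMMAR = {
    "S":  [("NP", "VP"), ("VP",)],           # statement, or imperative
    "NP": [("N",), ("N", "N"), ("Det", "N"), ("NP", "PP")],
    "VP": [("V",), ("V", "NP"), ("V", "PP"), ("V", "NP", "PP")],
    "PP": [("P", "NP")],
}

# Each word can have several parts of speech: this is the lexical ambiguity.
LEXICON = {
    "time": {"N", "V"},
    "flies": {"N", "V"},
    "like": {"V", "P"},
    "an": {"Det"},
    "arrow": {"N"},
}

def count_parses(sentence, start="S"):
    """Count the distinct parse trees the toy grammar gives the sentence."""
    words = tuple(sentence.lower().split())

    @lru_cache(maxsize=None)
    def derive(symbol, i, j):
        # Number of ways `symbol` can derive words[i:j].
        if symbol not in GRAMMAR:  # a part-of-speech tag covers exactly one word
            return int(j - i == 1 and symbol in LEXICON.get(words[i], set()))
        return sum(expand(rhs, i, j) for rhs in GRAMMAR[symbol])

    @lru_cache(maxsize=None)
    def expand(rhs, i, j):
        # Number of ways the symbol sequence `rhs` can derive words[i:j].
        if len(rhs) == 1:
            return derive(rhs[0], i, j)
        # The first symbol covers words[i:k]; every remaining symbol
        # must be left at least one word.
        return sum(derive(rhs[0], i, k) * expand(rhs[1:], k, j)
                   for k in range(i + 1, j - len(rhs) + 2))

    return derive(start, 0, len(words))

print(count_parses("Time flies like an arrow"))  # prints 4
```

Under these assumptions the sentence has four parse trees. The two readings in which "like an arrow" describes the manner of timing share a single tree here; telling them apart takes knowledge that a grammar of this kind does not contain.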

Certain jokes are funny because the normal process of contextual disambiguation fails and gives rise to comical interpretations. Each of the following newspaper headlines[3] has at least two parses, one of which is humorous:

- Kids Make Nutritious Snacks

- Enraged Cow Injures Farmer with Ax

- Queen Mary Gets Bottom Scraped

In addition to lexical and syntactic ambiguity, another problem arises because language is essentially a communicative act, based on the assumption of common knowledge between the speakers, and on each speaker's presuppositions about what the other knows. Reference to previously mentioned material through the use of pronouns rarely poses problems for humans, but is exceedingly difficult for computers. For example, the two sentences

- We gave the monkeys the bananas because they were hungry

- We gave the monkeys the bananas because they were over-ripe

have the same grammatical structure on the surface. However, in one of them the word "they" refers to the monkeys, in the other it refers to the bananas. A computer can only figure this out if it knows something about monkeys and bananas.
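As a rough illustration of what "knowing something about monkeys and bananas" could look like, here is a minimal sketch that resolves the pronoun by consulting a hand-written table of world knowledge. The tables `KIND_OF` and `APPLIES_TO` and the function `resolve_pronoun` are hypothetical names invented for this example; real systems need far richer knowledge than two small dictionaries.

```python
# Hypothetical hand-coded world knowledge (assumed for illustration only).
KIND_OF = {"monkeys": "animal", "bananas": "fruit"}
APPLIES_TO = {"hungry": {"animal"}, "over-ripe": {"fruit"}}

def resolve_pronoun(candidates, predicate):
    """Return the candidate antecedents that the predicate can plausibly describe."""
    allowed = APPLIES_TO.get(predicate, set())
    return [noun for noun in candidates if KIND_OF.get(noun) in allowed]

candidates = ["monkeys", "bananas"]
print(resolve_pronoun(candidates, "hungry"))     # prints ['monkeys']
print(resolve_pronoun(candidates, "over-ripe"))  # prints ['bananas']
```

The grammar of the two sentences plays no part in the decision: only the table entries differ. A predicate the table does not cover yields an empty list, which is the situation a computer without the relevant knowledge is in.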

A similar thing happens in understanding the sentences:

- John saw the bird flying over the mountain

- John saw the stewardess flying over the mountain

Here, in the first example, the bird was flying, whereas in the second, John was doing the flying (in an airplane).

Metaphor and analogy are also difficult for computers, which have trouble dealing with anything beyond the literal meaning of phrases like:

- The early bird gets the worm

where true understanding requires reference to shared knowledge about birds and worms and the natural human condition of laziness. How could a robot ever hope to understand what's funny about the following sentences unless it knows about the human condition?

- There are 3 kinds of people: those who can count and those who can't.

- All I ask is a chance to prove money can't make me happy.

And what about:

- The early bird may get the worm, but the second mouse gets the cheese.

Here reference to shared knowledge is not just about the world, but about things that are often said.

[1] Cf. John Hutchins, "The whisky was invisible", or Persistent myths of MT. MT News International 11 (June 1995), pp. 17-18.

[2] Example taken from the Wikipedia article on natural language processing.

[3] see among many sites where a wealth of such examples can be found.