What makes speech and language interfaces hard to create? Part 3: Language

Introduction

This article follows on from a few about computers, speech and language:

Why are speech and language interfaces useful?
What makes speech and language interfaces hard to create? Part 1: Overview
What makes speech and language interfaces hard to create? Part 2: Speech
What makes speech and language interfaces hard to create? Part 3: Language
When is a speech and language interface a poor choice?

By language I mean text: you don’t have to worry about recognising or synthesising speech, but you still have to deal with text typed in by the user, read from a file and so on. This is going to be another article where I go on about how amazing the everyday language tasks you perform are, like reading this or writing an email.

Ambiguity

I have already referred to some aspects of ambiguity in a previous article. One of my favourite kinds of ambiguity, particularly in technical writing, is in compound nouns. This is where two or more nouns are grouped together to form a single noun, like piston head rocker cap.

The ambiguity comes from the way you group the components together to form the compound. You could group piston and head together, then that with rocker, and finally with cap, which could be written as (((piston head) rocker) cap). This might be the cap for a rocker that’s associated with a piston’s head.

However, you could group the same nouns together in different ways, such as:

(piston (head (rocker cap))) – the most important or highest (i.e. head) rocker cap, that is being used as a piston
((piston head) (rocker cap)) – a piston head that is acting as a rocker cap

The more nouns being grouped together, the more ways there are of doing the grouping, and the number of ways grows very quickly. For n nouns it is the Catalan number C(n-1), which gives 1, 2, 5, 14 and 42 groupings for two to six nouns.
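To make the growth concrete, here is a minimal Python sketch that lists every pairwise grouping of a compound noun and checks the count against the Catalan formula. The function names (catalan, bracketings) are my own, made up for this illustration, and the sketch only enumerates structures; it makes no attempt to decide which grouping is the sensible one.

```python
from math import comb


def catalan(n: int) -> int:
    """Return the n-th Catalan number, C(2n, n) / (n + 1)."""
    return comb(2 * n, n) // (n + 1)


def bracketings(words: list[str]) -> list[str]:
    """List every way of grouping adjacent words pairwise into one compound."""
    if len(words) == 1:
        return [words[0]]
    results = []
    # Split the sequence at every position, then combine each grouping of
    # the left part with each grouping of the right part.
    for split in range(1, len(words)):
        for left in bracketings(words[:split]):
            for right in bracketings(words[split:]):
                results.append(f"({left} {right})")
    return results


groupings = bracketings(["piston", "head", "rocker", "cap"])
for grouping in groupings:
    print(grouping)

# Four nouns give C(3) groupings.
print(len(groupings), catalan(3))
```

For the four nouns in piston head rocker cap this prints five groupings, including the three discussed above.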

What’s the same and what’s different?

Think about these words: jump / leap / spring / vault / dive. Are they the same? Are they similar? Could you always use one in the place of the others? If they’re not identical, how are they different?

This is the realm of a thesaurus, but also everyday language. If someone’s searching for leap, do you return results that don’t match leap but do match jump? What if they search for leap year? If they search for spring, do you show matches for jump, for Struts (or other Java frameworks), for escape and so on?

It’s possible to define things too tightly, which forces the user to guess the exact words that the computer, or the author of some web site in a search index, uses to express something. On the other hand, if things are too loose then the user gets all kinds of results that they consider irrelevant, which drown out the useful stuff.
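The tight versus loose trade-off can be sketched as a query expansion step. Everything here is an assumption for illustration: the SYNONYMS table and FIXED_PHRASES set are tiny and hand-made, and expand_query is a hypothetical helper, not how any particular search engine works.

```python
# Hypothetical, hand-made synonym table; a real system would need a thesaurus.
SYNONYMS = {
    "leap": {"jump", "spring", "vault"},
    "jump": {"leap", "spring", "vault"},
}

# Multi-word terms that should not be split up and expanded word by word.
FIXED_PHRASES = {"leap year"}


def expand_query(query: str, loose: bool) -> set[str]:
    """Return the set of terms to match for a query.

    With loose=False only the query itself matches (too tight for many users).
    With loose=True each word is expanded with its synonyms, unless the whole
    query is a known fixed phrase.
    """
    query = query.lower().strip()
    if not loose or query in FIXED_PHRASES:
        return {query}
    terms = set()
    for word in query.split():
        terms.add(word)
        terms |= SYNONYMS.get(word, set())
    return terms


print(expand_query("leap", loose=False))
print(expand_query("leap", loose=True))
print(expand_query("leap year", loose=True))
```

Even this toy shows where the hard decisions are: someone has to decide what goes in the synonym table, and the fixed phrase list is what stops leap year from matching jump year.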

Morphology

Another way in which things can be the same or different relates to morphology. Are the singular and plural versions of a noun the same? How about the different tenses of a verb?

What about words that serve different roles in a sentence but are related such as history / historical / historically? If I search for historical re-enactor, should it also find results that match re-enactor of history?
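One common approach is to reduce related words to a shared stem before comparing them. Below is a deliberately crude sketch of that idea; the suffix list and the crude_stem function are my own invention for this example, and established stemmers such as the Porter stemmer use far more careful rules.

```python
# Suffixes to strip, longest first so "ically" is tried before "y".
# This list is an illustrative assumption, not a real stemming algorithm.
SUFFIXES = ["ically", "ical", "ies", "es", "y", "s"]

MIN_STEM_LENGTH = 3


def crude_stem(word: str) -> str:
    """Strip the first matching suffix, keeping at least MIN_STEM_LENGTH letters."""
    word = word.lower()
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= MIN_STEM_LENGTH:
            return word[: -len(suffix)]
    return word


for word in ["history", "historical", "historically", "histories"]:
    print(word, "->", crude_stem(word))

# The same rules also damage words they should leave alone.
print("news", "->", crude_stem("news"))
```

All four history words end up with the same stem, so a search for one could match the others. The last line shows the cost: news is reduced to new, which is the too loose problem again in a different guise.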

Pronouns

Pronouns, such as it, they and so on, are like variables in a program: each one gets its value by referring to something else. Deliberately ambiguous pronouns, i.e. pronouns that could refer to two or more other things, are a bugbear of CinemaSins videos (hence The Pronoun Game).

In order to find a meaning for a pronoun you need to keep track of the context of the text so far. This might have a nested structure, similar to nested scopes in a program. You can sometimes filter down the list of options by matching up number and gender of things. This spills over into idiom – for instance she might refer to a ship, the Earth etc.
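The filtering step can be sketched as follows. This is a toy under heavy assumptions: the Entity class, the PRONOUNS table and the candidates function are all made up for this example, the context is a flat list rather than nested scopes, and real text needs much more than number and gender to pick an answer.

```python
from dataclasses import dataclass


@dataclass
class Entity:
    name: str
    number: str  # "singular" or "plural"
    gender: str  # "feminine", "masculine" or "neuter"


# Each pronoun maps to the number it needs and the genders it accepts.
PRONOUNS = {
    "she": ("singular", {"feminine"}),
    "he": ("singular", {"masculine"}),
    "it": ("singular", {"neuter"}),
    "they": ("plural", {"feminine", "masculine", "neuter"}),
}


def candidates(pronoun: str, context: list[Entity]) -> list[Entity]:
    """Return the entities a pronoun could refer to, most recent first."""
    number, genders = PRONOUNS[pronoun.lower()]
    matches = [e for e in context if e.number == number and e.gender in genders]
    return list(reversed(matches))


# Entities in the order they were mentioned in the text so far.
context = [
    Entity("Alice", "singular", "feminine"),
    Entity("the engineers", "plural", "masculine"),
    Entity("the ship", "singular", "neuter"),
    Entity("Mary", "singular", "feminine"),
]

print([e.name for e in candidates("it", context)])
print([e.name for e in candidates("she", context)])
```

Here it has one candidate, but she still has two (Mary and Alice), so filtering narrows the options without always settling them. The idiom problem shows up as well: because the ship is marked neuter, this sketch would never link she to it.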

Mistakes

There is more than one kind of mistake someone could make. The first is typos, like mis-spelling psychiatrist or receive. These can be easy to spot, as long as the error doesn’t turn the word into a different valid word (form instead of from, say). They can sometimes be avoided, for instance if the user interface can offer suggestions via auto-complete.

It might not be practicable to have a mapping of mistake to intended word for all mistakes. Instead you might try to find the best match in your dictionary for an arbitrary input word using something like the Levenshtein distance.
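As a rough illustration, here is a plain Python version of the Levenshtein distance, plus a best_match helper that picks the closest word from a dictionary. The helper and the tiny DICTIONARY list are assumptions for this sketch; a real spelling corrector would also weigh word frequency and avoid scanning the whole dictionary for every word.

```python
def levenshtein(a: str, b: str) -> int:
    """Return the minimum number of single-character insertions, deletions
    and substitutions needed to turn a into b."""
    # previous[j] holds the distance between the processed prefix of a
    # and the first j characters of b.
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # substitute, or keep if equal
            ))
        previous = current
    return previous[-1]


# A tiny stand-in dictionary, purely for illustration.
DICTIONARY = ["receive", "reception", "psychiatrist", "physicist"]


def best_match(word: str, dictionary: list[str]) -> str:
    """Return the dictionary entry with the smallest distance to word."""
    return min(dictionary, key=lambda candidate: levenshtein(word, candidate))


print(levenshtein("recieve", "receive"))
print(best_match("recieve", DICTIONARY))
print(best_match("pyschiatrist", DICTIONARY))
```

Swapping two adjacent letters, as in recieve, counts as two edits here. The Damerau-Levenshtein variant counts such a transposition as one edit, which can suit typing mistakes better.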

The user might correctly spell a word, but pick the wrong one, for instance your / you’re, its / it’s or there / their / they’re. This is an area where you need to make a decision about pedantry. Even though it might be correct to say fewer options, should you accept it when the user says less options? Should a search expression that includes less options find matches for fewer options and vice versa?

Even more variability

Variability is not just word A vs. word B in the same slot, or word A mis-spelled. The same or similar meaning can be contained in very different sentence structures:

Tea, white, no sugar.
Tea, please.
Can I have a cup of tea, please?
I’d like a cup of tea.
I’m gasping for a cuppa.
Is the kettle on?
Can you squeeze the pot?

Anarchy in the UK: Would you like some tea? No.

Forcing people to express themselves in one way is like the annoying text-based computer adventure games from the 1980s. Yes, coping with all this variety is harder to implement, but it’s what users expect.

How do you search in Google? Do you ask a fully-formed question? Do you just specify a series of keywords?

Foreign languages

You might not think that this applies to you, but it illustrates some of the other points already mentioned. Different languages group concepts differently, so it isn’t always possible to translate exactly from one language to another. Either the concept doesn’t exist, or it has different associations with other concepts. For instance, languages don’t divide the spectrum up into colours in the same way.

It might appear that you’re free of such brain-bending concepts because you’re dealing with only one language, but one language is a slippery term. You might be tripped up by differences between e.g. American English and British English, or Peninsular Spanish and Spanish of the Americas.

For instance, some varieties of American English have a distinct plural form of you (y’all) that standard British English lacks. In French the plural form of you (vous) is also the formal form, and there’s an informal singular you (tu). German also separates formal from informal (Sie / du), although its formal Sie comes from the third-person plural rather than from the informal plural ihr. English used to make a similar distinction (thou for informal singular, and you for formal or plural) but has lost it.

If you are interested in this kind of thing, I recommend Le Ton Beau de Marot by Douglas Hofstadter.

Summary

Well done for reading this. In fact, well done for reading at all. You’re amazing. Writing can efficiently convey a single fact, or it can be deliberately opaque or ambiguous to evoke a cluster of emotions. It is an area of self-expression, and one where people make mistakes. It’s no wonder that computers find it hard.