Do as I say

Has Siri started a never-ending conversation between humans and technology, asks Rachel Agius.

The human mind is a wonderful thing. It’s the infinite potential to learn, create, process and analyse, sitting in less than 2,000 cubic centimetres. That’s about the volume of a large bottle of water.

Naturally, all these skills would be worth nothing if we could not communicate our thoughts. This is where language comes in and sets us apart from the rest of the animal kingdom.

So it comes as no surprise that the understanding of language has been a goal for software engineers for a long time. In an effort to make communication with technology more natural, what better way to do it than to give that technology the power to understand speech?

The lure of having a machine understand your every word is something sci-fi writers have toyed with for generations. A sentient machine has been the stuff of dreams for some and a nightmare for others.

The most recent splash in the Artificial Intelligence pool has been made by Siri. Apple’s latest brainchild has users talking to their iPhones as they would to a personal assistant. The promotional videos paint a picture of seamless communication, a piece of technology that can be told what to do in speech that could just as easily be directed to another human being.

Naturally, parody videos abound, portraying a helpless Siri as a mediator between warring spouses and other such amusing situations. But Natural Language Processing, as this ability is known among the AI crowd, has been around longer than Apple’s disturbingly effective ad campaigns would have you think.

NLP involves a crossing of paths – linguistics on the one hand and AI on the other. The human brain’s capacity to understand not only words but their meaning and subtle connotation is something that has been the subject of study for decades. And we’re still not exactly sure how it happens.

However, there are patterns within this complex process which designers have been able to identify and encode in their software.

Speech recognition has been around for a long time now. Starting off in the 1930s as an AT&T Bell Laboratories experiment, it went from strength to strength as software became more accurate and registered fewer errors with each successive development.

The goal was always to have computers understand regular human speech but, until recently, human input was restricted to simplified commands covering a limited set of functions, and even then there was a high chance of error. Eventually, speech recognition made its way into industry, making the work of telephone operators, medical professionals and others much easier.

Speech recognition, however, is not the same as NLP. When a command is given as a single word or short sentence that is already encoded within the software, it is relatively easy to get a response from whatever machine you’re talking to.

But normal human conversation is so loaded with meaning, idiosyncrasies and regional variation that it would be impossible to program software to understand each and every combination of words a person can come up with. There simply is too much data.

This is where the processing element comes in and also where an understanding of linguistics becomes a valuable asset. If a particular program is taught to recognise the patterns of human communication, then the volume and variety of speech it can process are greatly increased.
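For readers who like to see the idea in code, here is a minimal sketch in Python of the difference between the two approaches. It is an illustration only and says nothing about how Siri or any real product works: the command table, the pattern and the function names are all made up for this example, and real systems start from audio rather than typed text.

```python
import re

# Approach 1: fixed commands. The exact phrase must already be
# encoded in the software (hypothetical table for illustration).
FIXED_COMMANDS = {
    "call home": ("call", "home"),
    "play music": ("play", "music"),
}

def fixed_lookup(utterance):
    """Return an action only if the phrase is in the table."""
    return FIXED_COMMANDS.get(utterance.lower().strip())

# Approach 2: a pattern. One rule covers many different phrasings
# of the same request (again, a made-up rule for illustration).
CALL_PATTERN = re.compile(
    r"\b(?:call|ring|phone)\s+(?:up\s+)?(?P<who>[\w\s]+?)"
    r"(?:\s+(?:please|now))?[.!?]?$",
    re.IGNORECASE,
)

def pattern_match(utterance):
    """Return a call action if the sentence fits the pattern."""
    match = CALL_PATTERN.search(utterance.strip())
    if match:
        return ("call", match.group("who").lower())
    return None
```

The lookup only recognises sentences it was given in advance, so "Could you ring my mother please?" gets no response from it, while the single pattern picks out both the request and the person to call. Even so, one hand-written pattern is a long way from genuine language understanding.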

Coincidentally, this affinity for pattern is how children learn to speak. By singling out a structure or scheme of speech and then attempting it themselves, children learn, through trial and error, which patterns will achieve the result they want, whether it is asking for food or engaging a playmate.

Before Siri, speech recognition and NLP had been used for practical purposes. But now, with a mobile phone that can reschedule your appointments and get you out of the dodgy part of town without you ever pressing, holding or shouting at anything, the face of technology may be beginning to change. As machines go from mindless tools to semi-intelligent personal assistants, we have gone from indisputable masters to equals, comrades in the perpetual battle to get things done faster, more easily and with the fewest possible interruptions.

True, many mobile phones have had speech recognition functions for a while now, but Siri’s breakthrough is that you don’t have to articulate every syllable slowly and carefully at a deafening volume. That was a common complaint among mobile phone users, many of whom would sooner use the keypad to make a call than announce it to most of the continent.

As is usually the case, one software innovation is bound to spawn others, which will match or even surpass the first. Already there is talk of Siri being loaded onto other devices such as Apple’s iPod and iPad, and it will presumably go from the status of a novelty function to industry standard.

One might wonder when the day will come that technology outgrows us. It already has a language of its own and is now starting to understand our complex dialect. Again, the world of science fiction has offered many a possible ending to this story, ranging from the fight for robot equality to the decimation of the entire human race. In the meantime, we’ll just enjoy something that everyone secretly relishes: quite literally, telling someone, or something, what to do.

Ms Agius is interested in all things technological and blogs at www.eweandme.blogspot.com
