It is not easy to bridge the communication gap between small-town artisans trying to sell their artifacts and tourists from other countries. In a recent pilot, the solution came from speech technologies quite similar to those used when we give banking instructions over the phone, check in for a flight by phone or check the status of an insurance policy.
The artisans could speak to the system in their native language and learn English. It is like teaching conversational English through Hindi and up to nine more regional Indian languages.
Similarly, the US military has used systems in which soldiers speak into a microphone attached to a computer, and software translates their speech into a language understood by Iraqi soldiers and civilians.
"We are seeking to provide rural India with social and financial services through our speech technology embedded in cell phones," says Sunny Rao, managing director, Nuance India. More common applications include automobiles, call centers, medical systems and transcription services. While estimates for the speech technology market as a whole are not readily available, medical transcription alone is estimated to be worth $7 billion to $10 billion. This segment involves automating the work of medical transcription companies.
Most transcription has traditionally been done by human transcriptionists employed by hospitals, clinics and medical offices. The benefits of an automated solution are immense. For example, a radiologist at a hospital reviewing an x-ray would traditionally dictate a diagnosis into a machine; the recording would later be manually transcribed by a typist and sent back to the doctor. Nuance's Dragon speech technology, for instance, is able to transcribe the dictation into a script, which is then sent to Nuance's call center in India, where it is proofread by a pool of editors to ensure no discrepancies exist and then sent back to the doctor in document format. The call center market for speech technology is more difficult to estimate, since many different technologies and services are involved. Conservative estimates say that companies spend $100 billion a year on call center solutions, most of it on labour; the figure also includes spending on speech technology.
Last year, Nuance alone brought in $300 million in revenue from speech automation solutions for call centers. If, as some analysts say, that is 60% of the market, the entire market can be pegged at about $500 million.
Microsoft is believed to hold about 10% of this market.
There are between 1 billion and 1.5 billion handsets, 50 million cars and 30 million navigation devices. These are all potential platforms for speech technology. Today, users don't want to simply use a device anymore; they want the device to understand them. With IVR (interactive voice response), a device that recognises a user's voice or understands a spoken command is not only useful but could also change the way we live. Nuance Communications, for instance, claims that its mobile solutions are available on 3 billion devices, primarily mobile phones, which carry both its speech technology offering and predictive text. "The idea is to promote the technology not as a solution but as an experience. Today's speech technology replaces the old 'Press 1, 2, 3' IVR technology," says James Brooks, senior vice-president, Asia Pac, Nuance.
His company boasts a wide portfolio of products for nearly every niche in speech technologies. It recorded revenue of $919 million in fiscal 2008 and expects $1.1 billion for 2009.
While large software players do have an interest in speech technology, they also realise that this is a very niche market and not significant enough for them to invest the resources needed, according to analysts.
IBM, for instance, has a 40-year history of investment in speech technology but has chosen to partner with Nuance to embed its research into Nuance's products and get into niches that might be too small for IBM. The deal entails 100 Nuance researchers conducting collaborative research with IBM's research and development team on call center and mobile technology. Nuance, however, holds the responsibility for taking the jointly developed technology to market, and it also holds the sole distribution rights for this technology.
While the partnership with IBM is not exclusive, Nuance claims that IBM does not have a similar agreement with other speech recognition partners. Recogniser 9 is jointly developed with IBM, and the two companies are working together on automotive and mobile speech technology.
Advancement in the technology means enhancing natural language capability. For example, while booking a ticket, one would not have to answer specific questions of what, when, where and how many but could simply say, "I need a ticket from Delhi to New York on Friday, March 20 on United Airlines for two people," and the speech technology would be able to pick the pertinent pieces from the spoken language.
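To make the idea of "picking the pertinent pieces" concrete, here is a minimal sketch in Python of how the details might be pulled out of the already-transcribed sentence. It is an illustration only, not how Nuance's or anyone else's product works: the function name `extract_booking_slots`, the field names and the hand-written patterns are all assumptions, and real systems rely on statistical language understanding rather than a handful of fixed rules.

```python
import re

# Hypothetical helper table: spoken number words mapped to integers.
NUMBER_WORDS = {"one": 1, "two": 2, "three": 3, "four": 4, "five": 5}

WEEKDAYS = "Monday|Tuesday|Wednesday|Thursday|Friday|Saturday|Sunday"


def extract_booking_slots(utterance):
    """Pull origin, destination, date, airline and passenger count
    out of a single transcribed booking request (illustrative rules only)."""
    slots = {}

    # "from <origin> to <destination> on ..."
    route = re.search(
        r"\bfrom\s+(?P<origin>.+?)\s+to\s+(?P<destination>.+?)\s+on\b", utterance
    )
    if route:
        slots["origin"] = route.group("origin")
        slots["destination"] = route.group("destination")

    # "on <weekday>, <month> <day>"
    date = re.search(r"\bon\s+((?:%s),\s+\w+\s+\d{1,2})" % WEEKDAYS, utterance)
    if date:
        slots["date"] = date.group(1)

    # "on <Capitalised Name> Airlines"
    airline = re.search(r"\bon\s+([A-Z]\w*(?:\s+[A-Z]\w*)*\s+Airlines)\b", utterance)
    if airline:
        slots["airline"] = airline.group(1)

    # "for <count> people"
    count = re.search(r"\bfor\s+(\w+)\s+(?:people|persons|passengers)\b", utterance)
    if count:
        word = count.group(1).lower()
        if word.isdigit():
            slots["passengers"] = int(word)
        elif word in NUMBER_WORDS:
            slots["passengers"] = NUMBER_WORDS[word]

    return slots


request = ("I need a ticket from Delhi to New York on Friday, March 20 "
           "on United Airlines for two people")
print(extract_booking_slots(request))
```

The point of the sketch is the shift it illustrates: instead of the caller answering one question per prompt, the system fills in every field it can find from a single sentence and would need to ask only about whatever is still missing.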
Microsoft, the other elephant in the room, has its own speech technology, which is provided as a free add-on built into Windows. Nuance's stand-alone desktop Dragon technology competes with this and brings in about $50 million in business.
Microsoft's acquisition of Tellme has enabled it to foray into the directory assistance market and leverage its search capability, an area in which it has an edge over Nuance. Google Phone Search is also leveraging Google's capability in search to foray into speech recognition; among local languages, it supports Hindi and Telugu.
Having long captured public attention, speech recognition systems seem set to break into the mainstream and capture the public's wallets too.