I do interview a lot of people whom I end up writing about. But between the meeting and the writing lies a lot of transcription of long interviews. I have always wanted technology to help me there. A few years ago, I discovered a wonderful paid app called SoundNote that let me record audio and take notes alongside it. The beauty of this simple app was that the notes were linked to the audio: when I clicked on a line, it would show me what was being said at that exact point. While this helped me organise my notes, it also pushed me into the habit of jotting down just keywords during interviews. This meant there was more transcription staring at me.

Over the years, I tried a lot of ways to get technology to solve this problem for me. I would play the recordings and see if the ‘dictate’ option on Google Docs or Apple Notes could write them down for me. These worked in bits and pieces, but there was never a permanent solution. Even as voice recognition technologies got better and better, this seemed like one area where I could still use some help.

So I was pleasantly surprised when, over the past couple of weeks, I discovered two apps that use the latest technologies for voice recognition and transcription. The first is an app called Tetra, which lets you transcribe telephone calls. Yes, you need to make the call via the app, but it will give you a text of what transpired within a few minutes of the call ending. The other app, Otter Voice Meeting Notes, was suggested by a journalist friend. This app records meetings or other conversations and transcribes what was said, recognising and tagging different voices in the process. Of course, neither of the apps is perfect.
But they can do around 80% of your work, which is good enough. At the moment, the stability of your internet connection during the process and the amount of ambient noise during the recording seem to affect the accuracy of the transcription. Anyway, for me, this is a clear sign that artificial intelligence can actually help get things done more efficiently.

Last week, I also happened to see a demo of the latest offerings from US entertainment technology company TiVo. Charles Dawes, TiVo’s senior director for international marketing, showed me how their box can now understand voice. But it is no longer just about recognising a voice and executing the command. TiVo’s Experience 4 software, which the company is pitching to partners in India, can understand the meaning and context of the command. For instance, using the right metadata along with machine learning powered by artificial intelligence, the box understands that when I say ‘Tom Cruise,’ it should look for content related to the Hollywood star across all sources. Then, when I say ‘Nicole Kidman,’ it starts looking for content featuring both stars instead of switching to just Kidman movies. A lot of similar context setting is now visible even in how Amazon’s Alexa works in India, where we have a different way of expressing things in English.

As luck would have it, minutes after the TiVo demo, I got to meet Carrie Lazorchak and Jason Stirling of Nuance Communications, which has been at the forefront of voice and natural language understanding technologies for many years now. Stirling stressed the impact of voice technologies in a country like India where, because of illiteracy and the limited reach of English, this medium gives access to millions of new adopters of technologies like smartphones. “Language modelling is a daily game.
The more we get, the better we get at it,” explains Stirling, adding that if we apply meaning extraction with natural language processing, overall system performance goes up many notches. Lazorchak chips in that the challenges of diversity posed by a country like India necessitate local partnerships. “Our technology performs the best when augmented by local businesses that really understand the culture.” She is convinced that regional languages will be a huge usage space, especially in the Indian market, with higher adoption. “It will bring a lot of services into this space, especially in rural areas.”