In a first, scientists have created a system that translates thoughts into intelligible, recognisable speech, an advance that may help people who cannot speak regain their ability to communicate with the outside world. By monitoring someone’s brain activity, the technology developed by researchers from Columbia University in the US can reconstruct the words a person hears with unprecedented clarity. The breakthrough, which harnesses the power of speech synthesisers and artificial intelligence, could lead to new ways for computers to communicate directly with the brain. It also lays the groundwork for helping people who cannot speak, such as those living with amyotrophic lateral sclerosis (ALS) or recovering from stroke, regain their ability to communicate with the outside world, researchers said.
“Our voices help connect us to our friends, family and the world around us, which is why losing the power of one’s voice due to injury or disease is so devastating,” said Nima Mesgarani, of Columbia University in the US. “With today’s study, we have a potential way to restore that power. We’ve shown that, with the right technology, these people’s thoughts could be decoded and understood by any listener,” said Mesgarani, a principal investigator of the study published in the journal Scientific Reports.
Decades of research have shown that when people speak, or even imagine speaking, telltale patterns of activity appear in their brain. Distinct patterns of signals also emerge when we listen to someone speak, or imagine listening. Experts trying to record and decode these patterns see a future in which thoughts need not remain hidden inside the brain but could instead be translated into verbal speech at will. However, accomplishing this feat has proven challenging. Researchers’ early efforts to decode brain signals focused on simple computer models that analysed spectrograms, which are visual representations of sound frequencies.
Because this approach failed to produce anything resembling intelligible speech, the team turned instead to a vocoder, a computer algorithm that can synthesise speech after being trained on recordings of people talking. “This is the same technology used by Amazon Echo and Apple Siri to give verbal responses to our questions,” said Mesgarani. Researchers plan to test more complicated words and sentences next, and they want to run the same tests on brain signals emitted when a person speaks or imagines speaking. Ultimately, they hope their system could be part of an implant, similar to those worn by some epilepsy patients, that translates the wearer’s thoughts directly into words.