ALS Robbed Her of Speech, But Technology Is Changing That
WEDNESDAY, Aug. 23, 2023 (HealthDay News) -- Many people with Lou Gehrig’s disease, also called amyotrophic lateral sclerosis (ALS), first start to lose the ability to move their arms and legs.
That's not Pat Bennett. She can move just fine. She can still dress herself, and she can even use her fingers to type.
But ALS has robbed Bennett, 68, of her ability to speak. She can no longer use the muscles of her lips, tongue, larynx and jaw to make the sounds that add up to speech.
“When you think of ALS, you think of arm and leg impact,” Bennett wrote in an interview conducted by email. “But in a group of ALS patients, it begins with speech difficulties. I am unable to speak.”
New brain-computer interfaces (BCIs) are being developed to restore communication for folks like Bennett, who have been robbed by paralysis of the power of speech.
Two new papers in the scientific journal Nature show how quickly that technology is advancing, based on breakthroughs in software and technology.
Four baby aspirin-sized sensors implanted in Bennett’s brain are now converting her brain waves into words on a computer screen at 62 words per minute -- more than three times faster than the previous record for BCI-assisted communication, Stanford University researchers report.
Meanwhile, another woman who lost her speech to a stroke is now producing nearly 80 words per minute of computer-spoken language, thanks to researchers from the University of California, San Francisco and University of California, Berkeley.
What’s more, the female patient also has a computer avatar that reflects her facial movements as she speaks.
“With these new studies, it is now possible to imagine a future where we can restore fluid conversation to someone with paralysis, enabling them to freely say whatever they want to say with an accuracy high enough to be understood reliably,” said Frank Willett, a staff scientist at Howard Hughes Medical Institute who served as lead researcher for the Stanford study involving Bennett. Willett spoke Tuesday at a news briefing about the two studies.
Both studies involve implanting electrodes that specifically track the brain activity behind speech production -- the signals meant for the facial and voice box muscles that Bennett can no longer control.
Through separate methods, the research teams use computer programs to translate those brain waves into phonemes, the basic building blocks of speech.
For example, the word “hello” contains four phonemes -- “HH,” “AH,” “L” and “OW.”
The focus on phonemes enhances the speed and accuracy of translation software, because the computer only needs to learn 39 phonemes to decipher any word in English, the UCSF and Berkeley researchers noted.
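The core idea -- predict phonemes, then assemble them into words -- can be illustrated with a minimal sketch. This is not the teams' actual software (the real systems use neural networks trained on brain signals, plus a language model); it simply shows how a stream of ARPAbet-style phonemes maps back to words through a lexicon lookup. The lexicon entries here are hypothetical examples.

```python
# Minimal illustration (not the researchers' software): assemble a
# stream of predicted phonemes into words via a small lexicon.
LEXICON = {
    ("HH", "AH", "L", "OW"): "hello",
    ("W", "ER", "L", "D"): "world",
}

def decode(phonemes):
    """Greedily match the longest known phoneme sequence at each position."""
    words, i = [], 0
    while i < len(phonemes):
        for length in range(len(phonemes) - i, 0, -1):
            chunk = tuple(phonemes[i:i + length])
            if chunk in LEXICON:
                words.append(LEXICON[chunk])
                i += length
                break
        else:
            i += 1  # skip an unrecognized phoneme
    return " ".join(words)

print(decode(["HH", "AH", "L", "OW", "W", "ER", "L", "D"]))  # hello world
```

Because English needs only 39 phonemes, a decoder trained on that small inventory can, in principle, spell out any English word.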
Two patients, two research centers, two successes
“The results from both studies, between 60 to 70 words per minute in both of them, [are] a real milestone for our field in general,” said Dr. Edward Chang, chairman of neurological surgery at UCSF and leader of the study there.
"And we're really excited about it, because it's coming from two different patients, two different centers, two different approaches," Chang said at the briefing. "And the most important message is that there is hope that this is going to continue to improve and provide a solution in the coming years."
In Bennett’s case, researchers implanted high-resolution microelectrode arrays that record the activity of single neurons. Surgeons placed two arrays apiece on the surface of two separate regions of her brain involved in speech production.
“Our current focus is to understand how the brain represents speech at the level of individual brain cells and to translate the signals associated with attempted speech into text or spoken words,” said senior researcher Dr. Jaimie Henderson, the Stanford neurosurgeon who placed Bennett’s implants.
The team then used Bennett’s brain impulses to train translation software to accurately convert her attempted utterances into words on a computer screen.
Bennett participated in about 25 four-hour training sessions, where she attempted to repeat random sentences drawn from sample conversations among people talking on the phone.
Examples included “It’s only been that way in the last five years” and “I left right in the middle of it.”
The translator decoded Bennett’s brain activity into a stream of phonemes, then assembled them into words on a computer screen.
After four months of training, the software could convert Bennett’s brain waves into words faster than any previous BCI.
Bennett’s 62-words-per-minute (wpm) pace brings BCI communication closer to the roughly 160-wpm rate that occurs during normal conversation between English speakers, Henderson said.
Bennett received her ALS diagnosis in 2012. Living in the San Francisco Bay area, she’s a former human resources director and was once an equestrian and avid jogger.
“Imagine how different conducting everyday activities like shopping, attending appointments, ordering food, going into a bank, talking on a phone, expressing love or appreciation -- even arguing -- will be when nonverbal people can communicate their thoughts in real time,” Bennett wrote.
A voice with her avatar
The UC researchers pursued the same concept along slightly different lines.
They placed a larger single brain implant -- a paper-thin rectangle of 253 electrodes -- onto the surface of a female patient’s speech centers. By comparison, Bennett’s four implants were arrays of 64 electrodes each, arranged in 8-by-8 grids.
A stroke had cost the woman her ability to speak, but the electrodes intercepted the brain signals that would have gone to her face, tongue, jaw and voice box.
The woman then worked with the UC team for weeks to train the system’s speech translator, by repeating different phrases from a 1,024-word conversational vocabulary. As with the Stanford project, this software also focused on translating brain impulses into phonemes.
But instead of words on a screen, the computer synthesized her neural activity into audible speech. What’s more, it was the woman’s own voice emerging from the computer.
“Using a clip from her wedding video, we were able to decode these sounds into a voice that sounded just like her own prior to her stroke,” said Sean Metzger, a bioengineering graduate student at UCSF/UC Berkeley who helped develop the text decoder.
The team also created an animated avatar that simulates the muscle movements of the woman’s face as she produces words. Not only did the avatar reflect what was being said, but it also could reproduce facial movements for such emotions as happiness, sadness and surprise.
“Speech isn't just about communicating just words, but also who we are,” Chang said. “Our voice and expressions are part of our identity. So we wanted to embody a prosthetic speech that could make it more natural, fluid and expressive.”
Accuracy and speed are the goal
Both teams found that focusing on phonemes markedly improved translation speed and accuracy.
“We decoded sentences using a vocabulary of over a thousand words with a 25% word error rate at 78 words per minute,” Metzger said. “Offline, we saw that loosening these vocabulary constraints to over 39,000 words barely increased the error rate to 27.8%, showing that in our models phoneme predictions can be reliably connected to form the right words."
By comparison, the Stanford team’s translator had a word error rate of 23% when using a potential vocabulary of 125,000 words, Willett said.
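The word error rates both teams cite are a standard speech-recognition metric: the number of word-level edits (substitutions, insertions, deletions) needed to turn the decoded sentence into the intended one, divided by the intended sentence's length. A minimal sketch, using one of the training sentences from the Stanford sessions as the reference and a hypothetical decoding error:

```python
def word_error_rate(reference, hypothesis):
    """Word error rate: word-level edit distance / reference word count."""
    ref, hyp = reference.split(), hypothesis.split()
    # Levenshtein distance over words, via dynamic programming.
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i
    for j in range(len(hyp) + 1):
        d[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,          # deletion
                          d[i][j - 1] + 1,          # insertion
                          d[i - 1][j - 1] + cost)   # substitution
    return d[len(ref)][len(hyp)] / len(ref)

# One wrong word out of eight -> 12.5% word error rate.
print(word_error_rate("i left right in the middle of it",
                      "i left right in a middle of it"))  # 0.125
```

By this measure, a 25% word error rate means roughly one word in four comes out wrong, which matches how Chang describes the current state of the systems.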
“They're actually very overlapping in the kind of outcomes that were achieved,” Chang said of the two research projects. “We're thrilled that there is this level of mutual validation and accomplishment in our long-term goal to restore communication for people who have lost it through paralysis."
Both teams said they want to continue to refine their individual processes by using more sophisticated translation software and better, more elaborate electrode arrays.
“Right now, we're getting one out of every four words wrong. I hope the next time you talk to us, we're getting maybe one of every 10 words wrong,” Chang said. “One pathway that we're really excited about exploring is just more electrodes. We need more information from the brain. We need a clearer picture of what's happening."
Henderson likened it to the end of the broadcast TV era -- "the old days," if you will.
“We need to continue to increase the resolution to HD and then on to 4K so that we can continue to sharpen the picture and do it better and improve the accuracy,” he said.
The RAND Corporation has more about brain-computer interfaces.
SOURCES: Pat Bennett, ALS patient; Frank Willett, PhD, staff scientist, Howard Hughes Medical Institute, Stanford University, Stanford, Calif.; Jaimie Henderson, MD, professor, neurosurgery, Stanford University; Edward Chang, MD, chair, neurological surgery, University of California, San Francisco; Sean Metzger, MS, graduate student, joint Bioengineering Program, University of California, San Francisco and University of California, Berkeley; Nature, Aug. 23, 2023