A woman with severe paralysis can now speak through a digital avatar, thanks to cutting-edge technology that converts her brain signals into both speech and facial expressions.
The breakthrough raises hopes that brain-computer interfaces (BCIs) are on the verge of revolutionising the lives of people who have lost the ability to speak due to conditions such as stroke and ALS.
Until now, such patients have had to rely on sluggish speech synthesizers that require them to spell out words via eye tracking or small facial movements. The new method instead reads signals from tiny electrodes placed on the surface of the brain.
How AI Tech Can Help People With ALS Speak
According to the research report, the brain-surface electrodes detect electrical activity in the brain regions that control speech and facial expression. The gathered signals are then translated directly into the speech and facial expressions of a digital avatar, ranging from smiles and frowns to looks of surprise.
"Our aim is to reinstate a complete and embodied form of communication, that truly mirrors the most instinctive way for us to engage with others," remarked Professor Edward Chang, who spearheaded the research at the University of California, San Francisco (UCSF). "These advancements bring us significantly closer to transforming this into a tangible solution for patients."
About The Patient
The patient, Ann, is a 47-year-old woman who has lived with severe paralysis for more than 18 years following a brainstem stroke. Unable to speak or type, she relies on movement-tracking technology to painstakingly select letters, reaching a rate of up to 14 words per minute. She hopes the avatar technology could one day allow her to work as a counsellor.
The research team affixed a slim, paper-like rectangle comprising 253 electrodes onto Ann's brain surface, specifically over a region crucial for speech. These electrodes intercepted the brain signals that would have governed her tongue, jaw, larynx, and facial muscles had it not been for the stroke.
After the implantation, Ann worked with the team to train the system's AI algorithm, repeatedly attempting to say different phrases so the AI could learn to recognise the distinct brain-signal patterns she produces for different speech sounds.
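The study's actual features and model architecture are not described in this article, but as a rough, hypothetical sketch, the training step amounts to collecting labelled pairs of brain-activity windows and intended speech sounds from those repeated phrases, then fitting a classifier. Everything below, including the synthetic data, is illustrative only.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

N_ELECTRODES = 253   # size of the implanted electrode array
N_PHONEMES = 39      # distinct speech sounds the system decoded
N_WINDOWS = 2000     # illustrative number of labelled activity windows

rng = np.random.default_rng(0)
# Placeholder neural features: one value per electrode per time window.
X = rng.normal(size=(N_WINDOWS, N_ELECTRODES))
# Placeholder labels: which speech sound was being attempted in each window.
y = rng.integers(0, N_PHONEMES, size=N_WINDOWS)

# Fit a simple classifier mapping brain-activity windows to speech sounds.
clf = LogisticRegression(max_iter=1000)
clf.fit(X, y)

# At run time, each new window yields a probability over the 39 sounds.
print(clf.predict_proba(X[:1]).shape)  # (1, 39)
```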
The AI learned to recognise 39 distinct speech sounds, and a language model akin to ChatGPT was used to assemble them into coherent sentences. This output then drove an avatar whose voice was customized to replicate Ann's pre-injury speech, based on a recording from her wedding.
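How a language model turns noisy sound-level guesses into a readable sentence can be shown with a toy example. The candidate sentences and bigram scores below are invented for illustration; the real system uses a far larger model.

```python
# Hypothetical candidate word sequences consistent with the decoded sounds.
candidates = [
    "great to see you",
    "grate two sea ewe",
    "great too see you",
]

# A tiny, made-up bigram log-probability table standing in for the
# much larger language model used in the real system.
bigram_logp = {
    ("great", "to"): -1.0, ("to", "see"): -0.8, ("see", "you"): -0.5,
    ("grate", "two"): -6.0, ("two", "sea"): -7.0, ("sea", "ewe"): -8.0,
    ("great", "too"): -4.0, ("too", "see"): -5.0,
}

def score(sentence: str) -> float:
    """Sum bigram log-probabilities; unseen word pairs get a low default."""
    words = sentence.split()
    return sum(bigram_logp.get(pair, -10.0) for pair in zip(words, words[1:]))

# The language model picks the most plausible reading of the noisy decode.
print(max(candidates, key=score))  # -> "great to see you"
```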
The technology is not flawless: in tests covering more than 500 phrases, it decoded words incorrectly 28% of the time, and it produced brain-to-text at 78 words per minute, short of the 110–150 words per minute of natural conversation.
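The 28% figure is a word-level error rate. As an illustration of how such a number is typically computed (the sentences here are invented, not taken from the study), the standard measure is the word-level edit distance between the decoded text and the intended sentence, divided by the number of intended words.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref, hyp = reference.split(), hypothesis.split()
    # Classic dynamic-programming edit distance over words.
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i
    for j in range(len(hyp) + 1):
        d[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,          # deletion
                          d[i][j - 1] + 1,          # insertion
                          d[i - 1][j - 1] + cost)   # substitution
    return d[len(ref)][len(hyp)] / len(ref)

print(word_error_rate("it is great to see you", "it is grate to see you"))
# -> 0.1666...  (1 wrong word out of 6)
```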
Despite these limitations, the scientists noted that the recent gains in accuracy, speed, and vocabulary suggest the technology has reached a level of practical usefulness for patients.