Common Sense: From lips and text to speech now from thought to speech in AI

Thursday, May 04, 2023

From lips and text to speech now from thought to speech in AI

About 20 years ago, major efforts went into reading lips from video and matching the lip movements to speech. Then came text-to-speech, which became very successful!

This is very early research!

Beware of the usual alarmism and hysteria about, e.g., mind readers!

"The little voice inside your head can now be decoded by a brain scanner — at least some of the time. Researchers have developed the first non-invasive method of determining the gist of imagined speech, presenting a possible communication outlet for people who cannot talk. But how close is the technology — which is currently only moderately accurate — to achieving true mind-reading? ...

Most existing thought-to-speech technologies use brain implants that monitor activity in a person’s motor cortex and predict the words that the lips are trying to form. ... computer scientists ... combined functional magnetic resonance imaging (fMRI) ... with artificial intelligence (AI) algorithms called large language models (LLMs), which underlie tools such as ChatGPT and are trained to predict the next word in a piece of text. ...
But many of the sentences it produced were inaccurate. The researchers also found that it was easy to trick the technology. ..."

From the abstract:

"A brain–computer interface that decodes continuous language from non-invasive recordings would have many scientific and practical applications. Currently, however, non-invasive language decoders can only identify stimuli from among a small set of words or phrases. Here we introduce a non-invasive decoder that reconstructs continuous language from cortical semantic representations recorded using functional magnetic resonance imaging (fMRI). Given novel brain recordings, this decoder generates intelligible word sequences that recover the meaning of perceived speech, imagined speech and even silent videos, demonstrating that a single decoder can be applied to a range of tasks. We tested the decoder across cortex and found that continuous language can be separately decoded from multiple regions. As brain–computer interfaces should respect mental privacy, we tested whether successful decoding requires subject cooperation and found that subject cooperation is required both to train and to apply the decoder. Our findings demonstrate the viability of non-invasive language brain–computer interfaces."

Mind-reading machines are here: is it time to worry? Neuroethicists are split on whether a study that uses brain scans and AI to decode imagined speech poses a threat to mental privacy.

Semantic reconstruction of continuous language from non-invasive brain recordings (no public access)
