Speech Recognition in Medicine:
Diagnosis and Treatment
Speech recognition is now being used to classify, diagnose, and treat conditions. Speech recognition technology is also used as a tool to increase productivity in word processing applications.
A 2010 study used the app EmotionSense to predict emotions with 70% accuracy from voice recordings, movement, and GPS location. The app's graphical user interface (GUI) is shown nearby. A 2011 study detected emotions and analyzed stress with 93.6% accuracy by analyzing pitch and information about glottal vibrational cycles. That study used the AMMON (Affective and Mental health MONitor) library together with a Support Vector Machine (SVM), a supervised machine learning model widely used for emotion recognition.
Dyslexia
In a 2004 study, speech recognition dictation with less than 10% error significantly helped students with learning disabilities (LD), while the controls showed no change, indicating a direct benefit to those with LD.
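To make the "less than 10% error" figure concrete, here is a minimal sketch of word error rate (WER), a common way to score dictation output. The 2004 study's exact error metric is not stated here, so treating it as WER is an assumption, and the function name and sample sentences are hypothetical.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words.

    Assumption: errors are counted as substitutions, insertions and
    deletions of whole words, compared case-insensitively.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] = edit distance between the reference words processed so far
    # (before the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # reference word missing from hypothesis
                curr[j - 1] + 1,     # extra word in hypothesis
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    spoken = "the quick brown fox jumps over the lazy dog today"
    dictated = "the quick brown fox jumped over the lazy dog today"
    print(f"WER: {word_error_rate(spoken, dictated):.2f}")
```

Under this definition, one wrong word in a ten-word sentence sits exactly at the 10% mark, so a system below that threshold makes fewer than one word-level mistake per ten words spoken.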
One commonly used speech recognition program is Nuance Dragon NaturallySpeaking ($90), a popular Windows-based speech-to-text package that has been around since 1997. It performs with 99% accuracy and lets users dictate three times faster than they can type. The macOS version is called Dragon Professional Individual for Mac. This video demonstrates the software's benefits for students with dyslexia. It also benefits many others, including people who are visually impaired and those with fluent aphasia.
Depression
A cross-cultural study from 2016 predicted severe depression in Australian, American, and German participants with up to 97% accuracy using verbal biomarkers. Additionally, groups such as Cogito Health, which came out of MIT, are developing software with funding from DARPA to screen for depression over the phone by analyzing voice tone and pitch, the length and frequency of pauses, and the speed of speech. The image shows mathematical models they are working on, with the last graph showing the confidence level in determining depression. This software was released for public use in 2012.
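As a rough illustration of the pause and speech-rate cues mentioned above, the sketch below derives them from word-level timestamps. It is not Cogito's method: the `Word` type, the 0.25-second pause threshold, and the sample timings are all hypothetical choices made for this example.

```python
from dataclasses import dataclass


@dataclass
class Word:
    text: str
    start: float  # seconds from the start of the recording
    end: float    # seconds from the start of the recording


def prosody_features(words, min_pause=0.25):
    """Summarise pause length, pause frequency and speech rate.

    Assumption: a silence between two consecutive words counts as a
    pause only if it lasts at least `min_pause` seconds.
    """
    if len(words) < 2:
        raise ValueError("need at least two timed words")

    gaps = [nxt.start - cur.end for cur, nxt in zip(words, words[1:])]
    pauses = [g for g in gaps if g >= min_pause]
    duration_min = (words[-1].end - words[0].start) / 60.0

    return {
        "pause_count": len(pauses),
        "mean_pause_sec": sum(pauses) / len(pauses) if pauses else 0.0,
        "pauses_per_min": len(pauses) / duration_min,
        "words_per_min": len(words) / duration_min,
    }


if __name__ == "__main__":
    sample = [
        Word("I", 0.0, 0.2),
        Word("feel", 0.3, 0.6),
        Word("tired", 1.6, 2.0),
        Word("today", 2.1, 2.5),
        Word("again", 4.0, 6.0),
    ]
    for name, value in prosody_features(sample).items():
        print(f"{name}: {value:.2f}")
```

A real screening system would feed features like these, alongside tone and pitch measurements, into a trained model; the numbers on their own do not indicate depression.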
Psychosis
A longitudinal study from 2015 found that the onset of psychosis could be predicted with 100% accuracy over 2.5 years by analyzing two syntactic markers of speech complexity: maximum phrase length and use of determiners (e.g., which). These features predicted future psychosis development better than classification based on clinical interviews. In this video, Mariano Sigman, one of the study's researchers, describes the role of speech recognition in predicting mental health outcomes and how this study contributes to that goal.
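The two syntactic markers named above can be approximated on a transcript with a few lines of code. This is a simplified sketch, not the study's pipeline: it assumes a phrase is the text between punctuation marks (the study's own parsing is not described here), and the `DETERMINERS` set is a short, hypothetical word list.

```python
import re

# Hypothetical, deliberately small determiner list for illustration only.
DETERMINERS = {"the", "a", "an", "this", "that", "these", "those",
               "which", "what", "whose"}


def syntactic_markers(transcript: str) -> dict:
    """Return the longest phrase length (in words) and the determiner rate.

    Assumption: phrases are delimited by punctuation, which only roughly
    approximates a proper syntactic parse.
    """
    phrases = [p for p in re.split(r"[.,;:!?]+", transcript) if p.strip()]
    tokenised = [re.findall(r"[a-z']+", p.lower()) for p in phrases]
    words = [w for phrase in tokenised for w in phrase]
    if not words:
        raise ValueError("transcript contains no words")

    determiner_count = sum(w in DETERMINERS for w in words)
    return {
        "max_phrase_length": max(len(phrase) for phrase in tokenised),
        "determiner_rate": determiner_count / len(words),
    }


if __name__ == "__main__":
    text = "I went to the shop, which was closed. Then I walked home."
    print(syntactic_markers(text))
```

Shorter maximum phrases and fewer determiners would both point to reduced speech complexity under this reading; a classifier trained on labeled transcripts would still be needed to turn the two numbers into a prediction.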
Parkinson's
The Parkinson's Voice Initiative, in collaboration with the University of Rochester Medical Center, has designed a voice-based test on Android that is fast (less than 30 seconds), low cost, and as accurate as clinical objective symptom tests. Since Parkinson's affects the voice as much as it affects limb movements, scientists are able to analyze sustained phonations from voice recordings to detect Parkinson's disease, and even predict the severity of symptoms with 99% accuracy. Max Little, a University of Oxford mathematician involved in the project, explained to the BBC, "We're not intending this to be a replacement for clinical experts, rather, it can very cheaply help identify people who might be at high risk of having the disease and for those with the disease, it can augment treatment decisions by providing data about how symptoms are changing in-between check-ups with the neurologist."
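One way to picture the analysis of a sustained phonation is to measure how steady the vocal fold cycles are. The sketch below computes local jitter, a commonly used voice-stability measure; whether the Parkinson's Voice Initiative uses this exact feature is an assumption, and the function name and sample cycle lengths are hypothetical. It also assumes the cycle lengths have already been extracted from the recording.

```python
def local_jitter(periods):
    """Mean absolute difference between consecutive cycle lengths,
    divided by the mean cycle length.

    `periods` holds the duration in seconds of each successive glottal
    cycle in a sustained vowel. 0.0 means perfectly regular cycles.
    """
    if len(periods) < 2:
        raise ValueError("need at least two cycle lengths")
    if any(p <= 0 for p in periods):
        raise ValueError("cycle lengths must be positive")

    diffs = [abs(b - a) for a, b in zip(periods, periods[1:])]
    mean_diff = sum(diffs) / len(diffs)
    mean_period = sum(periods) / len(periods)
    return mean_diff / mean_period


if __name__ == "__main__":
    # Hypothetical cycle lengths around 10 ms, i.e. a voice near 100 Hz.
    cycles = [0.0100, 0.0102, 0.0098, 0.0100]
    print(f"local jitter: {local_jitter(cycles):.2%}")
```

A measure like this would be only one input among many; a screening tool would combine several such features and compare them against recordings from people with and without the disease.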
Aphasia
Speech recognition is often used in speech therapy. Four studies have found that a speech therapy software called Parrot Software successfully helps treat aphasia, an impairment of speech often caused by brain injuries such as stroke, trauma, tumors, or infections.
A study from 2013 also showed that primary progressive aphasia and its subtypes can be diagnosed successfully using automatic speech recognition (specifically Nuance Dragon NaturallySpeaking), even with noisy recordings. However, many say that speech recognition still has a long way to go, citing the lower-than-desired diagnostic accuracy discussed in many reviews such as this one. Aphasia itself likely makes speech recognition more difficult, depending on the severity of the language impairment.