Analyzing An Ataxic Dysarthria Patient's Speech with Computer Vision and Audio Processing
Hey everyone, as you know I have been doing research on patients like myself who have Ataxic Dysarthria and other neurological speech disorders related to diseases and conditions that affect the brain. I was analyzing this file with a few programs that I have written.
The findings are very informative, and I am excited to be able to explain this to my Tumblr following, as I feel it not only promotes awareness but also provides an understanding of what we go through with Ataxic Dysarthria.
Analysis of the audio file with an Intonation Visualizer I built
As you can tell, this uses a heatmap to visualize the loudness and softness of a speaker's voice. I used it to analyze the file and found some really interesting and telling signs of Ataxic Dysarthria.
At 0-1 seconds it is mostly pretty quiet, which is normal because it is harder for patients with AD to get their speech started. You can notice that around 1-3 seconds it gets louder, and then when she speaks, it's clearer and louder than the patient's voice. However, the AD makes the patient's speech constantly rise and fall in loudness, between around -3 and 0 decibels, for most of the audio when the patient is speaking. The loudness shifts quickly within that range, which is a common characteristic of AD.
The combination of the constant rising and falling in loudness and intonation as well as problems getting sentences started is one of the things that makes it so hard for people to understand those with Ataxic Dysarthria.
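For anyone who wants to try something similar, here is a minimal sketch of one way to get per-frame loudness values like the ones a heatmap would show, plus a simple number for how quickly the loudness swings. It is not the Intonation Visualizer itself: the function names, the 50 ms frame size, the -60 dB floor, and the synthetic test tone are all assumptions made for illustration.

```python
import math

def frame_loudness_db(samples, sample_rate, frame_seconds=0.05, floor_db=-60.0):
    """Return one loudness value per frame, in dB relative to the loudest frame.

    The loudest frame sits at 0 dB and quieter frames are negative.
    """
    frame_len = max(1, round(sample_rate * frame_seconds))
    rms_values = []
    for start in range(0, len(samples), frame_len):
        frame = samples[start:start + frame_len]
        # Root-mean-square amplitude of this frame
        rms_values.append(math.sqrt(sum(s * s for s in frame) / len(frame)))
    peak = max(rms_values, default=0.0)
    if peak == 0.0:
        # Pure silence (or no audio at all): report the floor everywhere
        return [floor_db] * len(rms_values)
    return [max(floor_db, 20 * math.log10(r / peak)) if r > 0 else floor_db
            for r in rms_values]

def mean_loudness_change(loudness_db):
    """Average absolute dB change between consecutive frames."""
    if len(loudness_db) < 2:
        return 0.0
    steps = [abs(b - a) for a, b in zip(loudness_db, loudness_db[1:])]
    return sum(steps) / len(steps)

# Synthetic stand-in for a recording: one quiet second, then one loud second
sample_rate = 8000
tone = [math.sin(2 * math.pi * 200 * n / sample_rate)
        for n in range(2 * sample_rate)]
samples = [(0.1 if n < sample_rate else 1.0) * s for n, s in enumerate(tone)]

loudness = frame_loudness_db(samples, sample_rate)
print(len(loudness), "frames, average change per frame:",
      round(mean_loudness_change(loudness), 2), "dB")
```

With a real recording, the `samples` list would come from the decoded audio file instead of the generated tone. A voice that keeps bouncing between -3 and 0 dB would give a higher `mean_loudness_change` than a steady one, which is one rough way to put a number on the rise and fall described above.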
For the second method, I plotted a line graph that gives an example of the patient's rate of speech and elongated syllables.
As you can see, I primarily used the Google Speech Recognition library to transcribe the speech, and counted the syllables using Pyphen via the "hyphenated" (elongated) words in the patient's speech. This isn't the most effective method, but it worked well for this example, and here are the results plotted out using Matplotlib:
As you can see, when they first started talking there was a rise from the softer speech: as the patient's voice got louder, they were also speaking faster (common for those with AD and HD). My hypothesis (and personal experience) is that this is how we try to get our words out so we can be understood, by "forcing" out words, which results in the rise and fall of syllables and rate of speech that we see in the first part. The other spikes typically happen when she speaks, but there is another spike at the end, which you can see as well, when the patient tries to force more words out.
This research already indicates a pretty clear pattern in what is going on in the patient's speech. As they try to force out words, their speech gets faster and thus louder as they try to communicate.
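To make the rate-of-speech idea concrete, here is a rough sketch of turning transcribed chunks of audio into syllables per second. It assumes the transcription step has already produced `(start_seconds, end_seconds, text)` chunks, which is a made-up input format for this example. The syllable counter is a crude vowel-group heuristic standing in for Pyphen so the sketch runs with no extra installs, so its counts will differ from Pyphen's.

```python
import re

def count_syllables(word):
    """Crude estimate: count runs of vowels, with a minimum of one per word."""
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))

def speech_rate(chunks, count=count_syllables):
    """Return (start_seconds, syllables_per_second) for each transcribed chunk.

    `chunks` is a list of (start_seconds, end_seconds, text) tuples.
    """
    rates = []
    for start, end, text in chunks:
        words = re.findall(r"[A-Za-z']+", text)
        syllables = sum(count(w) for w in words)
        duration = end - start
        rates.append((start, syllables / duration if duration > 0 else 0.0))
    return rates

# Hypothetical chunks, not taken from the analyzed recording
example_chunks = [
    (0.0, 2.0, "hello there"),
    (2.0, 3.0, ""),
    (3.0, 4.0, "I am trying to talk"),
]
for start, rate in speech_rate(example_chunks):
    print(f"{start:4.1f}s  {rate:.1f} syllables/sec")
```

The resulting pairs can be handed straight to a Matplotlib line plot, with the start times on the x-axis and the rates on the y-axis. If Pyphen is installed, a counter along the lines of `lambda w: pyphen.Pyphen(lang='en_US').inserted(w).count('-') + 1` could be passed as `count` instead of the heuristic.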
I hope this has been informative for those who don't know much about speech pathology or neurological diseases. I know it's already showing a lot of exciting progress and I am continuing to develop scripts to further research on this subject so maybe we can all understand neurological speech disorders better.
As I said, I will be posting my research and findings as I go. Thank you for following me and keeping up with my posts!