Mark’s voice to text – digital voice recorder speech recognition experiment – failure.

I detailed the productivity breakthrough I achieved with a voice to text program that comes with Windows: Speech Recognition. It is basically an automatic transcription program that frees you from typing. You can just speak to your computer.

With Microsoft SAPI I have now trained it to about 99% accuracy. It takes a little patience, because when you start you will be at about 80% accuracy. You must train it. Do not just try it once and say it does not work. It does work.

I was motivated to use voice to text because I do not like sitting in front of the computer all day typing. I speak faster than I type, so why not try something to increase productivity? After all, computers are tools rather than an alternative lifestyle, right?

Digital recorder speech to text

On Thursday I bought an Olympus 5500 digital recorder. It was about 39 dollars at Auchan (a French version of Walmart).

I used my Sony microphone and sat on my sofa recording a series of sample sentences. Next I played the recording back to my computer. It was about 95% accurate when transcribing it to text.

I was happy with this as the convenience more than compensated for the few errors. Further, I thought I could train it to understand the digital recorder.

I was elated and thought I had achieved a real breakthrough in productivity that would allow me to blog with ease, even on hikes in the mountains. I even boasted about my success on one of the forums I hang out on. I felt like a scientist.

I went to bed pretty pleased with myself. The next day I did the same, this time in different rooms in the house. I recorded a few blog posts on more complex language learning topics I had been thinking about.

Success with voice to text – failure with digital recorder to text

When I spoke directly to my computer it was about 99% accurate, but with the digital recorder only about 85% on complex subjects. I tested and retested. I concluded that my initial tests the day before had used sentences that were too simple. Voice to text works, but the quality of the digital recorder playback did not allow for complete victory.
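For anyone who wants to put numbers like these on a firmer footing, here is a minimal Python sketch of one common way to score a transcript: count the word-level edits (substitutions, insertions, deletions) needed to turn the recognized text into what was actually said, then divide by the number of words spoken. This is an assumption about how accuracy could be measured, not a description of how the percentages above were obtained, and the function names are made up for illustration.

```python
def word_errors(reference, hypothesis):
    """Return (edit_distance_in_words, number_of_reference_words)."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    # prev[j] = edit distance between the reference words processed so far
    # and the first j recognized words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # a spoken word was dropped
                curr[j - 1] + 1,     # an extra word was inserted
                prev[j - 1] + cost,  # match, or one word swapped for another
            ))
        prev = curr
    return prev[-1], len(ref)


def word_accuracy(reference, hypothesis):
    """Fraction of spoken words recognized correctly, clamped at zero."""
    errors, total = word_errors(reference, hypothesis)
    if total == 0:
        raise ValueError("reference must contain at least one word")
    return max(0.0, 1.0 - errors / total)


said = "I speak faster than I type so why not try it"
heard = "I speak faster than I tight so why not try it"
print(f"{word_accuracy(said, heard):.0%}")
```

Scoring the same set of sentences once spoken live and once played back from the recorder would make the two setups directly comparable.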

Short of retraining SAPI all over again, I saw no fix, so I gave up on the idea of digital voice recorder to text and went back to voice to text by talking directly to my computer.

I returned the digital recorder to the French Walmart and will wait until I can think of a better way. Maybe a direct import from a wave file. I think there are programs that are doing this.

I am testing direct WAV file imports with these open source programs:

Sphinx - Carnegie Mellon University speech recognition

Julius - Japanese voice to text in English

Voxforge - Acoustic model speech recognition
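Before handing a recording to any of these engines, it helps to check what format the WAV file is actually in. The sketch below uses only Python's standard `wave` module. The target format (16 kHz, 16-bit, mono) is an assumption based on what the stock CMU Sphinx English models expect; other engines or models may want something different. The function names are hypothetical.

```python
import wave

# Assumed target format: 16 kHz, 16-bit, mono PCM.
# Adjust these values to match the acoustic model you are using.
EXPECTED = {"channels": 1, "sample_width_bytes": 2, "sample_rate_hz": 16000}


def describe_wav(path):
    """Read basic format details from an uncompressed WAV file."""
    with wave.open(path, "rb") as wav:
        frames = wav.getnframes()
        rate = wav.getframerate()
        return {
            "channels": wav.getnchannels(),
            "sample_width_bytes": wav.getsampwidth(),
            "sample_rate_hz": rate,
            "duration_s": frames / rate if rate else 0.0,
        }


def format_problems(info, expected=EXPECTED):
    """List every property that differs from the expected format."""
    return [
        f"{key}: got {info[key]}, expected {want}"
        for key, want in expected.items()
        if info[key] != want
    ]
```

Calling `format_problems(describe_wav("recording.wav"))` on a recorder file (the file name is just a placeholder) returns an empty list when the format matches, or one line per mismatch, which tells you whether the audio needs converting before import.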

I still think I can do digital recorder to text. However, I think I need to master WAV imports to the speech recognition engine first. If I can do this, then I will splurge and get a slightly better digital recorder as a reward. If this is possible, it will be like cold fusion. I will be in the mountains on some hike, recording blog posts or writing a book.
