MS SAPI SpeechRecognitionEngine in C# completely wrong transcription

Question

I'm new to MS SAPI and I'm trying to write a WAV-to-TXT conversion utility in C#/Windows Forms using the SpeechRecognitionEngine class. The transcription comes out completely wrong: the recognized words don't even sound similar to what was spoken. I'm guessing this could be influenced by a long list of factors, such as the sound quality of the input WAV file and the grammar loaded into the recognition engine. I am using the DictationGrammar class.

I'd appreciate any leads from seasoned speech recognition/digital signal processing folks out there.

Answer 1:

There are a few reasons you may be having such disappointing results. First, if you are using a desktop recognizer, you should train it for the speaker.

A second idea is that if you are converting from a WAV file, you must take care when choosing the format of that file. You may have to resample the WAV files, because speech recognition engines only support certain sample rates.

A WAV file with the following format works well on Windows:

8 bits per sample
single channel mono
22,050 samples per second
PCM encoding

See https://stackoverflow.com/a/6203533/90236 for some more info.
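
To make the format advice concrete, here is a minimal sketch of a console program that prints the format fields stored in a WAV file's header and then transcribes the file with a DictationGrammar. It is an illustration, not the asker's actual code: the class name WavToText and its helper methods are made up for this example, the project is assumed to reference the System.Speech assembly on Windows, and the header reader assumes a plain RIFF/WAVE file.

```csharp
using System;
using System.IO;
using System.Speech.Recognition; // assumes a reference to System.Speech (Windows only)
using System.Text;
using System.Threading;

// Hypothetical helper class, named here only for illustration.
static class WavToText
{
    static void Main(string[] args)
    {
        string path = args[0]; // path to the input WAV file
        PrintWavFormat(path);
        Transcribe(path);
    }

    // Reads the "fmt " chunk of a RIFF/WAVE file and prints its basic format fields.
    static void PrintWavFormat(string path)
    {
        using (var reader = new BinaryReader(File.OpenRead(path)))
        {
            string riff = Encoding.ASCII.GetString(reader.ReadBytes(4));
            reader.ReadUInt32(); // overall RIFF size, not needed here
            string wave = Encoding.ASCII.GetString(reader.ReadBytes(4));
            if (riff != "RIFF" || wave != "WAVE")
                throw new InvalidDataException("Not a RIFF/WAVE file.");

            // Walk the chunks until the "fmt " chunk is found.
            while (reader.BaseStream.Position + 8 <= reader.BaseStream.Length)
            {
                string id = Encoding.ASCII.GetString(reader.ReadBytes(4));
                uint size = reader.ReadUInt32();
                if (id == "fmt ")
                {
                    ushort formatTag = reader.ReadUInt16(); // 1 means PCM
                    ushort channels = reader.ReadUInt16();
                    uint sampleRate = reader.ReadUInt32();
                    reader.ReadUInt32();                    // average bytes per second
                    reader.ReadUInt16();                    // block align
                    ushort bitsPerSample = reader.ReadUInt16();
                    Console.WriteLine(
                        "format tag {0}, {1} channel(s), {2} Hz, {3} bits per sample",
                        formatTag, channels, sampleRate, bitsPerSample);
                    return;
                }
                // Chunks are word-aligned: skip the payload plus a pad byte if the size is odd.
                reader.BaseStream.Seek(size + (size & 1), SeekOrigin.Current);
            }
            throw new InvalidDataException("No fmt chunk found.");
        }
    }

    // Runs free-text dictation over the whole file and prints each recognized phrase.
    static void Transcribe(string path)
    {
        using (var engine = new SpeechRecognitionEngine())
        using (var done = new ManualResetEventSlim(false))
        {
            engine.LoadGrammar(new DictationGrammar());
            engine.SetInputToWaveFile(path);
            engine.SpeechRecognized += (s, e) =>
                Console.WriteLine("{0} (confidence {1:F2})", e.Result.Text, e.Result.Confidence);
            // Signal the waiting thread once recognition over the file has finished.
            engine.RecognizeCompleted += (s, e) => done.Set();
            engine.RecognizeAsync(RecognizeMode.Multiple);
            done.Wait();
        }
    }
}
```

If the printed values differ from the format listed above, resampling the file before recognition is the first thing to try. The confidence value printed next to each phrase gives a rough sense of whether a format change helps.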

Source: https://stackoverflow.com/questions/9449348/ms-sapi-speechrecognitionengine-in-c-sharp-completely-wrong-transcription

Tags

speech-recognition

sapi
