Open Source Software For Transcribing Speech in Audio Files

Submitted by 只愿长相守 on 2019-12-02 20:36:26

Why can't it read a wav?

It tells you that the file has the wrong sampling rate (8000 Hz) instead of the requested one (16000 Hz). The sampling rate is very important for speech recognition software, because the acoustic model expects audio at the rate it was trained on.
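To check which sampling rate a file really has before feeding it to a recognizer, you can read it straight from the WAV header. This is only a minimal sketch: `wav_rate` is a made-up helper name, and it assumes a canonical PCM WAV file in which the "fmt " chunk directly follows the RIFF header, so that the rate sits at byte offset 24.

```shell
#!/bin/sh
# wav_rate FILE: print the sample rate stored in a canonical WAV header.
# Hypothetical helper; assumes the "fmt " chunk immediately follows the
# 12-byte RIFF/WAVE header, which puts the rate at bytes 24..27.
wav_rate() {
    # Read the four bytes as unsigned decimals and split them into $1..$4.
    set -- $(od -An -tu1 -j24 -N4 "$1")
    # The rate is stored little-endian, so the first byte is the lowest.
    echo $(( $1 + $2 * 256 + $3 * 65536 + $4 * 16777216 ))
}
```

Calling `wav_rate input.wav` prints the rate in Hz, which you can compare with the rate the recognizer asks for.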

Why can't it read /dev/dsp?

Recent versions of Ubuntu use the PulseAudio framework instead of OSS. The version you are trying uses OSS, so you need to install the oss-compatibility package from your distribution to bring OSS support back.

Alternatively, you can try a newer version of Julius, which has PulseAudio support.
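Before starting the recognizer you can check whether the OSS device node is present at all. The helper below is just an illustration: `oss_available` is a hypothetical name, and the only thing it tests is that the path exists as a character device, not that recording from it actually works.

```shell
#!/bin/sh
# oss_available [DEVICE]: succeed if DEVICE (default /dev/dsp) exists
# as a character device. Hypothetical helper for illustration only.
oss_available() {
    dev=${1:-/dev/dsp}
    [ -c "$dev" ]
}
```

For example, `oss_available || echo "no /dev/dsp, OSS support is missing" >&2` prints a hint when the device is absent.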

Why does it then appear to be able to read /dev/dsp, but not react in any way?

Audio input doesn't work properly.

Has anyone else had any success with open source speech recognizers, especially on Linux?

Sure, check this video as an example of what people do with CMUSphinx:

http://www.youtube.com/watch?v=vfaNLIowSyk

I suggest you revisit the CMUSphinx package, which is a leading open source speech recognition engine. There is plenty of documentation on the website; you just need to read it. Remember that speech recognition is a complex area where you can get great results, but you also need to invest time in understanding the technology, just like in any other domain.

In short, to transcribe a file with CMUSphinx you need to do the following 3 simple steps:

  1. Take the wav file and resample it to an 8 kHz, 16-bit mono file with sox:
    sox input.wav -r 8000 -c 1 -b 16 resampled.wav
  2. Install pocketsphinx 0.7:
    apt-get install pocketsphinx
  3. Decode the file:
    pocketsphinx_continuous -samprate 8000 -infile resampled.wav

The result will be printed to standard output. To suppress the log output, redirect stderr to /dev/null:

    pocketsphinx_continuous -samprate 8000 -infile resampled.wav 2> /dev/null
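If you have more than one file to transcribe, the resample and decode steps can be wrapped in a small shell function. This is a sketch, not an official CMUSphinx script: `transcribe` and the `.8k.wav` output name are my own choices, and it assumes `sox` and `pocketsphinx_continuous` are installed and on your PATH.

```shell
#!/bin/sh
# transcribe FILE.wav: resample to 8 kHz 16-bit mono, then decode.
# Hypothetical helper; the ".8k.wav" suffix is an arbitrary naming choice.
transcribe() {
    in=$1
    # Keep a .wav extension on the resampled copy.
    out="${in%.wav}.8k.wav"
    sox "$in" -r 8000 -c 1 -b 16 "$out" || return 1
    # Only the recognition result goes to stdout; the log is discarded.
    pocketsphinx_continuous -samprate 8000 -infile "$out" 2> /dev/null
}
```

A loop such as `for f in *.wav; do transcribe "$f" > "${f%.wav}.txt"; done` would then write one text file per recording.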