CMUSphinx PocketSphinx - Recognize all (or large amount) of words

Submitted by 坚强是说给别人听的谎言 on 2019-12-17 03:40:20

Question


Before I tried PocketSphinx for Android, I used Google's voice recognition API. I didn't need to set a search name or a dictionary file; it simply recognized every word that was spoken.

Now, in PocketSphinx, I need to set both. But I can only find how to set up recognition for a single word, or how to set a dictionary (the ones in the demo project contain only a few words). The recognizer then assumes those are the only words that exist, so if someone says something similar, it reports the closest word listed in the dictionary.

How can I set several search names, or configure it to recognize all available words (or at least a large number of them)? Does anyone have a dictionary file with a large number of words?


Answer 1:


"Before I tried PocketSphinx for Android, I used Google's voice recognition API. I didn't need to set a search name or a dictionary file; it simply recognized every word that was spoken."

The Google API also recognizes a large but still limited set of words. For a long time it failed to recognize "Spotify". Google's offline speech recognizer uses about 50k words, as described in their publication.

"How can I set several search names, or configure it to recognize all available words (or at least a large number of them)? Does anyone have a dictionary file with a large number of words?"

The demo includes large vocabulary speech recognition with a language model (the forecast part). Bigger language models for English are available for download, for example the En-US generic language model.

Simple code to run the recognition looks like this:

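    // Imports needed at the top of the file (pocketsphinx-android):
    //   import java.io.File;
    //   import static edu.cmu.pocketsphinx.SpeechRecognizerSetup.defaultSetup;

    // Configure the recognizer with the acoustic model and the full dictionary.
    recognizer = defaultSetup()
        .setAcousticModel(new File(assetsDir, "en-us-ptm"))
        .setDictionary(new File(assetsDir, "cmudict-en-us.dict"))
        .getRecognizer();
    recognizer.addListener(this);

    // Register a large-vocabulary search backed by the n-gram language model.
    // NGRAM_SEARCH is the name you choose for this search.
    recognizer.addNgramSearch(NGRAM_SEARCH, new File(assetsDir, "en-us.lm.bin"));

    // Start listening with that search.
    recognizer.startListening(NGRAM_SEARCH);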

However, large models are not easy to fit on a device and decode in real time. To decode large-vocabulary speech in real time, you need to stream audio to a server, or restrict the vocabulary and language to a small subset of generic English. You can learn more about speech recognition with CMUSphinx in the tutorial.
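One way to restrict the vocabulary is to trim the full pronunciation dictionary to just the words your application expects. The following is a minimal sketch in plain Java, not part of PocketSphinx: the class `DictionarySubset` and its methods are hypothetical helpers, and the sample entries are only illustrative. It assumes the dictionary uses the usual CMUdict text layout, one entry per line with the word first and its phonemes after it, and alternate pronunciations marked like `hello(2)`.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper (not a PocketSphinx class): keeps only the dictionary
// entries whose headword appears in the wanted vocabulary.
class DictionarySubset {

    // Returns the headword of a line such as "hello(2) HH EH L OW" with the
    // alternate-pronunciation marker "(2)" stripped, or null for a blank line.
    static String headword(String line) {
        String trimmed = line.trim();
        if (trimmed.isEmpty()) {
            return null;
        }
        String token = trimmed.split("\\s+", 2)[0];
        int paren = token.indexOf('(');
        return paren > 0 ? token.substring(0, paren) : token;
    }

    // Keeps every entry (including alternates) whose headword is in the
    // vocabulary. Matching is case-insensitive; input order is preserved.
    static List<String> filter(List<String> dictionaryLines, Set<String> vocabulary) {
        Set<String> wanted = new HashSet<>();
        for (String word : vocabulary) {
            wanted.add(word.toLowerCase(Locale.ROOT));
        }
        List<String> kept = new ArrayList<>();
        for (String line : dictionaryLines) {
            String word = headword(line);
            if (word != null && wanted.contains(word.toLowerCase(Locale.ROOT))) {
                kept.add(line.trim());
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // Illustrative entries only; a real run would read cmudict-en-us.dict.
        List<String> full = Arrays.asList(
            "hello HH AH L OW",
            "hello(2) HH EH L OW",
            "help HH EH L P",
            "world W ER L D");
        Set<String> vocabulary = new HashSet<>(Arrays.asList("hello", "world"));
        for (String line : filter(full, vocabulary)) {
            System.out.println(line);
        }
    }
}
```

The filtered lines can be written to a smaller `.dict` file and passed to `setDictionary`. A smaller dictionary alone is not enough: the language model or grammar used for the search should cover the same reduced vocabulary.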




Answer 2:


Update: as of 2019, I recommend that everyone try the Kaldi library on Android. You can find the demo here. It is a genuine large vocabulary speech recognizer that runs in real time (70k words in the LM).



Source: https://stackoverflow.com/questions/25949295/cmusphinx-pocketsphinx-recognize-all-or-large-amount-of-words
