Restricting speech recognition results on Android

攒了一身酷 2020-12-10 20:41

I'm making an app that allows people to speak and select between a few options (Strings). I'm having a little problem making the Android Speech Recognizer fit my idea.

3 Answers
  • 2020-12-10 21:29

    No, you cannot pass parameters that restrict the recognition or help it make the best match. You have to implement that yourself.

    What you want to do is use an algorithm that matches what Android's speech recognizer returns against your desired options. This is especially important when your app has to recognize words that Android's recognizer does not handle well, such as "cumin".

    For this you can use phonetic matching algorithms like the ones here

    For some implementations and sample code on Android check out this open source project: GAST.
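
As an illustration of the post-processing idea above, here is a minimal sketch in plain Java that compares each recognizer hypothesis with the allowed options using Soundex, one common phonetic matching algorithm. It is not taken from GAST; the class name `OptionMatcher` and the method `bestOption` are hypothetical, and the hypotheses are assumed to be the list of strings the recognizer hands back, best guess first.

```java
import java.util.List;
import java.util.Locale;
import java.util.Optional;

// Hypothetical helper, not part of Android or GAST.
final class OptionMatcher {

    // Soundex digit for each letter a..z; '0' marks letters that are not coded.
    private static final String CODES = "01230120022455012623010202";

    /** Returns the four-character Soundex code of the input (non-letters are skipped). */
    static String soundex(String text) {
        StringBuilder out = new StringBuilder();
        char last = '0';
        for (char ch : text.toLowerCase(Locale.ROOT).toCharArray()) {
            if (ch < 'a' || ch > 'z') {
                continue; // ignore spaces, digits, punctuation
            }
            char code = CODES.charAt(ch - 'a');
            if (out.length() == 0) {
                // Keep the first letter itself, but remember its code
                // so that an immediately repeated sound is not coded twice.
                out.append(Character.toUpperCase(ch));
                last = code;
            } else if (ch != 'h' && ch != 'w') {
                // 'h' and 'w' are skipped without separating equal codes;
                // vowels are not emitted but do separate equal codes.
                if (code != '0' && code != last) {
                    out.append(code);
                }
                last = code;
            }
            if (out.length() == 4) {
                break;
            }
        }
        while (out.length() < 4) {
            out.append('0'); // pad short codes with zeros
        }
        return out.toString();
    }

    /**
     * Returns the first option that sounds like one of the hypotheses,
     * trying the hypotheses in the order given.
     */
    static Optional<String> bestOption(List<String> hypotheses, List<String> options) {
        for (String hypothesis : hypotheses) {
            String heard = soundex(hypothesis);
            for (String option : options) {
                if (heard.equals(soundex(option))) {
                    return Optional.of(option);
                }
            }
        }
        return Optional.empty();
    }
}
```

With the options "Pepper" and "Cumin", a hypothesis such as "come in" maps to the same code as "Cumin" and would be resolved to that option. Soundex only encodes the first few consonant sounds, so long phrases that start alike will collide; an edit-distance check on top of it may be needed for larger option sets.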

  • 2020-12-10 21:42

    Our solution to this problem is described at http://kaljurand.github.io/Grammars/, e.g. check out the paper linked from this page:

    Kaarel Kaljurand, Tanel Alumäe. Controlled Natural Language in Speech Recognition Based User Interfaces (CNL 2012)

    The basic idea is:

    1. don't use Google's speech recognizer because you cannot (currently) pass the language model (e.g. a grammar) to it (in our case it also didn't support the input language that we wanted to use);
    2. so you need to implement your own speech recognizer (e.g. based on Sphinx) and make it accept grammars as part of the input;
    3. implement the grammar. If it's a simple list of acceptable phrases, then JSGF will do as the grammar description language; for more complex grammars I recommend Grammatical Framework (which you can automatically compile to JSGF or finite-state automata);
    4. implement an Android app that extends the RecognizerIntent API by adding a way to pass the grammar to the recognizer. You can base it e.g. on Kõnele.
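
For the simple case in step 3, a list of acceptable phrases, the JSGF grammar can be generated from the option strings. The following is a rough sketch, not code from Kõnele or the Grammars project; the class `JsgfBuilder`, the method `phraseListGrammar`, and the grammar and rule names are all made up for illustration.

```java
import java.util.List;

// Hypothetical helper that writes a one-rule JSGF grammar.
final class JsgfBuilder {

    /**
     * Builds a JSGF grammar whose single public rule accepts any one of the phrases.
     * Phrases are assumed to be plain words; no escaping of JSGF
     * special characters is attempted here.
     */
    static String phraseListGrammar(String grammarName, String ruleName, List<String> phrases) {
        StringBuilder sb = new StringBuilder();
        sb.append("#JSGF V1.0;\n");                              // JSGF header line
        sb.append("grammar ").append(grammarName).append(";\n"); // grammar declaration
        sb.append("public <").append(ruleName).append("> = ");   // exported rule
        sb.append(String.join(" | ", phrases));                  // alternatives
        sb.append(";\n");
        return sb.toString();
    }
}
```

Calling `phraseListGrammar("options", "option", ...)` with the phrases "yes", "no" and "maybe" yields a grammar whose rule line reads `public <option> = yes | no | maybe;`. How that grammar is then handed to the recognizer depends on the recognizer you build in step 2.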

    All this might be overkill in your case. Post-processing Google's results (as @gregm suggests) is certainly easier to implement. But if you want to scale to more complex and/or multilingual language models, then our approach provides the required modularity and expressive power.

  • 2020-12-10 21:42

    No, there are no such parameters; Google's speech recognition is not flexible enough. You can use an external speech recognition toolkit such as CMUSphinx.
