Question
I would like my Android application to do continuous keyword spotting. I'm modifying the pocketsphinx Android demo to test how to do it. I wrote this list in a file named en-keywords.txt, picking words from cmudict-en-us.dict:
rainbow /1e-50/
about /1e-50/
blood /1e-50/
energies /1e-50/
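Since a single malformed line in the list is easy to miss, here is a minimal sketch of a sanity check for the file format shown above, assuming every line carries an explicit threshold between slashes. KeywordListCheck is a hypothetical helper written for this post; it is not part of pocketsphinx and does not use its API.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class KeywordListCheck {
    // One entry per line: a keyphrase, whitespace, then the threshold between slashes.
    private static final Pattern ENTRY = Pattern.compile("^(.+?)\\s+/([^/]+)/\\s*$");

    /** Parses keyword-list text into phrase -> threshold, rejecting malformed lines. */
    public static Map<String, Double> parse(String text) {
        Map<String, Double> thresholds = new LinkedHashMap<>();
        for (String line : text.split("\\R")) {
            String trimmed = line.trim();
            if (trimmed.isEmpty()) {
                continue; // ignore blank lines
            }
            Matcher m = ENTRY.matcher(trimmed);
            if (!m.matches()) {
                throw new IllegalArgumentException("Malformed entry: " + line);
            }
            // Double.parseDouble accepts scientific notation such as 1e-50.
            thresholds.put(m.group(1), Double.parseDouble(m.group(2)));
        }
        return thresholds;
    }

    public static void main(String[] args) {
        String list = "rainbow /1e-50/\nabout /1e-50/\nblood /1e-50/\nenergies /1e-50/\n";
        parse(list).forEach((phrase, t) -> System.out.println(phrase + " -> " + t));
    }
}
```

Running a check like this over the asset before handing it to the recognizer makes it obvious which threshold each keyphrase actually has.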
In the setupRecognizer method I removed every search and added only this keyword search to the recognizer:
File keywords = new File(assetsDir, "en-keywords.txt");
recognizer.addKeywordSearch(KWS_SEARCH, keywords);
Finally I modified onPartialResult like this:
public void onPartialResult(Hypothesis hypothesis) {
    if (hypothesis == null)
        return;
    String text = hypothesis.getHypstr();
    switchSearch(KWS_SEARCH);
}
so that every time a partial result arrives with a non-null hypothesis, onResult is called and the search starts again.
What I see in the running app is not what I expect:
- onPartialResult gets a non-null hypothesis every time I speak, even if I say something very different from the keywords I'm looking for;
- even if I only say "hey", the onPartialResult hypothesis often contains more than one word; in the worst case I say "hey" and the recognizer hears "rainbow about energies blood";
- onResult is then called, but it shows a Toast with text different from the last hypothesis seen in onPartialResult, as if it were a concatenation of strings in some non-obvious order.
I tried different thresholds for the keywords but couldn't find my way. I'm probably missing some basic concept or configuration parameter. Can someone help me with this?
Answer 1:
The solution is definitely to understand how thresholds work and to tune them correctly. I read on the SourceForge forum that the higher the threshold (max 1), the fewer false alarms (at the risk of missing true matches), and vice versa (min 1e-50). Pocketsphinx uses your threshold and returns a match if the weight of a possible recognition is greater than or equal to it: giving a keyphrase a threshold of 1 means you want that keyphrase in the result only if pocketsphinx is absolutely sure of what was spoken.
I was using 1e-50, a very low threshold that leads to a lot of false alarms: with it, almost everything you say is recognized as one or more of the keywords in the list. This answers points 1 and 2 of my question.
As for my third point, hypothesis.getHypstr() in onResult contains a concatenation of every possible match found. To tell one match from another by looking at their weights, it should be possible to iterate over the segments returned by recognizer.getDecoder().seg().
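To make the idea of telling matches apart by weight concrete, here is a rough sketch of the filtering step only. It deliberately does not call pocketsphinx: Seg is a hypothetical stand-in for whatever each decoder segment exposes, the scores are made up, and the cut-off value is arbitrary. In the app, the list would presumably be filled while iterating over recognizer.getDecoder().seg(); which accessors to read on each segment is something to check in the library's documentation.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class SegmentFilter {
    /** Hypothetical stand-in for one decoder segment: the matched word plus a score. */
    public static final class Seg {
        public final String word;
        public final int score;

        public Seg(String word, int score) {
            this.word = word;
            this.score = score;
        }
    }

    /** Keeps the words of segments whose score is at least minScore, in original order. */
    public static List<String> confidentWords(List<Seg> segments, int minScore) {
        List<String> kept = new ArrayList<>();
        for (Seg s : segments) {
            if (s.score >= minScore) {
                kept.add(s.word);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // Made-up scores, for illustration only.
        List<Seg> segs = Arrays.asList(
                new Seg("rainbow", -1200),
                new Seg("about", -4800),
                new Seg("blood", -900));
        System.out.println(confidentWords(segs, -2000)); // prints [rainbow, blood]
    }
}
```

Working per segment rather than on the concatenated getHypstr() string lets each keyword be accepted or dropped individually.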
That is not the end of it, though. To build a recognizer that performs well, you have to follow some rules when choosing keyphrases and then tune the thresholds. As the CMU tutorial says:
- For the best accuracy it is better to have keyphrases with 3-4 syllables;
- Phrases that are too short are easily confused.
Source: https://stackoverflow.com/questions/41522815/how-to-setup-tresholds-to-spot-keywords-from-a-list-in-pocketsphinx-android
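The syllable rule above can be checked roughly against the dictionary itself, since each vowel phone in a CMUdict-style pronunciation corresponds to one syllable. The following is a minimal sketch under that assumption. SyllableCount is a hypothetical helper, and the pronunciations in main are typed in from memory, so check them against your own cmudict-en-us.dict.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

public class SyllableCount {
    // Vowel phones of the ARPAbet set used by CMUdict.
    private static final Set<String> VOWELS = new HashSet<>(Arrays.asList(
            "AA", "AE", "AH", "AO", "AW", "AY", "EH", "ER",
            "EY", "IH", "IY", "OW", "OY", "UH", "UW"));

    /** Counts syllables in a space-separated phone string, one per vowel phone. */
    public static int syllables(String phones) {
        int count = 0;
        for (String p : phones.trim().split("\\s+")) {
            // Drop a trailing stress digit (e.g. AH0) if the dictionary has them.
            String base = p.replaceAll("\\d+$", "").toUpperCase();
            if (VOWELS.contains(base)) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        // Pronunciations are assumptions; verify them in the dictionary file.
        System.out.println("rainbow  " + syllables("R EY N B OW"));     // 2
        System.out.println("about    " + syllables("AH B AW T"));       // 2
        System.out.println("blood    " + syllables("B L AH D"));        // 1
        System.out.println("energies " + syllables("EH N ER JH IY Z")); // 3
    }
}
```

By this count, only "energies" from my original list reaches the recommended 3-4 syllables, which fits the false alarms I was seeing with the shorter words.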