I am trying to run the dialog demo of Sphinx 4 pre-alpha, but it gives errors.
I am creating a live speech application.
I imported the project using maven and
If you modify SpeechSourceProvider to return a single shared Microphone reference, it won't try to create multiple Microphone instances, which is the source of the issue.
public class SpeechSourceProvider {
private static final Microphone mic = new Microphone(16000, 16, true, false);
Microphone getMicrophone() {
return mic;
}
}
The usual concern with sharing is that you don't want multiple threads accessing a single resource at once, but in the demo the recognizers are stopped and started as needed, so they never compete for the microphone.
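To illustrate the idea without audio hardware, here is a minimal sketch that uses hypothetical stand-in classes (StubMicrophone, SharedMicProvider, StubRecognizer are not part of Sphinx). It assumes only what is described above: the provider hands every recognizer the same instance, and the recognizers take turns by stopping before the next one starts.

```java
// Hypothetical stand-in for edu.cmu.sphinx.api.Microphone; it captures no audio.
final class StubMicrophone {
    private boolean recording;

    void startRecording() {
        // Only one user at a time may record from the shared device.
        if (recording) {
            throw new IllegalStateException("microphone already in use");
        }
        recording = true;
    }

    void stopRecording() {
        recording = false;
    }

    boolean isRecording() {
        return recording;
    }
}

// Mirrors the modified SpeechSourceProvider: one static instance for everyone.
final class SharedMicProvider {
    private static final StubMicrophone MIC = new StubMicrophone();

    static StubMicrophone getMicrophone() {
        return MIC;
    }
}

// Hypothetical recognizer that obtains its microphone from the provider.
final class StubRecognizer {
    private final StubMicrophone mic = SharedMicProvider.getMicrophone();

    void startRecognition() {
        mic.startRecording();
    }

    void stopRecognition() {
        mic.stopRecording();
    }

    StubMicrophone microphone() {
        return mic;
    }
}

final class SharedMicDemo {
    public static void main(String[] args) {
        StubRecognizer menu = new StubRecognizer();
        StubRecognizer digits = new StubRecognizer();

        // Both recognizers hold the very same microphone object.
        System.out.println("same instance: " + (menu.microphone() == digits.microphone()));

        // Taking turns works because each recognizer stops before the next starts.
        menu.startRecognition();
        menu.stopRecognition();
        digits.startRecognition();
        digits.stopRecognition();
    }
}
```

If a second recognizer called startRecognition() while the first was still running, the stub would throw, which is the kind of contention the stop/start sequence in the demo avoids.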
As Nickolay explains in the SourceForge forum (here), the recognizer currently using the microphone must release it before another recognizer can use it. While the API is being fixed, I made the following changes to certain classes in the Sphinx API as a temporary workaround. This is probably not the best solution, but until a better one is proposed, it will work.
I created a class named MicrophoneExtention with the same source code as the Microphone class, and added the following method:
public void closeLine(){ line.close(); }
Similarly, I created a LiveSpeechRecognizerExtention class with the source code of the LiveSpeechRecognizer class, and made the following changes:
private final MicrophoneExtention microphone;
microphone = new MicrophoneExtention(16000, 16, true, false);
public void closeRecognitionLine(){ microphone.closeLine(); }
Finally, I edited the main method of DialogDemo:
Configuration configuration = new Configuration();
configuration.setAcousticModelPath(ACOUSTIC_MODEL);
configuration.setDictionaryPath(DICTIONARY_PATH);
configuration.setGrammarPath(GRAMMAR_PATH);
configuration.setUseGrammar(true);
configuration.setGrammarName("dialog");
LiveSpeechRecognizerExtention recognizer =
new LiveSpeechRecognizerExtention(configuration);
recognizer.startRecognition(true);
while (true) {
System.out.println("Choose menu item:");
System.out.println("Example: go to the bank account");
System.out.println("Example: exit the program");
System.out.println("Example: weather forecast");
System.out.println("Example: digits\n");
String utterance = recognizer.getResult().getHypothesis();
if (utterance.startsWith("exit"))
break;
if (utterance.equals("digits")) {
recognizer.stopRecognition();
recognizer.closeRecognitionLine();
configuration.setGrammarName("digits.grxml");
recognizer=new LiveSpeechRecognizerExtention(configuration);
recognizeDigits(recognizer);
recognizer.closeRecognitionLine();
configuration.setGrammarName("dialog");
recognizer=new LiveSpeechRecognizerExtention(configuration);
recognizer.startRecognition(true);
}
if (utterance.equals("bank account")) {
recognizer.stopRecognition();
recognizerBankAccount(recognizer);
recognizer.startRecognition(true);
}
if (utterance.endsWith("weather forecast")) {
recognizer.stopRecognition();
recognizer.closeRecognitionLine();
configuration.setUseGrammar(false);
configuration.setLanguageModelPath(LANGUAGE_MODEL);
recognizer=new LiveSpeechRecognizerExtention(configuration);
recognizeWeather(recognizer);
recognizer.closeRecognitionLine();
configuration.setUseGrammar(true);
configuration.setGrammarName("dialog");
recognizer=new LiveSpeechRecognizerExtention(configuration);
recognizer.startRecognition(true);
}
}
recognizer.stopRecognition();
Obviously, the method signatures in DialogDemo need to change to take a LiveSpeechRecognizerExtention.
Hope this helps.
On a final note, I am not sure whether what I did is entirely legal to begin with. If I am doing something wrong, please be kind enough to point out my mistakes :D
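The hand-off used in this workaround (close the line held by the old recognizer, then build the next one) can be sketched without audio hardware. Everything below is a hypothetical stand-in: ExclusiveLine plays the role of the audio line, and LineOwningRecognizer assumes, as the workaround implies, that constructing a recognizer opens the line.

```java
// Hypothetical stand-in for an audio line that only one owner may hold open.
final class ExclusiveLine {
    private static boolean busy; // models the single physical device
    private boolean open;

    void open() {
        if (busy) {
            throw new IllegalStateException("line unavailable");
        }
        busy = true;
        open = true;
    }

    void close() {
        // Closing releases the device so another owner can open it.
        if (open) {
            busy = false;
            open = false;
        }
    }
}

// Hypothetical recognizer that opens its own line on construction.
final class LineOwningRecognizer {
    private final ExclusiveLine line = new ExclusiveLine();
    private final String grammar;

    LineOwningRecognizer(String grammar) {
        this.grammar = grammar;
        line.open();
    }

    // Counterpart of closeRecognitionLine() in the workaround.
    void closeRecognitionLine() {
        line.close();
    }

    String grammar() {
        return grammar;
    }
}

final class LineHandOffDemo {
    public static void main(String[] args) {
        LineOwningRecognizer recognizer = new LineOwningRecognizer("dialog");

        try {
            // Without releasing the line first, a second recognizer cannot be built.
            new LineOwningRecognizer("digits.grxml");
        } catch (IllegalStateException e) {
            System.out.println("second recognizer failed: " + e.getMessage());
        }

        // Release, then recreate with the next grammar.
        recognizer.closeRecognitionLine();
        recognizer = new LineOwningRecognizer("digits.grxml");
        System.out.println("now using grammar: " + recognizer.grammar());
        recognizer.closeRecognitionLine();
    }
}
```

The sketch only models ownership of the line; it says nothing about how the real recognizer loads its models or grammars.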
aetherwalker's answer worked for me. In more detail, I replaced the following classes with my own implementations, in which I only changed which SpeechSourceProvider is used:
First one is the AbstractSpeechRecognizer:
public class MaxAbstractSpeechRecognizer {
protected final Context context;
protected final Recognizer recognizer;
protected ClusteredDensityFileData clusters;
protected final MaxSpeechSourceProvider speechSourceProvider;
/**
* Constructs recognizer object using provided configuration.
* @param configuration initial configuration
* @throws IOException if IO went wrong
*/
public MaxAbstractSpeechRecognizer(Configuration configuration)
throws IOException
{
this(new Context(configuration));
}
protected MaxAbstractSpeechRecognizer(Context context) throws IOException {
this.context = context;
recognizer = context.getInstance(Recognizer.class);
speechSourceProvider = new MaxSpeechSourceProvider();
} .......................
Then the LiveSpeechRecognizer:
public class MaxLiveSpeechRecognizer extends MaxAbstractSpeechRecognizer {
private final Microphone microphone;
/**
* Constructs new live recognition object.
*
* @param configuration common configuration
* @throws IOException if model IO went wrong
*/
public MaxLiveSpeechRecognizer(Configuration configuration) throws IOException
{
super(configuration);
microphone = speechSourceProvider.getMicrophone();
context.getInstance(StreamDataSource.class)
.setInputStream(microphone.getStream());
}......................
And last but not least the SpeechSourceProvider:
import edu.cmu.sphinx.api.Microphone;
public class MaxSpeechSourceProvider {
private static final Microphone mic = new Microphone(16000, 16, true, false);
Microphone getMicrophone() {
return mic;
}
}
With this change, everything worked as long as I stayed within CMUSphinx: the line could be reused many times. But as soon as I tried to use the mic for something else (such as recording), it was not available!
I saw that the stream is opened in the Microphone class but never closed.
So first, in the Microphone class, I changed the following fields from static to non-static:
private TargetDataLine line;
private InputStream inputStream;
Then I changed the stopRecording method to close the stream before stopping the line:
/**
* Closes the input stream, then stops the line.
*/
public void stopRecording() {
if (inputStream != null )
try {
inputStream.close();
} catch (IOException e) {
throw new IllegalStateException(e);
}
line.stop();
}
Now, with no other changes (the SpeechSourceProvider class is the original), I can alternate between using the mic for CMUSphinx and for another recording task.
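The ordering in the modified stopRecording can be checked in isolation. This is a small sketch with hypothetical stand-ins (LoggingStream, StoppableLine, MicSketch are not Sphinx classes); it only shows that the stream is closed before the line is stopped, not how the real audio line behaves.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stream that records when it is closed.
final class LoggingStream extends InputStream {
    private final List<String> log;

    LoggingStream(List<String> log) {
        this.log = log;
    }

    @Override
    public int read() {
        return -1; // no audio data in this sketch
    }

    @Override
    public void close() {
        log.add("stream.close");
    }
}

// Hypothetical stand-in for the part of TargetDataLine used here.
interface StoppableLine {
    void stop();
}

// Same shape as the modified stopRecording shown above.
final class MicSketch {
    private final InputStream inputStream;
    private final StoppableLine line;

    MicSketch(InputStream inputStream, StoppableLine line) {
        this.inputStream = inputStream;
        this.line = line;
    }

    void stopRecording() {
        if (inputStream != null) {
            try {
                inputStream.close();
            } catch (IOException e) {
                throw new IllegalStateException(e);
            }
        }
        line.stop();
    }
}

final class StopOrderDemo {
    public static void main(String[] args) {
        List<String> log = new ArrayList<>();
        MicSketch mic = new MicSketch(new LoggingStream(log), () -> log.add("line.stop"));
        mic.stopRecording();
        System.out.println(log); // prints [stream.close, line.stop]
    }
}
```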