My application records audio using the MediaRecorder class in an AsyncTask, and also uses the Google API to transform speech to text (RecognizerIntent), using the code from this question.
I have successfully accomplished this with the help of the Cloud Speech API. You can find its demo in Google's Speech sample.
The API recognizes over 80 languages and variants, to support your global user base. You can transcribe the text of users dictating to an application’s microphone, enable command-and-control through voice, or transcribe audio files, among many other use cases. Recognize audio uploaded in the request, and integrate with your audio storage on Google Cloud Storage, by using the same technology Google uses to power its own products.
It uses an audio buffer to transcribe data with the help of the Google Speech API. I used this same buffer to store the audio recording with the help of AudioRecord.
So with this demo we can transcribe the user's speech in parallel with the audio recording.
The demo starts and stops speech recognition based on voice activity. It also provides SPEECH_TIMEOUT_MILLIS in VoiceRecorder.java, which works just like RecognizerIntent's EXTRA_SPEECH_INPUT_COMPLETE_SILENCE_LENGTH_MILLIS, but is user controlled.
So, all in all, you can specify a silence timeout; based on it, recognition stops after the user finishes speaking and starts again as soon as the user speaks.
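The timeout behaviour described above can be sketched as plain, Android-free logic. This is a minimal sketch of the idea only; SilenceDetector and its parameters are hypothetical names, not code taken from VoiceRecorder.java:

```java
// Hypothetical sketch of a SPEECH_TIMEOUT_MILLIS-style silence timeout:
// the utterance ends once no voice has been heard for speechTimeoutMillis.
class SilenceDetector {
    private final long speechTimeoutMillis;
    private final int amplitudeThreshold;
    private long lastVoiceMillis = -1; // -1 until the first voiced sample

    SilenceDetector(long speechTimeoutMillis, int amplitudeThreshold) {
        this.speechTimeoutMillis = speechTimeoutMillis;
        this.amplitudeThreshold = amplitudeThreshold;
    }

    /** Feed one amplitude reading; returns true when recognition should stop. */
    boolean shouldStop(int amplitude, long nowMillis) {
        if (amplitude >= amplitudeThreshold) {
            lastVoiceMillis = nowMillis; // voice detected, reset the clock
            return false;
        }
        return lastVoiceMillis >= 0
                && nowMillis - lastVoiceMillis >= speechTimeoutMillis;
    }
}
```

Feeding this from the microphone loop stops recognition after the configured silence and lets you restart it when `shouldStop` next returns false on a voiced sample.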
Recent projects on 'google-speech' and 'android-opus' (opuslib) allow simple, concurrent recognition along with recording the audio to an Opus file in Android external storage.
Looking at the VoiceRecorder in the speech project, with only a few extra lines of code after reading the microphone buffer, the buffer can also be consumed by a file sink (PCM16 to the Opus codec) in addition to the current speech observer.
See a minimal merge of the two projects above in Google-speech-opus-recorder.
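The "tee" described above can be sketched without any Android dependency: after each microphone read, the same PCM16 buffer is handed both to the existing speech observer and to a file sink (which in the merged project feeds the Opus encoder). PcmConsumer and MicTee are illustrative names, not taken from either project:

```java
// One interface for anything that consumes raw PCM16 buffers: the speech
// observer, or a file sink wrapping an encoder/output stream.
interface PcmConsumer {
    void consume(byte[] buffer, int size);
}

// Fans a single microphone buffer out to both consumers.
class MicTee {
    private final PcmConsumer speechObserver;
    private final PcmConsumer fileSink;

    MicTee(PcmConsumer speechObserver, PcmConsumer fileSink) {
        this.speechObserver = speechObserver;
        this.fileSink = fileSink;
    }

    // Call once per microphone read; both consumers see the same bytes.
    void onBufferRead(byte[] buffer, int size) {
        speechObserver.consume(buffer, size);
        fileSink.consume(buffer, size);
    }
}
```

In VoiceRecorder's read loop this amounts to one extra `consume` call after the existing dispatch to the speech observer.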
I haven't tested this solution yet, but maybe there is a possibility. In http://developer.android.com/reference/android/speech/RecognitionService.Callback.html there is the method void bufferReceived(byte[] buffer). A possible solution is to save this received buffer using the Android AudioRecord class, which has methods like read(byte[] audioData, int offsetInBytes, int sizeInBytes). So maybe it is possible to connect these two utilities this way? Problems might occur with configuring AudioRecord and with converting the result to MP3 or WAV format after recording.
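The buffer-saving idea can be sketched without Android classes at all: collect the chunks delivered to bufferReceived() and wrap the raw PCM in a WAV header afterwards, which also covers the WAV-conversion concern above. This is a minimal sketch assuming mono 16-bit PCM; PcmWavCollector is a hypothetical name, and note that on the app side the equivalent callback is RecognitionListener.onBufferReceived, which many recognition engines never actually invoke:

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Collects raw PCM chunks (as handed to bufferReceived) and wraps them in a
// standard 44-byte WAV header so the recording is playable afterwards.
class PcmWavCollector {
    private final ByteArrayOutputStream pcm = new ByteArrayOutputStream();
    private final int sampleRate;

    PcmWavCollector(int sampleRate) { this.sampleRate = sampleRate; }

    /** Forward each chunk from bufferReceived(byte[] buffer) here. */
    void onBufferReceived(byte[] buffer) {
        pcm.write(buffer, 0, buffer.length);
    }

    /** Returns the bytes of a mono, 16-bit PCM WAV file. */
    byte[] toWav() {
        byte[] data = pcm.toByteArray();
        ByteBuffer header = ByteBuffer.allocate(44).order(ByteOrder.LITTLE_ENDIAN);
        header.put("RIFF".getBytes());
        header.putInt(36 + data.length);   // RIFF chunk size
        header.put("WAVEfmt ".getBytes());
        header.putInt(16);                 // fmt sub-chunk size
        header.putShort((short) 1);        // audio format: PCM
        header.putShort((short) 1);        // channels: mono
        header.putInt(sampleRate);
        header.putInt(sampleRate * 2);     // byte rate = rate * channels * 2
        header.putShort((short) 2);        // block align
        header.putShort((short) 16);       // bits per sample
        header.put("data".getBytes());
        header.putInt(data.length);        // data sub-chunk size
        byte[] wav = new byte[44 + data.length];
        System.arraycopy(header.array(), 0, wav, 0, 44);
        System.arraycopy(data, 0, wav, 44, data.length);
        return wav;
    }
}
```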
I got a solution that works well for speech recognition and audio recording. Here is the link to a simple Android project I created to show the solution working. Also, I put some screenshots inside the project to illustrate the app.
I'll try to briefly explain the approach I used. I combined two features in that project: the Google Speech API and FLAC recording.
The Google Speech API is called through HTTP connections. Mike Pultz gives more details about the API:
"(...) the new [Google] API is a full-duplex streaming API. What this means, is that it actually uses two HTTP connections- one POST request to upload the content as a “live” chunked stream, and a second GET request to access the results, which makes much more sense for longer audio samples, or for streaming audio."
However, this API needs to receive a FLAC sound file to work properly. That brings us to the second part: FLAC recording.
I implemented FLAC recording in that project by extracting and adapting some pieces of code and libraries from an open-source app called AudioBoo. AudioBoo uses native code to record and play the FLAC format.
Thus, it's possible to record FLAC audio, send it to the Google Speech API, get the text, and play the sound that was just recorded.
The project I created has the basic principles to make it work and can be improved for specific situations. In order to make it work in a different scenario, it's necessary to get a Google Speech API key, which is obtained by being part of the Google Chromium-dev group. I left one key in that project just to show it working, but I'll remove it eventually. If someone needs more information about it, let me know, because I'm not able to put more than two links in this post.
Late answer, but for the first exception, you have to destroy your SpeechRecognizer after it has done what you want, for example in onStop(), in onDestroy(), or directly after you don't need the SpeechRecognizer anymore:
if (YourSpeechRecognizer != null)
{
    YourSpeechRecognizer.stopListening();
    YourSpeechRecognizer.cancel();
    YourSpeechRecognizer.destroy();
}