Speech to Text from own sound file

拥有回忆 提交于 2019-11-27 04:26:09
Kaarel

The API does not allow it, but see this blog post and its comments for a potential workaround. Also make sure that your file contains high quality audio (at least 16 bit and 16 kHz) to get a better transcription.

See also:

I got a solution that is working well to have speech to text from a sound file. Here is the link to a simple Android project I created to show the solution's working. Also, I put some print screens inside the project to illustrate the app.

I'm gonna try to explain briefly the approach I used. I combined two features in that project: Google Speech API and Flac recording.

Google Speech API is called through HTTP connections. Mike Pultz gives more details about the API:

"(...) the new [Google] API is a full-duplex streaming API. What this means, is that it actually uses two HTTP connections- one POST request to upload the content as a “live” chunked stream, and a second GET request to access the results, which makes much more sense for longer audio samples, or for streaming audio."

However, this API needs to receive a FLAC sound file to work properly. That makes us to go to the second part: Flac recording

I implemented Flac recording in that project through extracting and adapting some pieces of code and libraries from an open source app called AudioBoo. AudioBoo uses native code to record and play flac format.

Thus, it's possible to record a flac sound, send it to Google Speech API, get the text, and play the sound that was just recorded.

The project I created has the basic principles to make it work and can be improved for specific situations. In order to make it work in a different scenario, it's necessary to get a Google Speech API key, which is obtained by being part of Google Chromium-dev group. I left one key in that project just to show it's working, but I'll remove it eventually. If someone needs more information about it, let me know cause I'm not able to put more than 2 links in this post.

It is currently not possible to send your own audio file to google for processing but instead you can use your speaker and microphone in your android device to use your audio file as an input to google voice recognition.

First you must have an audio file which may be in your SD card then use the following steps:

1) create a method by any name you wish

2) within that method first write code for using google speech recognition

3) Following that code write the code for using speaker to play your audio file which will then become as an input to google speech recognition

 //code for google voice recognition
 Intent intent = new Intent(RecognizerIntent.ACTION_RECOGNIZE_SPEECH);
 intent.putExtra(RecognizerIntent.EXTRA_LANGUAGE_MODEL,
                        RecognizerIntent.LANGUAGE_MODEL_FREE_FORM);
 intent.putExtra(RecognizerIntent.EXTRA_LANGUAGE, Locale.getDefault());
 intent.putExtra(RecognizerIntent.EXTRA_PROMPT,
                        getString(R.string.speech_prompt));
 try {
      startActivityForResult(intent, REQ_CODE_SPEECH_INPUT);
 } catch (ActivityNotFoundException a) {
 Toast.makeText(getApplicationContext(),
                            getString(R.string.speech_not_supported),
                            Toast.LENGTH_SHORT).show();

//code for playing the audio file which you wish to give as an input
    MediaPlayer mp = new MediaPlayer();
    try {
     mp.setDataSource(file); // here file is the location of the audio file you wish to use an input
        mp.prepare();
        mp.start();
    } catch (Exception e) {
        e.printStackTrace();
    }

For reference see my blog https://sureshkumarask.wordpress.com/2017/03/19/how-to-give-our-own-audio-file-as-an-input-to-any-speech-recognizer/

i have enclosed the link for the java file in my blog.

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!