Speech to Text from own sound file

盖世英雄少女心 2020-11-30 05:53

As you probably know, implementing speech-to-text is pretty easy with the Android API. All you have to do is call the API's intent, and it returns the text for you.

3 answers
  • 2020-11-30 06:29

    It is currently not possible to send your own audio file to Google for processing. As a workaround, you can play the file through your Android device's speaker so that the microphone picks it up as input to Google voice recognition.

    First you need an audio file, for example on your SD card. Then follow these steps:

    1) Create a method with any name you like.

    2) In that method, first write the code that starts Google speech recognition.

    3) After that code, write the code that plays your audio file through the speaker; the playback then becomes the input to Google speech recognition.

        // Requires imports for android.content.Intent, android.content.ActivityNotFoundException,
        // android.speech.RecognizerIntent, android.media.MediaPlayer, android.widget.Toast
        // and java.util.Locale.

        // Start Google voice recognition (it begins listening on the microphone)
        Intent intent = new Intent(RecognizerIntent.ACTION_RECOGNIZE_SPEECH);
        intent.putExtra(RecognizerIntent.EXTRA_LANGUAGE_MODEL,
                RecognizerIntent.LANGUAGE_MODEL_FREE_FORM);
        intent.putExtra(RecognizerIntent.EXTRA_LANGUAGE, Locale.getDefault());
        intent.putExtra(RecognizerIntent.EXTRA_PROMPT,
                getString(R.string.speech_prompt));
        try {
            startActivityForResult(intent, REQ_CODE_SPEECH_INPUT);
        } catch (ActivityNotFoundException a) {
            // No speech recognizer is installed on this device
            Toast.makeText(getApplicationContext(),
                    getString(R.string.speech_not_supported),
                    Toast.LENGTH_SHORT).show();
        }

        // Play the audio file through the speaker so the recognizer hears it
        MediaPlayer mp = new MediaPlayer();
        try {
            mp.setDataSource(file); // file is the path of the audio file to use as input
            mp.prepare();
            mp.start();
        } catch (Exception e) {
            e.printStackTrace();
        }
    

    For reference, see my blog: https://sureshkumarask.wordpress.com/2017/03/19/how-to-give-our-own-audio-file-as-an-input-to-any-speech-recognizer/

    I have included a link to the Java file in that blog post.

  • 2020-11-30 06:39

    The API does not allow it, but see this blog post and its comments for a potential workaround. Also make sure that your file contains high-quality audio (at least 16-bit samples at 16 kHz) to get a better transcription.
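
The 16-bit / 16 kHz recommendation above can be checked before a file ever reaches the device. The following is only a minimal sketch under assumptions: it uses `javax.sound.sampled`, which exists on a desktop JVM but not on Android, it only understands formats the JVM can parse (such as WAV), and the class name `AudioQualityCheck` and its constants are made up for illustration.

```java
import javax.sound.sampled.AudioFormat;
import javax.sound.sampled.AudioInputStream;
import javax.sound.sampled.AudioSystem;
import javax.sound.sampled.UnsupportedAudioFileException;
import java.io.File;
import java.io.IOException;

// Hypothetical helper (not part of any Android or Google API): checks whether
// an audio file meets the "at least 16 bit and 16 kHz" guideline.
class AudioQualityCheck {
    static final float MIN_SAMPLE_RATE_HZ = 16_000f;
    static final int MIN_SAMPLE_SIZE_BITS = 16;

    // True when both the sample rate and the sample size reach the minimum.
    // Unspecified values (reported as -1) are treated as failing the check.
    static boolean meetsMinimum(AudioFormat format) {
        return format.getSampleRate() >= MIN_SAMPLE_RATE_HZ
                && format.getSampleSizeInBits() >= MIN_SAMPLE_SIZE_BITS;
    }

    // Reads only the header of the file to obtain its format.
    static boolean meetsMinimum(File audioFile)
            throws IOException, UnsupportedAudioFileException {
        try (AudioInputStream in = AudioSystem.getAudioInputStream(audioFile)) {
            return meetsMinimum(in.getFormat());
        }
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 1) {
            System.err.println("usage: java AudioQualityCheck <file.wav>");
            return;
        }
        boolean ok = meetsMinimum(new File(args[0]));
        System.out.println(ok ? "Meets 16-bit / 16 kHz" : "Below 16-bit / 16 kHz");
    }
}
```

A file that fails the check would need to be re-recorded or exported again at a higher quality; upsampling a low-quality recording does not add information.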

    See also:

    • Voice recognition on android with recorded sound clip?
  • 2020-11-30 06:42

    I have a solution that works well for getting speech to text from a sound file. Here is the link to a simple Android project I created to show the solution working. I also put some screenshots inside the project to illustrate the app.

    I'll try to explain briefly the approach I used. I combined two features in that project: the Google Speech API and FLAC recording.

    Google Speech API is called through HTTP connections. Mike Pultz gives more details about the API:

    "(...) the new [Google] API is a full-duplex streaming API. What this means, is that it actually uses two HTTP connections- one POST request to upload the content as a “live” chunked stream, and a second GET request to access the results, which makes much more sense for longer audio samples, or for streaming audio."

    However, this API needs to receive a FLAC sound file to work properly. That brings us to the second part: FLAC recording.
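
The two-connection pattern from the quote (one chunked POST that uploads the FLAC audio, one GET that collects the results) can be sketched in plain Java with `HttpURLConnection`. This only illustrates the pattern and is not the code from the project: the class name `DuplexSpeechClient`, the `upUrl`/`downUrl` parameters and the `audio/x-flac; rate=...` content type are assumptions, and the real endpoints, the API key parameter and the way the server pairs the two connections are not given here, so the caller has to supply complete URLs.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical client: one connection uploads audio, a second one reads results.
class DuplexSpeechClient {

    // Opens the GET (results) connection in the background, streams the audio
    // over the POST connection, then returns whatever the GET side delivered.
    static String transcribe(URL upUrl, URL downUrl, InputStream flacAudio,
                             int sampleRateHz) throws Exception {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            Future<String> results = pool.submit(() -> download(downUrl));
            upload(upUrl, flacAudio, sampleRateHz);
            return results.get();
        } finally {
            pool.shutdownNow();
        }
    }

    // POSTs the audio as a chunked stream, so the total length need not be known.
    static void upload(URL upUrl, InputStream audio, int sampleRateHz)
            throws IOException {
        HttpURLConnection up = (HttpURLConnection) upUrl.openConnection();
        up.setRequestMethod("POST");
        up.setDoOutput(true);
        up.setChunkedStreamingMode(0); // 0 selects the default chunk size
        // Assumed header format; check what the target service expects.
        up.setRequestProperty("Content-Type", "audio/x-flac; rate=" + sampleRateHz);
        try (OutputStream out = up.getOutputStream()) {
            copy(audio, out);
        }
        int status = up.getResponseCode();
        up.disconnect();
        if (status != HttpURLConnection.HTTP_OK) {
            throw new IOException("Upload failed: HTTP " + status);
        }
    }

    // GETs the results and returns the raw response body as text.
    static String download(URL downUrl) throws IOException {
        HttpURLConnection down = (HttpURLConnection) downUrl.openConnection();
        down.setRequestMethod("GET");
        try (InputStream in = down.getInputStream()) {
            ByteArrayOutputStream body = new ByteArrayOutputStream();
            copy(in, body);
            return new String(body.toByteArray(), StandardCharsets.UTF_8);
        } finally {
            down.disconnect();
        }
    }

    private static void copy(InputStream in, OutputStream out) throws IOException {
        byte[] buffer = new byte[4096];
        int n;
        while ((n = in.read(buffer)) != -1) {
            out.write(buffer, 0, n);
        }
    }
}
```

On Android, a call such as `DuplexSpeechClient.transcribe(...)` has to run off the main thread, because network access on the main thread throws `NetworkOnMainThreadException`. The returned string is the unparsed response body; extracting the transcript from it depends on the response format of the service.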

    I implemented FLAC recording in that project by extracting and adapting some code and libraries from an open source app called AudioBoo. AudioBoo uses native code to record and play the FLAC format.

    Thus, it's possible to record a FLAC sound, send it to the Google Speech API, get the text, and play the sound that was just recorded.

    The project I created covers the basic principles needed to make this work and can be improved for specific situations. To make it work in a different scenario, you need a Google Speech API key, which is obtained by joining the Google Chromium-dev group. I left one key in that project just to show it's working, but I'll remove it eventually. If someone needs more information about it, let me know, because I'm not able to put more than 2 links in this post.
