Voice recognition fails to work while the voice is being recorded

北荒 2021-02-15 11:03

I am working on a feature where pressing a button launches voice recognition and, at the same time, records what the user says. The code is as follows:



        
1 answer
  •  醉话见心
    2021-02-15 11:52

    --EDIT-- module for Opus-Record WHILE Speech-Recognition also runs

    --EDIT-- 'V1BETA1' streaming, continuous recognition works with a minor change to the sample project. Alter its 'readData()' so that the raw PCM in 'sData' is shared by 2 threads (the fileSink thread and the recognizerAPI thread from the sample project). For the sink, just hook up an encoder using a PCM stream refreshed at each 'sData' IO. Remember to close the stream and it will work. Review 'writeAudiaDataToFile()' for more on the fileSink.
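As a rough illustration of sharing each 'sData' chunk between two threads, here is a minimal plain-Java sketch. It is not the sample project's code: the class `PcmFanOut`, its methods, the two queues, and the zero-length end marker are all hypothetical names chosen for this example, and the simulated reads in `main` stand in for whatever 'readData()' actually does on the device.

```java
import java.io.ByteArrayOutputStream;
import java.util.Arrays;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

final class PcmFanOut {
    // Zero-length sentinel marking end of audio; compared by identity (an assumption of this sketch).
    static final byte[] END = new byte[0];

    /** Hands a private copy of the first `length` bytes of sData to each consumer queue. */
    static void publish(byte[] sData, int length,
                        BlockingQueue<byte[]> recognizerQueue,
                        BlockingQueue<byte[]> fileQueue) throws InterruptedException {
        // Copy, because the reader will overwrite sData on its next read.
        recognizerQueue.put(Arrays.copyOf(sData, length));
        fileQueue.put(Arrays.copyOf(sData, length));
    }

    /** Tells both consumers that no more audio will arrive. */
    static void finish(BlockingQueue<byte[]> recognizerQueue,
                       BlockingQueue<byte[]> fileQueue) throws InterruptedException {
        recognizerQueue.put(END);
        fileQueue.put(END);
    }

    /** Collects chunks until the END marker; stands in for either consumer's real work. */
    static byte[] drain(BlockingQueue<byte[]> queue) throws InterruptedException {
        ByteArrayOutputStream collected = new ByteArrayOutputStream();
        for (byte[] chunk = queue.take(); chunk != END; chunk = queue.take()) {
            collected.write(chunk, 0, chunk.length);
        }
        return collected.toByteArray();
    }

    static void consume(String name, BlockingQueue<byte[]> queue) {
        try {
            System.out.println(name + " received " + drain(queue).length + " bytes");
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        BlockingQueue<byte[]> recognizerQueue = new LinkedBlockingQueue<>();
        BlockingQueue<byte[]> fileQueue = new LinkedBlockingQueue<>();
        Thread recognizerThread = new Thread(() -> consume("recognizer", recognizerQueue));
        Thread fileSinkThread = new Thread(() -> consume("fileSink", fileQueue));
        recognizerThread.start();
        fileSinkThread.start();

        byte[] sData = new byte[4];            // reused read buffer, as in a read loop
        for (int read = 0; read < 3; read++) { // three simulated mic reads
            Arrays.fill(sData, (byte) read);
            publish(sData, sData.length, recognizerQueue, fileQueue);
        }
        finish(recognizerQueue, fileQueue);
        recognizerThread.join();
        fileSinkThread.join();
    }
}
```

Copying each chunk before queueing it keeps the two consumers independent of the reader's buffer reuse. In a real app the recognizer consumer would forward chunks to the streaming API and the file consumer would feed the encoder.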

    --EDIT-- see this thread

    There is going to be a basic conflict over the HAL and the microphone buffer when you try to do:

    speechRecognizer.startListening(recognizerIntent); // <-- needs exclusive use of the mic
    

    and

    mediaRecorder.start(); // <-- needs exclusive use of the mic
    

    Only one of the two actions above can own the audio APIs underlying the mic at any given time!

    If you want to mimic Google Keep, where you speak only once and that single input (your speech into the mic) produces 2 separate outputs (STT text and a fileSink of, say, an MP3), then you must split the audio as it exits the HAL layer from the mic.

    For example:

    1. Pick up the raw audio as PCM 16 coming out of the mic's buffer

    2. Split the bytes of that buffer (you can get a stream from the buffer and pipe the stream to 2 places)

    3. Send stream 1 to the STT API, either before or after you encode it (there are STT APIs accepting either raw PCM 16 or encoded audio)

    4. Send stream 2 to an encoder, then to the fileSink that captures the recording

    The split can operate on either the actual buffer produced by the mic or on a derivative stream of those same bytes.
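The split described in the steps above can be sketched in plain Java as a "tee" that writes every chunk it reads to two sinks. This is only an illustration under assumptions: `PcmTee` and `tee` are hypothetical names, the `ByteArrayInputStream` stands in for the mic's PCM 16 output, and the two `ByteArrayOutputStream`s stand in for the STT upload and the encoder feeding the fileSink.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

final class PcmTee {
    /**
     * Reads the source in chunks of up to chunkSize bytes and writes each chunk
     * to both sinks. Returns the total number of bytes copied.
     */
    static long tee(InputStream source, OutputStream sttSink, OutputStream fileSink,
                    int chunkSize) throws IOException {
        byte[] buffer = new byte[chunkSize];
        long total = 0;
        int n;
        while ((n = source.read(buffer)) != -1) {
            sttSink.write(buffer, 0, n);   // stream 1: towards the STT API
            fileSink.write(buffer, 0, n);  // stream 2: towards the encoder / fileSink
            total += n;
        }
        sttSink.flush();
        fileSink.flush();
        return total;
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for raw PCM 16 bytes coming from the mic (hypothetical data).
        byte[] fakePcm = new byte[3200];
        ByteArrayOutputStream sttSink = new ByteArrayOutputStream();
        ByteArrayOutputStream fileSink = new ByteArrayOutputStream();

        long copied = tee(new ByteArrayInputStream(fakePcm), sttSink, fileSink, 1024);
        System.out.println("copied " + copied + " bytes to each sink");
    }
}
```

Both sinks are written on the same thread here, so a slow sink stalls the other. If the STT upload and the encoder run at different speeds, giving each its own thread and queue is the usual way around that.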

    For what you are getting into, I recommend you look at getCurrentRecording() and consumeRecording() here.

    STT API reference: search Google for "pultz speech-api". Note that there are use cases for the APIs mentioned there.

    • buferUtils
    • code
    • more code
