Google cloud speech syncrecognize “INVALID_ARGUMENT”

Submitted by 牧云@^-^@ on 2019-12-12 11:26:43

Question


I completed the overview tutorial (https://cloud.google.com/speech/docs/getting-started) and then tried it with my own audio file: a .flac file with a sample rate of 16000 Hz.

I changed only the sync-request.json file, shown below, to point to my own audio file hosted on Google Cloud Storage (gs://my-bucket/test4.flac):

{
  "config": {
      "encoding":"flac",
      "sample_rate": 16000
  },
  "audio": {
      "uri":"gs://my-bucket/test4.flac"
  }
}

The file is found correctly, but the request returns an "INVALID_ARGUMENT" error:

{
  "error": {
    "code": 400,
    "message": "Unable to recognize speech, code=-73541, possible error in recognition config. Please correct the config and retry the request.",
    "status": "INVALID_ARGUMENT"
  }
}

Answer 1:


As per this answer, all encodings support only 1-channel (mono) audio.

I was creating the FLAC file with this command:

ffmpeg -i test.mp3 test.flac

This produced the following error:

Sample rate in request does not match FLAC header

Adding -ac 1 (which sets the number of audio channels to 1) fixed the issue:

ffmpeg -i test.mp3 -ac 1 test.flac

Here is my full Node.js code:

// Written against an early release of @google-cloud/speech;
// newer releases expose a different client API.
const Speech = require('@google-cloud/speech');
const projectId = 'EnterProjectIdGeneratedByGoogle';

const speechClient = Speech({
    projectId: projectId
});

// The name of the audio file to transcribe
const fileName = '/home/user/Documents/test/test.flac';

// The audio file's encoding and sample rate
// (the sample rate must match the one stored in the FLAC header)
const options = {
    encoding: 'FLAC',
    sampleRate: 44100
};

// Detects speech in the audio file
speechClient.recognize(fileName, options)
    .then((results) => {
        const transcription = results[0];
        console.log(`Transcription: ${transcription}`);
    })
    .catch((err) => {
        console.log(err);
    });

The sample rate can be 16000, 44100, or another valid value, and the encoding can be FLAC or LINEAR16. See the Cloud Speech docs.
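Since the error above complains that the request does not match the FLAC header, it can help to read the header before sending anything. The following is a minimal sketch, not part of the original answer: it assumes the file starts with the "fLaC" marker followed directly by the STREAMINFO block (so no ID3 tag prepended), and the names parseFlacStreamInfo and readFlacInfo are hypothetical helpers introduced here for illustration.

```javascript
const fs = require('fs');

// Decode sample rate, channel count and bit depth from the first 22 bytes
// of a FLAC stream: "fLaC" marker (4 bytes), metadata block header (4 bytes),
// then the STREAMINFO fields.
function parseFlacStreamInfo(buf) {
  if (buf.length < 22 || buf.toString('latin1', 0, 4) !== 'fLaC') {
    throw new Error('Not a FLAC stream (missing "fLaC" marker)');
  }
  // Low 7 bits of the block header give the block type; 0 means STREAMINFO.
  if ((buf[4] & 0x7f) !== 0) {
    throw new Error('First metadata block is not STREAMINFO');
  }
  // Bytes 18..21 hold: 20 bits sample rate, 3 bits (channels - 1),
  // 5 bits (bits per sample - 1), then the start of the sample count.
  const sampleRate = (buf[18] << 12) | (buf[19] << 4) | (buf[20] >> 4);
  const channels = ((buf[20] >> 1) & 0x07) + 1;
  const bitsPerSample = (((buf[20] & 0x01) << 4) | (buf[21] >> 4)) + 1;
  return { sampleRate, channels, bitsPerSample };
}

// Read only the header bytes of a file on disk and decode them.
function readFlacInfo(path) {
  const fd = fs.openSync(path, 'r');
  try {
    const head = Buffer.alloc(22);
    fs.readSync(fd, head, 0, head.length, 0);
    return parseFlacStreamInfo(head);
  } finally {
    fs.closeSync(fd);
  }
}

// Usage (path is an example):
//   const info = readFlacInfo('/home/user/Documents/test/test.flac');
//   console.log(info);
```

If channels comes back as 2, the file is stereo and needs the -ac 1 conversion; the sampleRate value is what the request's sample rate should be set to.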




Answer 2:


My bad: according to the docs (https://cloud.google.com/speech/docs/basics), the .flac file has to be 16-bit PCM.

Summary:

Encoding: FLAC
Channels: 1 @ 16-bit
Sample rate: 16000 Hz

/!\ Be careful not to export a stereo (2-channel) file, which throws a different error because only one channel is accepted: see "Google speech API internal server error -83104".
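One way to produce a file with these settings is to drive ffmpeg from Node.js. This is only a sketch under assumptions: it combines the -ac 1 flag from the first answer with ffmpeg's -ar (sample rate) and -sample_fmt s16 (16-bit samples) options, which are not mentioned in the thread, and flacArgs and convertToFlac are hypothetical helper names. It also assumes ffmpeg is installed and on the PATH.

```javascript
const { execFile } = require('child_process');

// Build the ffmpeg argument list for a mono, 16-bit FLAC file.
function flacArgs(input, output, sampleRate = 16000) {
  return [
    '-i', input,
    '-ac', '1',                 // one audio channel (mono)
    '-ar', String(sampleRate),  // resample to the requested rate
    '-sample_fmt', 's16',       // 16-bit samples
    output,
  ];
}

// Run the conversion; the callback receives (error, stdout, stderr).
function convertToFlac(input, output, sampleRate, callback) {
  execFile('ffmpeg', flacArgs(input, output, sampleRate), callback);
}

// Usage (file names are examples):
//   convertToFlac('test.mp3', 'test.flac', 16000, (err) => {
//     if (err) console.error(err);
//   });
```

Passing the arguments as an array to execFile avoids shell quoting problems with file names that contain spaces.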



Source: https://stackoverflow.com/questions/39620198/google-cloud-speech-syncrecognize-invalid-argument
