问题
I have managed the "overview tutorial" : https://cloud.google.com/speech/docs/getting-started Then I tried to use my own audio file . I uploaded a .flac file with a sample rate of 16000Hz.
I only changed the sync-request.json
file below with my own audio file hosted on google cloud storage (gs://my-bucket/test4.flac
)
{
"config": {
"encoding":"flac",
"sample_rate": 16000
},
"audio": {
"uri":"gs://my-bucket/test4.flac"
}
}
The file is well recognized but the request return an "INVALID_ARGUMENT" error
{
"error": {
"code": 400,
"message": "Unable to recognize speech, code=-73541, possible error in recognition config. Please correct the config and retry the request.",
"status": "INVALID_ARGUMENT"
}
}
回答1:
As per this answer, all encodings support only 1 channel (mono) audio
I was creating the FLAC file with this command:
ffmpeg -i test.mp3 test.flac
Sample rate in request does not match FLAC header
But adding the -ac 1
(setting number of audio channels to 1) fixed this issue.
ffmpeg -i test.mp3 -ac 1 test.flac
Here is my full Node.js
code
const Speech = require('@google-cloud/speech');
const projectId = 'EnterProjectIdGeneratedByGoogle';
const speechClient = Speech({
projectId: projectId
});
// The name of the audio file to transcribe
var fileName = '/home/user/Documents/test/test.flac';
// The audio file's encoding and sample rate
const options = {
encoding: 'FLAC',
sampleRate: 44100
};
// Detects speech in the audio file
speechClient.recognize(fileName, options)
.then((results) => {
const transcription = results[0];
console.log(`Transcription: ${transcription}`);
}, function(err) {
console.log(err);
});
Sample rate could be 16000 or 44100 or other valid ones, and encoding can be FLAC or LINEAR16. Cloud Speech Docs
回答2:
My bad, as the doc "https://cloud.google.com/speech/docs/basics", the .flac file have to be a 16-bit PCM
Sumup:
Encoding: FLAC
Channels: 1 @ 16-bit
Samplerate: 16000Hz
/!\ pay attention to not export a stereo file (2 channels) file which throw an other error (only one channel accepted) Google speech API internal server error -83104
来源:https://stackoverflow.com/questions/39620198/google-cloud-speech-syncrecognize-invalid-argument