Question
I'm recording short audio files (a few seconds) in Chrome using mediaDevices.getUserMedia(), saving each file to Firebase Storage, and then trying to send the files to Google Cloud Speech-to-Text from a Firebase Cloud Function. I'm getting back this error message:
INVALID_ARGUMENT: Invalid recognition 'config': bad encoding.
Google's documentation says that this error message means
Your audio data might not be encoded correctly or is encoded with a codec different than what you've declared in the RecognitionConfig. Check the audio input and make sure that you've set the encoding field correctly.
In the browser I set up the microphone:
navigator.mediaDevices.getUserMedia({ audio: true, video: false })
  .then(stream => {
    var options = {
      audioBitsPerSecond: 128000,
      mimeType: 'audio/webm;codecs=opus'
    };
    const mediaRecorder = new MediaRecorder(stream, options);
    mediaRecorder.start();
    ...
According to this answer Chrome only supports two codecs:
audio/webm
audio/webm;codecs=opus
Actually, that's one media format and one codec. This blog post also says that Chrome only supports the Opus codec.
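One way to check which recording formats the current browser actually accepts is MediaRecorder.isTypeSupported(). The sketch below is a minimal illustration and is not taken from the linked answer or blog post. The helper name supportedMimeTypes and the candidate list are hypothetical, and the second parameter exists only so the function can be exercised outside a browser.

```javascript
// Hypothetical helper: filters a list of candidate MIME types down to the
// ones the recorder reports as supported.
// `isSupported` defaults to the browser's MediaRecorder.isTypeSupported;
// pass a stub when running outside a browser (for example in Node).
function supportedMimeTypes(
  candidates,
  isSupported = (type) => MediaRecorder.isTypeSupported(type)
) {
  return candidates.filter((type) => isSupported(type));
}

// Candidate list is an assumption chosen for illustration only.
const candidateMimeTypes = [
  'audio/webm',
  'audio/webm;codecs=opus',
  'audio/ogg;codecs=opus',
  'audio/wav',
];

// In a browser console:
//   console.log(supportedMimeTypes(candidateMimeTypes));
```

The result depends on the browser and version, so it is worth running this in the same Chrome build that produces the recordings.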
I set up my Firebase Cloud Function:
// Imports the Google Cloud client library
const speech = require('@google-cloud/speech');
// Creates a client
const client = new speech.SpeechClient();
const gcsUri = 'gs://my-app.appspot.com/my-file';
const encoding = 'Opus';
const sampleRateHertz = 128000;
const languageCode = 'en-US';
const config = {
  encoding: encoding,
  sampleRateHertz: sampleRateHertz,
  languageCode: languageCode,
};
const audio = {
  uri: gcsUri,
};
const request = {
  config: config,
  audio: audio,
};
// Detects speech in the audio file
return response = client.recognize(request) // square brackets in ES6 construct an array
  .then(function(response) {
    console.log(response);
    ...
The audio encoding matches between the browser and the Google Speech-to-Text request. Why does Google Speech tell me that the audio encoding is bad?
I also tried using the default options in the browser, with the same error message:
navigator.mediaDevices.getUserMedia({ audio: true, video: false })
  .then(stream => {
    const mediaRecorder = new MediaRecorder(stream);
    mediaRecorder.start();
In the Firebase Cloud Function I tried leaving out the line const encoding = 'Opus'; which resulted in the error encoding is not defined. I also tried the line const encoding = ''; which resulted in the same INVALID_ARGUMENT: Invalid recognition 'config': bad encoding. error.
I'm getting a similar error message from IBM Watson Speech-to-Text. The file plays back without a problem.
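As a local sanity check on the request, the sketch below compares a config against the encoding names listed in the Speech-to-Text AudioEncoding reference. It rests on two assumptions that should be verified against the current reference: that the encoding field takes enum names such as OGG_OPUS or WEBM_OPUS rather than a codec name, and that the Opus encodings accept only 8000, 12000, 16000, 24000, or 48000 Hz. checkRecognitionConfig is a hypothetical helper, not part of @google-cloud/speech, and both lists are copied by hand and may be incomplete or out of date.

```javascript
// Encoding names as listed in the Speech-to-Text AudioEncoding reference.
// Copied by hand: treat this list as an assumption and check the current docs.
const KNOWN_ENCODINGS = [
  'ENCODING_UNSPECIFIED', 'LINEAR16', 'FLAC', 'MULAW', 'AMR', 'AMR_WB',
  'OGG_OPUS', 'SPEEX_WITH_HEADER_BYTE', 'WEBM_OPUS',
];

// Sample rates the reference lists for the Opus encodings (assumption).
const OPUS_SAMPLE_RATES = [8000, 12000, 16000, 24000, 48000];

// Hypothetical helper: returns a list of human-readable problems,
// or an empty list if none of the checks below fail.
function checkRecognitionConfig(config) {
  const problems = [];
  if (!KNOWN_ENCODINGS.includes(config.encoding)) {
    problems.push(`unknown encoding name: ${JSON.stringify(config.encoding)}`);
  }
  const isOpus =
    config.encoding === 'OGG_OPUS' || config.encoding === 'WEBM_OPUS';
  if (isOpus && !OPUS_SAMPLE_RATES.includes(config.sampleRateHertz)) {
    problems.push(`unexpected Opus sample rate: ${config.sampleRateHertz}`);
  }
  return problems;
}

// The config from the Cloud Function above.
console.log(checkRecognitionConfig({
  encoding: 'Opus',
  sampleRateHertz: 128000,
  languageCode: 'en-US',
}));
```

Note that the 128000 in the browser options is audioBitsPerSecond, a bitrate, which is a different quantity from the sampleRateHertz field in the recognition config.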
Source: https://stackoverflow.com/questions/60747880/google-cloud-speech-to-text-invalid-argument-invalid-recognition-config-ba