Google Speech-to-text API, InvalidArgument: 400 Must use single channel (mono)

萝らか妹 posted on 2020-01-02 13:30:09

Question


I keep getting the error InvalidArgument: 400 from Google Speech-to-Text. The problem seems to be that I am using 2-channel (stereo) audio, while the API expects a mono WAV.

Converting the file in an audio editor might work, but I cannot use an audio editor to convert a batch of files. Is there a way to change the audio type in either Python or Google Cloud?

Note: I already tried the wave module, but I kept getting an error #7 for an unrecognized file type (I couldn't read the WAV file with Python's wave module).

-ERROR- InvalidArgument: 400 Must use single channel (mono) audio, but WAV header indicates 2 channels.


Answer 1:


Assuming you're using the google-cloud-speech library, you can set the audio_channel_count property in your RecognitionConfig to the number of channels in the input audio data (it defaults to one channel, mono). You could do something like this:

from google.cloud import speech
client = speech.SpeechClient()
results = client.recognize(
    audio=speech.types.RecognitionAudio(
        uri='gs://your-bucket/recording.wav',
    ),
    config=speech.types.RecognitionConfig(
        encoding='LINEAR16',
        language_code='en-US',
        sample_rate_hertz=44100,
        audio_channel_count=2,
    ),
)

See the API doc for further info.




Answer 2:


You can use the function below to read the channel count and frame rate dynamically. It takes the audio file path and returns the frame rate and the number of channels:

import wave

def frame_rate_channel(audio_file_name):
    print(audio_file_name)
    # Read the sample rate and channel count from the WAV header.
    with wave.open(audio_file_name, "rb") as wave_file:
        frame_rate = wave_file.getframerate()
        channels = wave_file.getnchannels()
    return frame_rate, channels
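Knowing the channel count only tells you that a file is stereo; the question also asks how to convert a batch of files to mono without an audio editor. Below is a minimal sketch of one way to do that with the standard library alone. The helper name stereo_to_mono is hypothetical, and the sketch assumes uncompressed 16-bit PCM WAV input. The wave module only reads uncompressed PCM, so a file it rejects (as in the question) would need a different tool to decode it first.

```python
import array
import sys
import wave


def stereo_to_mono(src_path, dst_path):
    """Downmix a 16-bit PCM WAV file to mono by averaging its channels.

    Hypothetical helper for illustration; assumes uncompressed 16-bit PCM.
    """
    with wave.open(src_path, "rb") as src:
        channels = src.getnchannels()
        sample_width = src.getsampwidth()
        frame_rate = src.getframerate()
        frames = src.readframes(src.getnframes())

    if sample_width != 2:
        raise ValueError("this sketch only handles 16-bit PCM samples")

    samples = array.array("h")  # signed 16-bit integers
    samples.frombytes(frames)
    if sys.byteorder == "big":
        samples.byteswap()  # WAV sample data is little-endian

    # Samples are interleaved (L, R, L, R, ...); average each frame.
    mono = array.array(
        "h",
        (
            sum(samples[i:i + channels]) // channels
            for i in range(0, len(samples), channels)
        ),
    )
    if sys.byteorder == "big":
        mono.byteswap()

    with wave.open(dst_path, "wb") as dst:
        dst.setnchannels(1)
        dst.setsampwidth(sample_width)
        dst.setframerate(frame_rate)
        dst.writeframes(mono.tobytes())
```

For a batch, the same function could be called in a loop over the files in a directory, writing each result to a new path so the originals are kept. Averaging the channels is one simple downmix choice; keeping only the left or right channel would also produce a mono file.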



Source: https://stackoverflow.com/questions/55106509/google-speech-to-text-api-invalidargument-400-must-use-single-channel-mono
