Question
As far as I can tell from the following pages (bearing in mind that the Microsoft/Azure Cognitive Services "Speech Service" is currently going through a rationalisation exercise):
https://docs.microsoft.com/en-us/azure/cognitive-services/speech-service/rest-apis#speech-to-text
https://docs.microsoft.com/en-us/azure/cognitive-services/speech/home
only .wav binaries are acceptable, with anything else giving the response:
{"Message":"Unsupported audio format"}
Is there any other way to discover the acceptable audio formats, encodings, and so on, or is this it?
[Bonus points for tips on preprocessing arbitrary/.m4a audio formats with pydub in Python so that they meet the bar. This currently works for .mp3 but not for .m4a.]
Thanks!
Answer 1:
The currently supported format is single-channel (mono) WAV/PCM with a sampling rate of 16 kHz. Support for more formats and codecs will be added in the future.
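For the pydub part of the question, here is a minimal sketch of how one might convert an arbitrary input to mono 16 kHz WAV and then check the result. The helper names is_supported_wav and convert_to_supported_wav are made up for illustration, and the 16-bit sample width is an assumption (the answer above only states mono, PCM and 16 kHz). pydub needs ffmpeg (or libav) on the PATH to decode .m4a, which may well be why .mp3 works and .m4a does not on some setups.

```python
import wave


def is_supported_wav(path):
    """Return True if path is a WAV file that is mono and sampled at 16 kHz.

    Uses only the standard-library wave module, which reads uncompressed
    PCM WAV files and raises wave.Error for anything else.
    """
    try:
        with wave.open(path, "rb") as wav:
            return wav.getnchannels() == 1 and wav.getframerate() == 16000
    except (wave.Error, EOFError):
        # Not a RIFF/WAVE file, not PCM, or truncated.
        return False


def convert_to_supported_wav(src, dst):
    """Convert any audio file ffmpeg can decode into mono 16 kHz WAV."""
    # Imported here so the checker above works without pydub installed.
    from pydub import AudioSegment

    # With no explicit format, ffmpeg detects the container itself.
    audio = AudioSegment.from_file(src)
    audio = audio.set_channels(1).set_frame_rate(16000)
    # Assumption: 16-bit samples (2 bytes); the answer does not state a bit depth.
    audio = audio.set_sample_width(2)
    audio.export(dst, format="wav")
    return dst
```

Usage would be something like convert_to_supported_wav("input.m4a", "output.wav") followed by is_supported_wav("output.wav") before uploading. If the decode step fails for .m4a, checking that ffmpeg itself can open the file is probably the first thing to try.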
Source: https://stackoverflow.com/questions/51614216/what-audio-formats-are-supported-by-azure-cognitive-services-speech-service-ss