Question
I am trying to send an audio stream from the UI to a Python API, still as a stream. I need the Python Azure Speech logic to convert the speech to text, but I am not sure how to use the pull/push audio input stream for speech to text.
Answer 1:
The Cognitive Services Speech SDK has a sample for this.
For a pull stream, refer to speech_recognition_with_pull_stream(); for a push stream, refer to speech_recognition_with_push_stream().
Hope it helps.
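As a rough sketch of the pull-stream route, assuming raw 16 kHz, 16-bit mono PCM and the azure-cognitiveservices-speech package: the SDK calls read() on a callback whenever it wants more audio, and a return value of 0 signals the end of the stream. The names fill_buffer and make_pull_stream are hypothetical helpers for illustration, not part of the SDK or its samples.

```python
import io


def fill_buffer(source, buffer: memoryview) -> int:
    """Copy up to buffer.nbytes bytes from a binary file-like source.

    Returns the number of bytes copied; 0 means end of stream.
    """
    data = source.read(buffer.nbytes)
    buffer[:len(data)] = data
    return len(data)


def make_pull_stream(source):
    """Wrap a binary file-like object in a PullAudioInputStream (hypothetical helper)."""
    # Imported here so the helper above can be used without the SDK installed.
    import azure.cognitiveservices.speech as speechsdk

    class SourceCallback(speechsdk.audio.PullAudioInputStreamCallback):
        def __init__(self):
            super().__init__()

        def read(self, buffer: memoryview) -> int:
            # Called by the SDK each time it needs more audio.
            return fill_buffer(source, buffer)

        def close(self) -> None:
            source.close()

    # Assumption: the incoming audio is raw 16 kHz, 16-bit, mono PCM.
    stream_format = speechsdk.audio.AudioStreamFormat(
        samples_per_second=16000, bits_per_sample=16, channels=1
    )
    callback = SourceCallback()
    stream = speechsdk.audio.PullAudioInputStream(callback, stream_format)
    return stream, callback
```

The returned stream can be passed to speechsdk.audio.AudioConfig(stream=...) just like a push stream. Keep a reference to the callback for as long as recognition is running.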
Answer 2:
In my case I receive an audio stream from another source. When the connection with my application is established (on receipt of the first package), a PushAudioInputStream is created. Every package received is then pushed to the SDK through this stream, so this is the push-stream variant of speech recognition. See the code snippet below, which has worked for my case.
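import base64
import azure.cognitiveservices.speech as speechsdk

if newConnection:
    # First package on this connection: create the push stream and start recognition on it.
    stream = speechsdk.audio.PushAudioInputStream()
    speech_recognition_with_push_stream(stream)
# Every package: decode the base64 payload and push the raw audio to the SDK.
stream_data = base64.b64decode(data)
stream.write(stream_data)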
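To make the flow above more concrete, here is a minimal sketch of what the setup and per-package steps might look like. push_package and start_push_recognition are hypothetical names, the key and region arguments are placeholders for your own subscription, and the audio is assumed to be in the SDK's default stream format (16 kHz, 16-bit, mono PCM).

```python
import base64


def push_package(stream, data: str) -> int:
    """Decode one base64 package and write it to the push stream.

    Returns the number of raw audio bytes written.
    """
    chunk = base64.b64decode(data)
    stream.write(chunk)
    return len(chunk)


def start_push_recognition(key: str, region: str, on_text):
    """Create a push stream and start continuous recognition on it (hypothetical helper)."""
    # Imported here so push_package can be used without the SDK installed.
    import azure.cognitiveservices.speech as speechsdk

    speech_config = speechsdk.SpeechConfig(subscription=key, region=region)
    stream = speechsdk.audio.PushAudioInputStream()
    audio_config = speechsdk.audio.AudioConfig(stream=stream)
    recognizer = speechsdk.SpeechRecognizer(
        speech_config=speech_config, audio_config=audio_config
    )
    # Hand each final recognition result to the caller's callback.
    recognizer.recognized.connect(lambda evt: on_text(evt.result.text))
    recognizer.start_continuous_recognition()
    return stream, recognizer
```

When the connection ends, calling stream.close() signals the end of the audio, and recognizer.stop_continuous_recognition() stops the session.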
Source: https://stackoverflow.com/questions/60456894/azure-speech-sdk-speech-to-text-from-stream-using-python