Question
I'm using the sample code to transcribe a conversation, but in the recognized event I always get $ref$ when calling e.Result.UserId.
I use 16-bit samples, a 16 kHz sample rate, and a single-channel (mono) format for voice signatures, and 32-bit samples, a 32 kHz sample rate, and a single-channel (mono) format for transcribing conversations.
All code from: https://docs.microsoft.com/en-us/azure/cognitive-services/speech-service/how-to-use-conversation-transcription-service
Are there any ideas? Or are there .wav sample files I can use?
UPD
It seems the audio was not in the right format. It should be 16-bit, 16 kHz, 8 channels (Stereo Left=1, Stereo Right=2, Mono=3, Mono=4, Mono=5, Mono=6, Mono=7, Silenced Mono=8).
Here you can find enrollment_audio_steve.wav, enrollment_audio_katie.wav and the conversation file katiesteve.wav. They are in the correct format. However, I could not create a signature from enrollment_audio_katie.wav, so it works only with Steve.
It still seems that it only works with SpeechSDK devices, but I was able to record my own audio based on that format.
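A quick way to confirm whether a .wav file matches the formats mentioned above is to read its header before sending it to the service. The sketch below is not from the linked docs. It uses Python's standard `wave` module, and `wav_format`, `check_wav` and the two format constants are hypothetical helper names chosen for illustration. The constants just restate the numbers quoted in this thread (mono 16-bit 16 kHz for signatures, 8-channel 16-bit 16 kHz for conversations). Note that older Python versions may refuse multichannel files whose header uses WAVE_FORMAT_EXTENSIBLE.

```python
import wave

# Formats quoted in this thread, as (channels, bits per sample, sample rate in Hz).
# These are assumptions taken from the post, not values read from the SDK.
SIGNATURE_FORMAT = (1, 16, 16000)
CONVERSATION_FORMAT = (8, 16, 16000)


def wav_format(path):
    """Return (channels, bits_per_sample, sample_rate_hz) read from a WAV header."""
    with wave.open(path, "rb") as wav:
        # getsampwidth() is in bytes, so multiply by 8 to get bits per sample.
        return wav.getnchannels(), wav.getsampwidth() * 8, wav.getframerate()


def check_wav(path, expected):
    """Return (matches, actual_format) for the file at `path`."""
    actual = wav_format(path)
    return actual == expected, actual
```

For example, `check_wav("katiesteve.wav", CONVERSATION_FORMAT)` returns a flag plus the format actually found in the file, which makes a mismatch such as 32-bit or 32 kHz mono input easy to spot.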
Source: https://stackoverflow.com/questions/57412753/azure-speech-to-text-conversation-transcribing-userid-always-return-ref