How to receive answer from Google Assistant as a String, not as an audio stream

问题

I am using the python libraries from the Assistant SDK for speech recognition via gRPC. I have the speech recognized and returned as a string calling the method resp.result.spoken_request_text from \googlesamples\assistant\__main__.py and I have the answer as an audio stream from the assistant API with the method resp.audio_out.audio_data also from \googlesamples\assistant\__main__.py

I would like to know if it is possible to have the answer from the service as a string as well (hoping it is available in the service definition or that it could be included), and how I could access/request the answer as string.

Thanks in advance.

回答1:

Currently (Assistant SDK Developer Preview 1), there is no direct way to do this. You can probably feed the audio stream into a Speech-to-Text system, but that really starts getting silly.

Speaking to the engineers on this subject while at Google I/O, they indicated that there are some technical complications on their end to doing this, but they understand the use cases. They need to see questions like this to know that people want the feature.

Hopefully it will make it into an upcoming Developer Preview.