Managing text-to-speech and speech recognition at the same time in iOS

Submitted by 自古美人都是妖i on 2019-12-21 23:02:24

Question


I'd like my iOS app to use text-to-speech to read the user some information that it receives from a server, and I'd also like to let the user stop that speech with a voice command. I have tried speech recognition frameworks for iOS such as OpenEars, and the problem I find is that the recognizer listens to and detects what the app itself is "saying", which interferes with the recognition of the user's voice commands.

Has anybody dealt with this scenario in iOS and found a solution? Thanks in advance.


Answer 1:


This is not trivial to implement. Unfortunately, on iOS and other platforms the microphone records the sound that is playing through the speaker. The only practical choice is to use a headset; in that case speech recognition can keep listening for input. In OpenEars, recognition is disabled during TTS unless a headset is plugged in.
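The policy described above (pause recognition while TTS is playing, unless a headset is connected) can be sketched as a small state object. This is only an illustration of the rule, not OpenEars code: the class `RecognitionGate` and its method names are hypothetical, and in a real app the headset state and the TTS start/finish events would come from the platform's audio APIs.

```python
class RecognitionGate:
    """Hypothetical policy object: decides whether the recognizer may listen."""

    def __init__(self, headset_connected=False):
        self.headset_connected = headset_connected
        self.tts_playing = False

    def should_listen(self):
        # Assumption taken from the answer: with a headset, recognition can
        # keep listening during TTS; on the loudspeaker it has to pause.
        return self.headset_connected or not self.tts_playing

    def tts_started(self):
        """Call when speech output begins; returns whether to keep listening."""
        self.tts_playing = True
        return self.should_listen()

    def tts_finished(self):
        """Call when speech output ends; returns whether to keep listening."""
        self.tts_playing = False
        return self.should_listen()
```

With this gate, a voice command can only interrupt the speech when a headset is in use; on the loudspeaker the recognizer resumes once the speech has finished.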

If you still want to implement this feature, which is called "barge-in", you have to do the following:

  1. Store the audio you play through the speaker so it can serve as a reference.
  2. Implement a noise cancellation algorithm that effectively removes the played audio from the recording. You can use cross-correlation to find the proper offset in the recording and spectral subtraction to remove the audio.
  3. Recognize the speech in the remaining signal.
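As a rough illustration of the second step, the sketch below finds the offset of the played audio inside the recording by cross-correlation and then applies spectral subtraction to one frame. It is a minimal, language-agnostic sketch in plain Python, not OpenEars code and not production-ready: the function names (`find_offset`, `spectral_subtract`), the single-frame processing, and the naive DFT are all assumptions chosen for readability. A real implementation would use an FFT, overlapping windowed frames, and would have to cope with speaker and room distortion.

```python
import cmath
import math


def find_offset(reference, recording, max_lag):
    """Return the lag (in samples) where `reference` best matches `recording`,
    using plain (unnormalized) cross-correlation over lags 0..max_lag."""
    best_lag, best_score = 0, float("-inf")
    for lag in range(max_lag + 1):
        n = min(len(reference), len(recording) - lag)
        if n <= 0:
            break
        score = sum(reference[i] * recording[lag + i] for i in range(n))
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


def dft(x):
    """Naive O(n^2) discrete Fourier transform (fine for a short demo frame)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(spectrum):
    """Inverse of `dft`, returning the real part of each sample."""
    n = len(spectrum)
    return [(sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n)
                 for k in range(n)) / n).real
            for t in range(n)]


def spectral_subtract(recording_frame, reference_frame):
    """Subtract the reference magnitude spectrum from the recording frame,
    clamping at zero and keeping the phase of the recording."""
    rec_spec = dft(recording_frame)
    ref_spec = dft(reference_frame)
    cleaned = []
    for r, p in zip(rec_spec, ref_spec):
        magnitude = max(abs(r) - abs(p), 0.0)
        cleaned.append(cmath.rect(magnitude, cmath.phase(r)))
    return idft(cleaned)


# Demo with made-up samples: the played audio shows up 5 samples late.
played = [0.0, 1.0, 0.5, -0.7, 0.2, -1.0, 0.3, 0.8]
recorded = [0.0] * 5 + played + [0.0] * 3
lag = find_offset(played, recorded, max_lag=8)
aligned = recorded[lag:lag + len(played)]
residual = spectral_subtract(aligned, played)
print(lag, [round(v, 6) for v in residual])
```

In the demo the recording contains only the played audio, so the residual is close to silence; with a user speaking over the playback, the residual would be what gets passed to the recognizer in step 3.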

It is not possible to do that without significant modification of the OpenEars sources.

A related question is "Android Speech Recognition while music is playing".



Source: https://stackoverflow.com/questions/37066457/managing-text-to-speech-and-speech-recognition-at-same-time-in-ios
