SFSpeechRecognizer - detect end of utterance

别说谁变了你拦得住时间么 提交于 2019-11-28 09:04:32
Joe Aspara

It seems that isFinal flag doesn't became true when user stops talking as expected. I guess this is a wanted behaviour by Apple, because the event "User stops talking" is an undefined event.

I believe that the easiest way to achieve your goal is to do the following:

  • You have to estabilish an "interval of silence". That means if the user doesn't talk for a time greater than your interval, he has stopped talking (i.e. 2 seconds).

  • Create a Timer at the beginning of the audio session:

var timer = NSTimer.scheduledTimerWithTimeInterval(2, target: self, selector: "didFinishTalk", userInfo: nil, repeats: false)

  • when you get new transcriptions in recognitionTaskinvalidate and restart your timer

    timer.invalidate() timer = NSTimer.scheduledTimerWithTimeInterval(2, target: self, selector: "didFinishTalk", userInfo: nil, repeats: false)

  • if the timer expires this means the user doesn't talk from 2 seconds. You can safely stop Audio Session and exit

Based on my test on iOS10, when shouldReportPartialResults is set to false, you have to wait 60 seconds to get the result.

I am using Speech to text in an app currently and it is working fine for me. My recognitionTask block is as follows:

recognitionTask = speechRecognizer?.recognitionTask(with: recognitionRequest, resultHandler: { (result, error) in
        var isFinal = false

        if let result = result, result.isFinal {
            print("Result: \(result.bestTranscription.formattedString)")
            isFinal = result.isFinal
            completion(result.bestTranscription.formattedString, nil)
        }

        if error != nil || isFinal {
            self.audioEngine.stop()
            inputNode.removeTap(onBus: 0)

            self.recognitionRequest = nil
            self.recognitionTask = nil
            completion(nil, error)
        }
    })
易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!