Detect human voice from audio file input

清歌不尽 2021-01-30 11:25

I am trying to implement automatic voice recording functionality, similar to the Talking Tom app. I use the following code to read input from the audio recorder and analyse the

9 Answers
  •  既然无缘
    2021-01-30 11:49

    I tried to solve a similar problem on Windows. One thing I learned fast: simple frequency analysis with a fast Fourier transform is not enough. Lots of noises fall in the frequency range of the human voice, from taps on the microphone to clapping hands, and even fairly sophisticated filtering won't separate them from speech. The easiest approach I've found is to send the audio to a cloud speech-to-text API and ask it to transcribe it. If the API returns a transcript of reasonable length, I continue recording; otherwise, I stop. This does require sampling some audio and sending it to a cloud provider.
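
A minimal Python sketch of that "keep recording while the transcript is long enough" loop might look like the following. It is not tied to any particular provider: `transcribe` is a hypothetical callable standing in for whatever cloud speech-to-text call you use, and the `min_chars` threshold and the chunking of the audio are assumptions made for illustration.

```python
from typing import Callable, Iterable, List


def looks_like_speech(transcript: str, min_chars: int = 4) -> bool:
    """Treat a chunk as speech if its transcript has enough non-space characters."""
    return len("".join(transcript.split())) >= min_chars


def record_while_speaking(
    chunks: Iterable[bytes],
    transcribe: Callable[[bytes], str],  # hypothetical wrapper around a cloud API
    min_chars: int = 4,                  # assumed threshold, tune for your use case
) -> List[bytes]:
    """Collect audio chunks until one fails to transcribe to a usable string."""
    kept: List[bytes] = []
    for chunk in chunks:
        if not looks_like_speech(transcribe(chunk), min_chars):
            break  # taps, claps, silence: nothing transcribable, so stop
        kept.append(chunk)
    return kept
```

In practice `chunks` would come from the recorder in short windows (a second or two each), and `transcribe` would upload each window and return the recognised text, so each decision costs one network round trip.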
