Detect human voice from audio file input

前端未结


I am trying to implement automatic voice recording functionality, similar to the Talking Tom app. I use the following code to read input from the audio recorder and analyse the

9 answers

佛祖请我去吃肉

2021-01-30 11:32

In the completely general case, this is an unsolved problem. In the practical sense...

The first step is to get as noise-free a recording as possible. As others have noted, that starts with a directional microphone focused as tightly as possible on the sound you want to keep.

The second step is filtering. As noted previously, the telephone companies did a lot of work on which frequency ranges humans actually need for speech comprehension (traditional telephony passes roughly 300 Hz to 3400 Hz). Filtering out frequencies outside that range will make the voice sound like... well, a telephone... but will get rid of more of the background noise.
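
As a rough illustration of that filtering step, here is a minimal sketch in plain Python. It assumes the traditional telephone band of about 300 Hz to 3400 Hz and chains two first-order filters, which roll off far more gently than anything a phone network would use. The function names (`telephone_bandpass` and friends) are made up for this example and do not come from any library.

```python
import math

def one_pole_highpass(samples, cutoff_hz, sample_rate):
    """First-order RC high-pass: attenuates content below cutoff_hz."""
    rc = 1.0 / (2.0 * math.pi * cutoff_hz)
    dt = 1.0 / sample_rate
    alpha = rc / (rc + dt)
    out, y, prev_x = [], 0.0, 0.0
    for x in samples:
        y = alpha * (y + x - prev_x)
        prev_x = x
        out.append(y)
    return out

def one_pole_lowpass(samples, cutoff_hz, sample_rate):
    """First-order RC low-pass: attenuates content above cutoff_hz."""
    rc = 1.0 / (2.0 * math.pi * cutoff_hz)
    dt = 1.0 / sample_rate
    alpha = dt / (rc + dt)
    out, y = [], 0.0
    for x in samples:
        y += alpha * (x - y)
        out.append(y)
    return out

def telephone_bandpass(samples, sample_rate, low_hz=300.0, high_hz=3400.0):
    """Band-pass built from a high-pass followed by a low-pass.
    The 300/3400 Hz defaults are an assumption (classic telephone band)."""
    highpassed = one_pole_highpass(samples, low_hz, sample_rate)
    return one_pole_lowpass(highpassed, high_hz, sample_rate)

def rms(samples):
    """Root-mean-square level, used here only to compare loudness."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))
```

Feeding it a 1 kHz tone and a 50 Hz mains hum at 16 kHz should leave most of the tone and only a small fraction of the hum. A steeper filter (for example a higher-order design) would reject more noise at the cost of more code.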

If you want to go beyond that, things can get really complicated. There are some algorithms which, if you can show them a sample of what you consider noise on that particular recording, will analyse it and try to subtract it out without damaging the sound you want to keep too much. This is not simple programming; if I were you I'd seriously consider buying it from someone who has already gotten it right rather than trying to reinvent/reimplement it. I don't know whether any of them are available for Android or whether the typical Android box has enough computing power to execute them in anything like realtime. (I've used SoundSoap in the studio to remove A/C noise, and it works very well.)
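
To give a feel for the "show it a sample of noise and subtract it" idea, here is a toy sketch of spectral subtraction in plain Python. It is not how SoundSoap or any commercial tool works internally (I don't know their algorithms); it just averages the magnitude spectrum of a noise-only sample and subtracts that from each frame of the recording. The frame size, the `floor` parameter, and all function names are assumptions for illustration. Real implementations use overlapping windowed frames and a proper FFT instead of this slow direct DFT.

```python
import cmath
import math

def dft(frame):
    """Direct discrete Fourier transform (slow, but dependency-free)."""
    n = len(frame)
    return [sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]

def idft(spectrum):
    """Inverse DFT, returning the real part of each sample."""
    n = len(spectrum)
    return [(sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n).real
            for t in range(n)]

def split_frames(samples, size):
    """Non-overlapping frames; a trailing partial frame is dropped."""
    return [samples[i:i + size] for i in range(0, len(samples) - size + 1, size)]

def noise_profile(noise_samples, size=64):
    """Average magnitude per frequency bin over a noise-only sample."""
    spectra = [dft(f) for f in split_frames(noise_samples, size)]
    return [sum(abs(s[k]) for s in spectra) / len(spectra) for k in range(size)]

def spectral_subtract(samples, profile, size=64, floor=0.05):
    """Subtract the noise magnitude from each bin, keeping the phase.
    'floor' keeps a small fraction of each bin so nothing goes negative."""
    out = []
    for frame in split_frames(samples, size):
        cleaned = []
        for k, value in enumerate(dft(frame)):
            mag = abs(value)
            new_mag = max(mag - profile[k], floor * mag)
            cleaned.append(cmath.rect(new_mag, cmath.phase(value)))
        out.extend(idft(cleaned))
    return out
```

Even this toy version shows the trade-off mentioned above: subtract too much and you start eating into the sound you wanted to keep.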

In fact, my own inclination would be to simplify the problem to a solved one: use the most directional and closest mike I could get, let Android do the recording... but then do the signal processing to clean it up later, using off-the-shelf tools. But I admit I'm biased because I have already invested in the latter.

天命终不由人

2021-01-30 11:36

Most of the other answers have misunderstood the question, and their replies solve problems different from yours.

You should parse the audio in your buffer, searching for frequencies in the human voice range. As soon as you detect them, someone has started talking, and you can start recording (don't forget to include the buffer too, as it contains the first part of the speech).

Search for routines that print the list of frequencies in a raw audio stream.
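
A minimal sketch of that idea in plain Python: keep the last few frames in a small ring buffer, probe each incoming frame for energy at a handful of voice-band frequencies using the Goertzel algorithm, and once a frame trips the threshold, start the recording with the buffered frames so the start of the utterance is not lost. The probe frequencies, the threshold, the number of pre-roll frames, and the function names are all assumptions for illustration. Note that this reacts to any sound in that band, not only to speech.

```python
import math
from collections import deque

# Assumed probe frequencies inside the voice band (Hz).
PROBE_FREQS = (300.0, 500.0, 1000.0, 2000.0, 3000.0)

def goertzel_power(frame, freq_hz, sample_rate):
    """Squared magnitude of one frequency component (Goertzel algorithm)."""
    coeff = 2.0 * math.cos(2.0 * math.pi * freq_hz / sample_rate)
    s_prev, s_prev2 = 0.0, 0.0
    for x in frame:
        s = x + coeff * s_prev - s_prev2
        s_prev2, s_prev = s_prev, s
    return s_prev ** 2 + s_prev2 ** 2 - coeff * s_prev * s_prev2

def looks_like_voice(frame, sample_rate, threshold):
    """True if the summed probe power, normalised by frame length, is high."""
    power = sum(goertzel_power(frame, f, sample_rate) for f in PROBE_FREQS)
    return power / len(frame) ** 2 > threshold

def record_on_voice(frames, sample_rate, threshold=0.01, preroll_frames=3):
    """Return the samples from the pre-roll buffer onward once voice-band
    energy appears, or an empty list if it never does."""
    preroll = deque(maxlen=preroll_frames)
    recording = None
    for frame in frames:
        if recording is not None:
            recording.extend(frame)
        elif looks_like_voice(frame, sample_rate, threshold):
            # Include the buffered frames: they hold the start of the speech.
            recording = [s for old in preroll for s in old]
            recording.extend(frame)
        else:
            preroll.append(frame)
    return recording or []
```

In a real recorder you would also need a rule for stopping (for example, several quiet frames in a row), which is left out here.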

伪装坚强ぢ

2021-01-30 11:38

For voice detection, try the FFT algorithm.
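
For what it's worth, here is a small sketch of how an FFT could be used for this, in plain Python. It computes the spectrum of a frame with a textbook radix-2 FFT and reports what fraction of the energy falls between 300 Hz and 3400 Hz. Those band edges and the function names are assumptions for the example, and a high ratio only means "something in the voice band", not necessarily a human voice.

```python
import cmath
import math

def fft(values):
    """Recursive radix-2 FFT; the input length must be a power of two."""
    n = len(values)
    if n == 1:
        return [complex(values[0])]
    if n % 2:
        raise ValueError("length must be a power of two")
    even = fft(values[0::2])
    odd = fft(values[1::2])
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * math.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out

def voice_band_ratio(frame, sample_rate, low_hz=300.0, high_hz=3400.0):
    """Fraction of spectral energy (DC excluded) inside [low_hz, high_hz]."""
    spectrum = fft(frame)
    n = len(frame)
    total = band = 0.0
    for k in range(1, n // 2):
        power = abs(spectrum[k]) ** 2
        total += power
        if low_hz <= k * sample_rate / n <= high_hz:
            band += power
    return band / total if total > 0 else 0.0
```

You would call `voice_band_ratio` on each frame (say 512 samples at 8 kHz) and treat frames above some tuned ratio, combined with a minimum loudness, as candidate speech.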

For noise, try the Speex library.

误落风尘

2021-01-30 11:40

The way to process the input is to use a specialised library which removes noise.

For example, Audacity (http://audacity.sourceforge.net) does noise removal.

Once you have characterised the main types of noise and removed them, mostly speech should remain.

It would be worthwhile to collect samples before the user's capture starts and after it ends, as this provides at-the-time samples of the noise in the environment. This is useful if each user faces unique background noise challenges.
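
One simple way to use those before-and-after samples, sketched in plain Python: measure their RMS level, take the louder one as the ambient noise floor, and then flag only the frames of the capture that rise clearly above it. Taking the louder of the two samples and the `margin` of 3 are assumptions chosen for this example, as are the function names.

```python
import math

def rms(samples):
    """Root-mean-square level of a list of samples (0.0 if empty)."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def ambient_noise_floor(before, after):
    """Use the louder of the two ambient samples as the noise floor."""
    return max(rms(before), rms(after))

def frames_above_noise(capture, frame_size, floor, margin=3.0):
    """Indices of frames whose RMS exceeds margin * floor."""
    loud = []
    starts = range(0, len(capture) - frame_size + 1, frame_size)
    for index, start in enumerate(starts):
        if rms(capture[start:start + frame_size]) > margin * floor:
            loud.append(index)
    return loud
```

The same ambient samples could also feed a noise-removal tool's noise profile instead of a plain loudness threshold.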

野趣味

2021-01-30 11:42

Have you considered using Microsoft's Speech Recognition API? You can use a spoken keyword to begin recording, like how they say "computer" before asking the computer something in Star Trek. Use ISpRecognizer::CreateRecoContext to load your recognition grammar and start recognition. Then implement a check with ISpPhrase to see whether you should begin recording or not.

失恋的感觉

2021-01-30 11:47
If you want a clean recording, you can:
1. Filter the noise out of the voice. You can use an FFT for that and apply low-pass, high-pass, and band-pass filters (see "Filtering using FFT and Filters").
2. After filtering, the noise is reduced and you can use voice recognition APIs.

The more you filter, the less noise remains and the better recognition works, but be careful: filtering can also remove the voice along with the noise.
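
To make that warning concrete, here is a small plain-Python sketch of frequency-domain band-pass filtering: transform a frame, zero every bin outside the chosen band, and transform back. It uses a slow direct DFT to stay dependency-free (an FFT computes the same transform faster), and the name `dft_bandpass` is made up for this example. Zeroing bins like this is crude and can cause ringing on real audio, so treat it as an illustration only.

```python
import cmath
import math

def dft(frame):
    """Direct discrete Fourier transform (same result as an FFT, slower)."""
    n = len(frame)
    return [sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]

def idft(spectrum):
    """Inverse DFT, returning the real part of each sample."""
    n = len(spectrum)
    return [(sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n).real
            for t in range(n)]

def dft_bandpass(frame, sample_rate, low_hz, high_hz):
    """Keep only the bins whose frequency lies in [low_hz, high_hz]."""
    n = len(frame)
    spectrum = dft(frame)
    for k in range(n):
        # Bins above n/2 mirror the negative frequencies.
        freq = min(k, n - k) * sample_rate / n
        if not (low_hz <= freq <= high_hz):
            spectrum[k] = 0j
    return idft(spectrum)
```

With a signal made of a 62.5 Hz hum plus voice-like components at 500 Hz and 2500 Hz, a 300 to 3400 Hz band removes only the hum, while a too-narrow 300 to 1000 Hz band also throws away the 2500 Hz component, which is exactly the "removing the voice together with the noise" problem.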

Also read more about the FFT: "Fast Fourier Transform of Human Voice".

Hope This Helps :)