Open source code for voice detection and discrimination


囚心锁ツ 2021-01-31 17:49

I have 15 audio tapes, one of which I believe contains an old recording of my grandmother and myself talking. A quick attempt to find the right place didn't turn it up. I don

8 answers

长发绾君心 (original poster)

2021-01-31 18:34

If you are familiar with Java, you could try feeding the audio files through Minim and calculating some FFT spectra. Silence can be detected by defining a minimum level for the amplitude of the samples (to rule out noise). To separate speech from music, the FFT spectrum of a time window can be used. Speech concentrates its energy in a few distinct frequency bands called formants, especially for vowels, whereas music is more evenly distributed across the frequency spectrum.

You probably won't get a 100% separation of the speech/music blocks, but it should be good enough to tag the files and listen only to the interesting parts.
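As a rough illustration of the two checks described in this answer, here is a minimal sketch in plain Java. It is not Minim code: it uses a naive DFT instead of Minim's FFT so that it runs without any library, and it assumes the audio has already been decoded into `double` samples in the range -1 to 1. The class name `FrameTagger`, the spectral flatness measure (geometric mean over arithmetic mean of the power spectrum), and the thresholds `0.01` and `0.3` are all assumptions chosen for the example and would need tuning on real tape recordings. A "peaked" frame is only a candidate for speech under the formant heuristic; a sustained musical note would also look peaked.

```java
// Hypothetical helper, not part of Minim: labels one frame of audio as
// "silence", "peaked" (energy in a few bands) or "flat" (energy spread out).
class FrameTagger {

    // Root-mean-square amplitude of samples[start .. start+len-1].
    static double rms(double[] samples, int start, int len) {
        double sum = 0.0;
        for (int i = start; i < start + len; i++) {
            sum += samples[i] * samples[i];
        }
        return Math.sqrt(sum / len);
    }

    // Power spectrum from a naive O(n^2) DFT, bins 1 .. len/2-1 (DC skipped).
    // A real tool would use an FFT; this keeps the sketch dependency-free.
    static double[] powerSpectrum(double[] samples, int start, int len) {
        int bins = len / 2 - 1;
        double[] power = new double[bins];
        for (int k = 1; k <= bins; k++) {
            double re = 0.0, im = 0.0;
            for (int n = 0; n < len; n++) {
                double angle = 2.0 * Math.PI * k * n / len;
                re += samples[start + n] * Math.cos(angle);
                im -= samples[start + n] * Math.sin(angle);
            }
            power[k - 1] = re * re + im * im;
        }
        return power;
    }

    // Spectral flatness: geometric mean / arithmetic mean of the power bins.
    // Close to 0 for a spectrum with a few strong peaks, close to 1 for a flat one.
    static double flatness(double[] power) {
        double logSum = 0.0, sum = 0.0;
        for (double p : power) {
            logSum += Math.log(p + 1e-12); // small offset avoids log(0)
            sum += p;
        }
        double geometric = Math.exp(logSum / power.length);
        double arithmetic = sum / power.length;
        return geometric / (arithmetic + 1e-12);
    }

    // silenceRms and flatnessCut are assumed thresholds, to be tuned by ear.
    static String classify(double[] samples, int start, int len,
                           double silenceRms, double flatnessCut) {
        if (rms(samples, start, len) < silenceRms) {
            return "silence";
        }
        double f = flatness(powerSpectrum(samples, start, len));
        return f < flatnessCut ? "peaked" : "flat";
    }

    public static void main(String[] args) {
        // Synthetic test frame: a single tone, standing in for real decoded audio.
        int len = 256;
        double[] tone = new double[len];
        for (int n = 0; n < len; n++) {
            tone[n] = 0.5 * Math.sin(2.0 * Math.PI * 8 * n / len);
        }
        System.out.println(classify(tone, 0, len, 0.01, 0.3));
    }
}
```

Running `classify` over consecutive frames of each tape and writing down where the label changes would give the kind of rough tagging suggested above, so that only the non-silent, speech-like stretches need to be listened to.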

http://code.compartmental.net/tools/minim/

http://en.wikipedia.org/wiki/Formant
