Open source code for voice detection and discrimination

囚心锁ツ 2021-01-31 17:49

I have 15 audio tapes, one of which I believe contains an old recording of my grandmother and myself talking. A quick attempt to find the right place didn't turn it up. I don

8 answers
  •  长发绾君心
    2021-01-31 18:34

    If you are familiar with Java, you could feed the audio files through Minim and calculate some FFT spectra. Silence can be detected by defining a minimum level for the amplitude of the samples (to rule out noise). To separate speech from music, the FFT spectrum of a time window can be used. Speech uses some very distinct frequency bands called formants - especially for vowels - while music is more evenly distributed across the frequency spectrum.
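To make the idea concrete, here is a rough, library-free sketch in Python (not Minim's API). It assumes each frame is a short list of samples scaled to [-1, 1]. The function names, the `silence_rms` and `flatness_split` thresholds, and the use of spectral flatness as the "how evenly is the energy spread" measure are all my own assumptions for illustration and would need tuning against real tape recordings.

```python
import math

def rms(frame):
    """Root-mean-square amplitude of one frame of samples."""
    return math.sqrt(sum(s * s for s in frame) / len(frame))

def power_spectrum(frame):
    """Naive O(n^2) DFT power spectrum, skipping the DC bin.
    A real FFT library would be used for anything but short frames."""
    n = len(frame)
    spectrum = []
    for k in range(1, n // 2):
        re = sum(frame[t] * math.cos(2 * math.pi * k * t / n) for t in range(n))
        im = -sum(frame[t] * math.sin(2 * math.pi * k * t / n) for t in range(n))
        spectrum.append(re * re + im * im)
    return spectrum

def spectral_flatness(power):
    """Geometric mean / arithmetic mean of the power spectrum.
    Near 0: energy concentrated in a few bands. Near 1: evenly spread."""
    eps = 1e-12  # avoids log(0) on empty bins
    log_mean = sum(math.log(p + eps) for p in power) / len(power)
    return math.exp(log_mean) / (sum(power) / len(power) + eps)

def classify_frame(frame, silence_rms=0.01, flatness_split=0.1):
    """Label a frame 'silence', 'concentrated' or 'spread'.
    Both thresholds are illustrative guesses, not calibrated values."""
    if rms(frame) < silence_rms:
        return "silence"
    flatness = spectral_flatness(power_spectrum(frame))
    return "concentrated" if flatness < flatness_split else "spread"
```

Following the reasoning above, "concentrated" frames would be the speech candidates and "spread" frames the music candidates, but treat that mapping as a starting point to check by ear rather than a reliable classifier.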

    You probably won't get a 100% separation of the speech/music blocks, but it should be good enough to tag the files and only listen to the interesting parts.
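For the tagging step, one possible approach is to collapse per-frame labels into time-stamped blocks so you can jump straight to the candidate sections. This is only a sketch: `label_blocks` and the label strings are hypothetical names, and it assumes you already have one label per fixed-length frame.

```python
from itertools import groupby

def label_blocks(labels, frame_seconds):
    """Collapse runs of identical frame labels into
    (start_seconds, end_seconds, label) blocks."""
    blocks = []
    position = 0  # index of the first frame in the current run
    for label, run in groupby(labels):
        length = len(list(run))
        start = position * frame_seconds
        end = (position + length) * frame_seconds
        blocks.append((start, end, label))
        position += length
    return blocks

# Example: six half-second frames
frames = ["silence", "silence", "speech", "speech", "speech", "music"]
for start, end, label in label_blocks(frames, 0.5):
    print(f"{start:.1f}s - {end:.1f}s: {label}")
```

In practice you would probably also want to drop very short blocks, since a single misclassified frame otherwise splits a long section in two.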

    http://code.compartmental.net/tools/minim/

    http://en.wikipedia.org/wiki/Formant
