Question
I have a working tone detector which uses an FFT to determine whether a tone (or tone pair) of a particular frequency is present in an audio stream (if sufficiently above the noise floor). What method could I use to more precisely locate the onset time and duration of that tone? I am looking for something far more precise than the FFT frame duration (about 50 ms). The tone is assumed to be much longer than an FFT frame.
Answer 1:
If the particular frequency is known ahead of time, you could design a bandpass filter centered around that frequency and then just use an energy detector on the output. You'd have to account for the bulk delay through the filter, and probably also the rise and fall times of its transient response (the time the output takes to settle to its steady-state level after the tone starts or stops).
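A minimal sketch of the bandpass-plus-energy-detector idea, in plain Python. Everything specific here is an assumption made for illustration and not something from the question: the 8 kHz sample rate, the 1 kHz tone, the second-order filter with Q = 10, the 2 ms smoothing time constant, and the thresholds. The release threshold is set lower than the onset threshold so that ripple in the envelope does not produce a false "tone ended" right after the onset.

```python
import math

def bandpass_biquad(samples, fs, f0, q):
    """Second-order band-pass (RBJ cookbook form, unity gain at f0)."""
    w0 = 2.0 * math.pi * f0 / fs
    alpha = math.sin(w0) / (2.0 * q)
    a0 = 1.0 + alpha
    b0, b2 = alpha / a0, -alpha / a0          # b1 is zero for this filter
    a1, a2 = -2.0 * math.cos(w0) / a0, (1.0 - alpha) / a0
    x1 = x2 = y1 = y2 = 0.0
    out = []
    for x in samples:
        y = b0 * x + b2 * x2 - a1 * y1 - a2 * y2
        x2, x1 = x1, x
        y2, y1 = y1, y
        out.append(y)
    return out

def energy_envelope(samples, fs, tau):
    """Squared signal smoothed by a one-pole low-pass (time constant tau, seconds)."""
    k = 1.0 - math.exp(-1.0 / (tau * fs))
    env, e = [], 0.0
    for x in samples:
        e += k * (x * x - e)
        env.append(e)
    return env

def find_tone_edges(env, on_threshold, off_threshold):
    """First index above on_threshold, then first later index below off_threshold."""
    onset = next((i for i, e in enumerate(env) if e > on_threshold), None)
    if onset is None:
        return None, None
    offset = next((i for i in range(onset, len(env)) if env[i] < off_threshold), None)
    return onset, offset

# Synthetic input (assumed): unit-amplitude 1 kHz tone from 0.25 s to 0.75 s.
fs, f0 = 8000, 1000.0
signal = [math.sin(2 * math.pi * f0 * n / fs) if 2000 <= n < 6000 else 0.0
          for n in range(8000)]
env = energy_envelope(bandpass_biquad(signal, fs, f0, q=10.0), fs, tau=0.002)
# A unit sine has mean energy 0.5, so 0.25 and 0.125 are a half and a quarter of it.
onset, offset = find_tone_edges(env, on_threshold=0.25, off_threshold=0.125)
onset_s, offset_s = onset / fs, offset / fs
```

Both estimates should come out somewhat later than the true edges, because the filter and the smoother each need time to respond. That lag is the delay the answer says you have to account for; a narrower filter rejects more noise but lags more.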
If you're using the FFT output to actually detect the tone, and you have sufficient memory to keep the recent past samples, you could get a rough estimate of the onset from the FFT, go back a few hundred milliseconds before it, and start mixing the samples with a complex sinusoid at the detected frequency. Then run the mixed samples through a low-pass filter and watch the magnitude of its output. Your tone detection, mixer, and LPF frequency resolutions/bandwidths will have to match, and again you'll need to consider the LPF characteristics.
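The mix-and-filter approach could look roughly like the sketch below. It assumes a moving average as the low-pass filter, an 8 kHz sample rate, and a 1 kHz tone of known amplitude; none of these come from the question. A moving average is convenient here because its response to a tone switching on is a linear ramp, so a threshold at half the steady-state level is crossed about half a window after the true edge, and that delay can simply be subtracted.

```python
import cmath
import math

def baseband_magnitude(samples, fs, f0, window):
    """Mix with exp(-j*2*pi*f0*n/fs), low-pass with a moving average of
    `window` samples, and return the magnitude of the result per sample."""
    acc = 0j
    history = [0j] * window               # circular buffer of mixed samples
    env = []
    for n, x in enumerate(samples):
        z = x * cmath.exp(-2j * math.pi * f0 * n / fs)
        slot = n % window
        acc += z - history[slot]          # running sum over the last `window` samples
        history[slot] = z
        env.append(abs(acc) / window)
    return env

def find_tone_edges(env, threshold, delay):
    """Threshold crossings, shifted back by the assumed filter delay in samples."""
    onset = next((i for i, e in enumerate(env) if e > threshold), None)
    if onset is None:
        return None, None
    offset = next((i for i in range(onset + 1, len(env)) if env[i] < threshold), None)
    return onset - delay, (None if offset is None else offset - delay)

# Synthetic input (assumed): unit-amplitude 1 kHz tone from 0.25 s to 0.75 s.
fs, f0 = 8000, 1000.0
signal = [math.sin(2 * math.pi * f0 * n / fs) if 2000 <= n < 6000 else 0.0
          for n in range(8000)]

window = 40                               # 5 ms, a whole number of tone periods
env = baseband_magnitude(signal, fs, f0, window)
# A unit sine mixes down to magnitude 0.5, so 0.25 is the half-level threshold.
onset, offset = find_tone_edges(env, threshold=0.25, delay=window // 2)
onset_s, offset_s = onset / fs, offset / fs
duration_s = offset_s - onset_s
```

In a real stream the tone amplitude is not known in advance, so the half-level threshold would have to be derived from the measured steady-state magnitude, and noise will blur the crossing. Choosing the window as a whole number of tone periods makes the moving average cancel the double-frequency product of the mixer.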
Answer 2:
Sounds like DTMF detection. The standard technique for this is the Goertzel algorithm. You need one Goertzel detector for each frequency of interest, so you need to know the frequencies a priori.
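As a rough illustration, a single Goertzel detector can be run over short, overlapping blocks to localize the edges more finely than a 50 ms FFT frame. The block length (5 ms), hop (1 ms), sample rate, tone frequency, and threshold below are all assumed values for the sketch. For a tone pair you would run one such detector per frequency.

```python
import math

def goertzel_power(block, fs, f0):
    """Squared magnitude of the block's DFT evaluated at frequency f0."""
    coeff = 2.0 * math.cos(2.0 * math.pi * f0 / fs)
    s1 = s2 = 0.0
    for x in block:
        s0 = x + coeff * s1 - s2
        s2, s1 = s1, s0
    return s1 * s1 + s2 * s2 - coeff * s1 * s2

def sliding_goertzel(samples, fs, f0, block_len, hop):
    """Return (block start index, power) for each overlapping block."""
    return [(start, goertzel_power(samples[start:start + block_len], fs, f0))
            for start in range(0, len(samples) - block_len + 1, hop)]

# Synthetic input (assumed): unit-amplitude 1 kHz tone from 0.25 s to 0.75 s.
fs, f0 = 8000, 1000.0
signal = [math.sin(2 * math.pi * f0 * n / fs) if 2000 <= n < 6000 else 0.0
          for n in range(8000)]

block_len, hop = 40, 8                    # 5 ms blocks, stepped every 1 ms
powers = sliding_goertzel(signal, fs, f0, block_len, hop)

# A block full of unit-amplitude tone gives power (block_len / 2) ** 2.
# A quarter of that corresponds to a block that is about half filled with tone,
# so the edge is estimated at the middle of the first block that crosses it.
threshold = 0.25 * (block_len / 2) ** 2
first_on = next(i for i, (_, p) in enumerate(powers) if p > threshold)
first_off = next(i for i in range(first_on + 1, len(powers))
                 if powers[i][1] < threshold)
onset_s = (powers[first_on][0] + block_len // 2) / fs
offset_s = (powers[first_off][0] + block_len // 2) / fs
```

The time resolution of this sketch is set by the hop, and the block length trades noise rejection against how sharply an edge shows up. As in the other approaches, the threshold assumes the tone level is known or has been measured from the steady-state portion.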
Source: https://stackoverflow.com/questions/4456855/precise-tone-onset-duration-measurement