问题
I want to take two sounds that contain a dominant frequency and say 'this one is higher than this one'. I could do FFT, find the frequency with the greatest amplitude of each and compare them. I'm wondering if, as I have a specific task, there may be a simpler algorithm.
The sounds are quite dirty with many frequencies, but contain a clear dominant pitch. They aren't perfectly produced sine waves.
回答1:
Given that the sounds are quite dirty, I would suggest starting to develop the algorithm with the output of an FFT as it'll be much simpler to diagnose any problems. Then when you're happy that it's working you can think about optimising/simplifying.
As a rule of thumb when developing this kind of numeric algorithm, I always try to operate first in the most relevant domain (in this case you're interested in frequencies, so analyse in frequency space) at the start, and once everything is behaving itself consider shortcuts/optimisations. That way you can test the latter solution against the best-performing former.
回答2:
In the general case, decent pitch detection/estimation generally requires a more sophisticated algorithm than looking at FFT peaks, not a simpler algorithm.
回答3:
There are a variety of pitch detection methods ranging in sophistication from counting zero-crossing (which obviously won't work in your case) to extremely complex algorithms.
While the frequency domain methods seems most appropriate, it's not as simple as "taking the FFT". If your data is very noisy, you may have spurious peaks that are higher than what you would consider to be the dominant frequency. One solution is use window overlapping segments of your signal, and do STFTs, and average the results. But this raises more questions: how big should the windows be? In this case, it depends on how far apart you expect those dominant peaks to be, how long your recordings are, etc. (Note: FFT methods can resolve to better than one-bin size by taking into account phase information. In this case, you would have to do something more complex than averaging all your FFT windows together).
Another approach would be a time-domain method, such as YIN:
http://recherche.ircam.fr/equipes/pcm/cheveign/pss/2002_JASA_YIN.pdf
Wikipedia discusses some more methods:
http://en.wikipedia.org/wiki/Pitch_detection_algorithm
You can also explore some more methods in chapter 9 of this book:
http://www.amazon.com/DAFX-Digital-Udo-ouml-lzer/dp/0471490784
You can get matlab sourcecode for yin from chapter 9 of that book here:
http://www2.hsu-hh.de/ant/dafx2002/DAFX_Book_Page_2nd_edition/matlab.html
来源:https://stackoverflow.com/questions/11431102/quickest-and-easiest-algorithm-for-comparing-the-frequency-content-of-two-sounds