问题
I want to make a program that detects the note that is being played in front of the microphone. I am testing the FFT function of Naudio, but with the tests that I did in audacity it seems that FFT does not detect the pitch correctly. I played an C5, but the highest pick was at E7.
I changed the first dropdown box in the frequency analysis window to "enchanced autocorrelation" and after that the highest pick was at C5.
I googled "enchanced autocorrelation" and had no luck.
回答1:
The highest peak in an audio spectrum is not necessarily the musical pitch as a human would perceive it, especially in a sound with strong overtones. That's because pitch is a human psycho-perceptual phenomena, the brain will often deduce frequencies that aren't even present in a waveform.
Auto-correlation methods of frequency or pitch estimation (roughly, finding how far apart even a funny-looking and/or non-sinusoidal waveform repeats in time) is usually a better match for what a human would call pitch. The reason for various enhancements to the autocorrelation algorithm is that simple autocorrelation will find an near infinite number of repeating wavelengths (e.g. if it repeats every 1 second it also repeats twice every 2 seconds, etc.) So the trick is to weight the correlation to somehow statistically better match what a human would guess about the same waveform.
回答2:
You are likely getting thrown off by harmonics. Have you tried testing with a sine wave to see if your NAudio's FFT is in the ballpark?
See these references: http://cnx.org/content/m11714/latest/
http://www.gamedev.net/community/forums/topic.asp?topic_id=506592&whichpage=1
Line 48 in Spectrum.cpp
in the Audacity source code seems to be close to what you want. They also reference an IEEE paper by Tolonen and Karjalainen.
回答3:
Well, if you can live with GPLv2, why not take a peek at the Audacity source code?
http://audacity.sourceforge.net/download/beta_source
来源:https://stackoverflow.com/questions/4236475/how-can-i-do-real-time-pitch-detection-in-net