Question
I am writing a music player and I want to normalize the audio volume across different songs.
I could think of some different ways to do this, e.g.:
1. Go through all PCM samples (assume floating point from -1 to 1) and select m = max(abs(sample)). Then apply the factor 1/m to all the PCM samples. This puts the peak at 1.
2. Go through the PCM stream and, for each position, take a Hanning window of some width around it and calculate the average of the absolute samples. From those values, pick the maximum and normalize everything to it.
3. The same as 2, but with some other way of computing an averaged value.
Options 2 and 3 have the disadvantage that I might need some clipping and thus lose some quality. By normalizing to 0.95 or so instead of 1, I could maybe avoid this to some degree. But I think 2 and 3 have the advantage that the result might feel like the more natural normalization to the user. Wikipedia also has some information about this and mentions RMS, ReplayGain and EBU R128 as ways to measure the loudness of a song.
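To make the first two ideas concrete, here is a minimal sketch in plain Python. It assumes mono floating-point samples in [-1, 1] held in a list. The function names, the default window width and hop, and the target values are all illustrative assumptions, not taken from any particular player. Instead of hard clipping, the level-based variant simply limits the gain so the peak stays under a ceiling.

```python
import math

def peak_normalize(samples, target_peak=0.95):
    """Scale samples so that max(abs(sample)) equals target_peak."""
    peak = max((abs(s) for s in samples), default=0.0)
    if peak == 0.0:
        return list(samples)  # pure silence: nothing to scale
    gain = target_peak / peak
    return [s * gain for s in samples]

def max_windowed_level(samples, width=1024, hop=512):
    """Maximum Hann-weighted mean of abs(sample).

    Simplification: windows are evaluated every `hop` samples,
    not at every single position.
    """
    if width < 2:
        raise ValueError("width must be at least 2")
    window = [0.5 - 0.5 * math.cos(2.0 * math.pi * n / (width - 1))
              for n in range(width)]
    best = 0.0
    last_start = max(0, len(samples) - width)
    for start in range(0, last_start + 1, hop):
        chunk = samples[start:start + width]
        weights = window[:len(chunk)]  # shorter only if input < width
        total = sum(weights)
        if total > 0.0:
            level = sum(w * abs(s) for w, s in zip(weights, chunk)) / total
            best = max(best, level)
    return best

def level_normalize(samples, target_level=0.25, ceiling=0.95):
    """Bring the loudest window to target_level without exceeding ceiling.

    The gain is capped at ceiling / peak, so no sample is ever clipped;
    the price is that very peaky material ends up quieter than the target.
    """
    level = max_windowed_level(samples)
    if level == 0.0:
        return list(samples)
    peak = max(abs(s) for s in samples)
    gain = min(target_level / level, ceiling / peak)
    return [s * gain for s in samples]
```

Capping the gain trades loudness consistency for fidelity: a track with one sharp transient will play quieter than its neighbours rather than distort.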
How are other popular music players (like iTunes or so) doing this?
Answer 1:
iTunes uses the Sound Check technology. "Sound Check is a proprietary Apple Inc. technology similar in function to ReplayGain. It is available in iTunes and on the iPod." (from Wikipedia) Since it is proprietary, it is not a solution for me.
It seems that ReplayGain is the most common technique. The algorithm is explained here. Sample implementations are mp3gain (GPL) and ffmpeg-replaygain (GPL, derived from mp3gain). I now have my own implementation in my MusicPlayer project (BSD license).
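Once a ReplayGain analysis has produced a gain value in dB and a peak value for a track, applying it at playback time is a small calculation. The sketch below is only an illustration of that last step, not the analysis itself. The function name `replaygain_scale` and its parameters are hypothetical, and limiting the scale factor by the stored peak is one common way to prevent clipping, assumed here rather than taken from any of the implementations above.

```python
def replaygain_scale(gain_db, peak=None, preamp_db=0.0, prevent_clipping=True):
    """Convert a gain in dB to a linear factor for the PCM samples.

    gain_db:   track or album gain in dB (hypothetical input, e.g. from tags)
    peak:      largest absolute sample value of the track, if known
    preamp_db: optional user offset in dB
    """
    # Amplitude ratio: a change of 20 dB is a factor of 10.
    scale = 10.0 ** ((gain_db + preamp_db) / 20.0)
    if prevent_clipping and peak:
        # Keep peak * scale at or below full scale (1.0).
        scale = min(scale, 1.0 / peak)
    return scale

def apply_gain(samples, scale):
    """Multiply every sample by the linear scale factor."""
    return [s * scale for s in samples]
```

For example, a gain of -6 dB gives a factor of roughly 0.5, while a gain of +6 dB on a track whose peak is 0.8 gets limited to 1.25 so the loudest sample lands exactly at full scale.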
See also these projects and references:
- http://sox.sourceforge.net/
- http://r128gain.sourceforge.net/
- official ReplayGain homepage
- official ReplayGain 1.0 specification
- Wikipedia: ReplayGain
Source: https://stackoverflow.com/questions/12481524/audio-volume-normalization