I'm looking to change the speed of a sound file, but am at a loss as to how to accomplish it. I'm assuming that some type of interpolation has to take place when slowing it down, but am unsure how to speed it up. Perhaps by averaging several samples? Whether it changes the tempo or the pitch doesn't really matter at the moment. I'd like to learn how to do both eventually, but would be happy to get either one working to begin with.
If anyone has any references to the math behind these types of operations, they would be greatly appreciated!
Thanks, Ben
There are two options to speed up the playback of a sound file:
- Increase the sample rate
- Reduce the number of samples per unit of time.
In either of these methods, the increase in playback speed will have a corresponding change in the pitch of the sound.
Increasing the sample rate
Increasing the sample rate will increase the playback speed of the sound. For example, going from a 22 kHz sampling rate to 44 kHz will make the sound play back twice as fast as the original (and one octave higher). In this method, the original sample data is unaltered: only the audio playback settings need to be changed.
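As a minimal sketch of this idea, assuming the sound is an uncompressed WAV file, the snippet below uses Python's standard wave module to copy the sample data untouched and rewrite only the frame rate stored in the header. The function name change_playback_rate is made up for illustration.

```python
import io
import wave

def change_playback_rate(wav_bytes, factor):
    """Return a copy of a WAV file whose declared frame rate is scaled by `factor`.

    The sample data itself is copied unchanged.
    """
    with wave.open(io.BytesIO(wav_bytes), "rb") as src:
        params = src.getparams()
        frames = src.readframes(params.nframes)

    out = io.BytesIO()
    with wave.open(out, "wb") as dst:
        dst.setnchannels(params.nchannels)
        dst.setsampwidth(params.sampwidth)
        dst.setframerate(int(params.framerate * factor))
        dst.writeframes(frames)
    return out.getvalue()
```

Calling change_playback_rate(data, 2) on a 22050 Hz file yields a file with the same frames marked as 44100 Hz, so a player should get through it in half the time.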
Reduce the number of samples per unit of time
In this method, the playback sampling rate is kept constant, but the number of samples is reduced: some of the samples are thrown out.
The naive approach to make the sound play at twice the speed of the original is to remove every other sample, and play the result back at the original sampling rate.
However, with this approach, some of the information will be lost, and I would expect that some artifacts will be introduced to the audio, so it's not the most desirable approach.
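A small sketch of the naive approach, assuming the samples are held in a plain Python list (drop_every_other is a hypothetical name). The second example shows one kind of artifact this can cause, known as aliasing.

```python
def drop_every_other(samples):
    """Keep the samples at even indices and discard the rest (no filtering)."""
    return samples[::2]

ramp = [0.0, 0.1, 0.2, 0.3, 0.4, 0.5]
print(drop_every_other(ramp))          # [0.0, 0.2, 0.4]

# A tone at the highest representable frequency alternates between +1 and -1.
# After dropping every other sample only the +1 values survive, so the tone
# turns into a constant offset: an extreme case of aliasing.
fastest_tone = [1.0, -1.0] * 4
print(drop_every_other(fastest_tone))  # [1.0, 1.0, 1.0, 1.0]
```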
Although I haven't tried it myself, averaging neighboring samples to create each new sample seems like a good approach to start with. Rather than just throwing out audio information, the averaging process "preserves" it to an extent.
As a rough idea of the process, here's a short piece of Python-style code to double the speed of playback:

    original_samples = [0, 0.1, 0.2, 0.3, 0.4, 0.5]

    def faster(samples):
        """Halve the sample count by averaging each adjacent pair."""
        new_samples = []
        # Visit the pairs (0, 1), (2, 3), ...; a trailing unpaired sample is dropped.
        for i in range(0, len(samples) - 1, 2):
            new_samples.append(0.5 * (samples[i] + samples[i + 1]))
        return new_samples

    faster_samples = faster(original_samples)
I've also posted an answer to the question "Getting started with programmatic audio", where I went into some detail about basic audio manipulation that can be performed, so that might be of interest as well.
There is a good explanation of sample rate conversion on Wikipedia. Basically, you upsample your signal to the least common multiple of the two sample rates, filter out any frequencies that don't fit in the target sample rate (or didn't come from the source), and pick new samples at the target sample rate. There are mathematical tricks that make the computation take drastically fewer resources (polyphase decomposition), but this should get you started.
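The three steps can be sketched in plain Python as follows. This is an unoptimized illustration under some assumptions of mine: integer sample rates, a Hann-windowed sinc as the low-pass filter, and made-up names (lowpass_kernel, resample, taps_per_side). It does not use the polyphase trick.

```python
import math

def lowpass_kernel(cutoff, half_width):
    """Hann-windowed sinc low-pass filter; `cutoff` is in cycles per sample."""
    taps = []
    for n in range(-half_width, half_width + 1):
        x = 2 * cutoff * n
        sinc = 1.0 if n == 0 else math.sin(math.pi * x) / (math.pi * x)
        hann = 0.5 * (1 + math.cos(math.pi * n / (half_width + 1)))
        taps.append(2 * cutoff * sinc * hann)
    return taps

def resample(samples, src_rate, dst_rate, taps_per_side=16):
    """Convert `samples` from src_rate to dst_rate (both integers, in Hz)."""
    common = src_rate * dst_rate // math.gcd(src_rate, dst_rate)  # LCM
    up = common // src_rate     # zeros-plus-sample block size when upsampling
    down = common // dst_rate   # keep one sample out of every `down`

    # 1. Move to the common rate by inserting up - 1 zeros after each sample.
    stuffed = [0.0] * (len(samples) * up)
    stuffed[::up] = samples

    # 2. Low-pass below the lower of the two Nyquist frequencies.
    half_width = taps_per_side * max(up, down)
    kernel = lowpass_kernel(0.5 / max(up, down), half_width)

    # 3. Evaluate the filter only at the positions kept for the target rate.
    out = []
    for center in range(0, len(stuffed), down):
        acc = 0.0
        for k, tap in enumerate(kernel):
            j = center + k - half_width
            if 0 <= j < len(stuffed):
                acc += tap * stuffed[j]
        out.append(up * acc)  # gain of `up` makes up for the inserted zeros
    return out
```

For example, resample(samples, 48000, 32000) returns two thirds as many samples. Played back at the original 48000 Hz, the sound runs 1.5 times faster, with a correspondingly higher pitch.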
Source: https://stackoverflow.com/questions/939904/changing-speed-of-a-sound-file