Perform Audio Analysis with FFT

Question

I've been stuck on this problem for days and have looked through nearly every related StackOverflow page. As a result, I now understand much better what an FFT is and how it works. Despite this, I'm having a great deal of difficulty implementing it in my application.

In short, I am trying to build a spectrum visualizer for my application (similar to this). From what I've gathered, I need to use the magnitudes of the sound as the heights of my bars. With this in mind, I can currently analyze an entire .caf file all at once, using the following code:

    let audioFile = try!  AVAudioFile(forReading: soundURL!)
    let frameCount = UInt32(audioFile.length)

    let buffer = AVAudioPCMBuffer(PCMFormat: audioFile.processingFormat, frameCapacity: frameCount)
    do {
        try audioFile.readIntoBuffer(buffer, frameCount:frameCount)
    } catch {

    }
    // Round down so the FFT never reads more frames than the buffer holds
    let log2n = UInt(floor(log2(Double(frameCount))))

    let bufferSize = Int(1 << log2n)

    let fftSetup = vDSP_create_fftsetup(log2n, Int32(kFFTRadix2))

    var realp = [Float](count: bufferSize/2, repeatedValue: 0)
    var imagp = [Float](count: bufferSize/2, repeatedValue: 0)
    var output = DSPSplitComplex(realp: &realp, imagp: &imagp)

    vDSP_ctoz(UnsafePointer<DSPComplex>(buffer.floatChannelData.memory), 2, &output, 1, UInt(bufferSize / 2))

    vDSP_fft_zrip(fftSetup, &output, 1, log2n, Int32(FFT_FORWARD))

    var fft = [Float](count:Int(bufferSize / 2), repeatedValue:0.0)
    let bufferOver2: vDSP_Length = vDSP_Length(bufferSize / 2)
    vDSP_zvmags(&output, 1, &fft, 1, bufferOver2)

This works fine and outputs a long array of data. However, the problem with this code is that it analyzes the entire audio file at once. What I need is to analyze the audio file as it is playing, very similar to this video: Spectrum visualizer.

So I guess my question is this: How do you perform FFT analysis while the audio is playing?

On top of this, how do I convert the output of an FFT analysis to actual heights for a bar? One of the outputs I received for an audio file using the FFT analysis code above was this: http://pastebin.com/RBLTuGx7. I used a pastebin only because the output is so long. I'm assuming I average all these numbers together and use those values instead? (For reference, I got that array by printing out the 'fft' variable in the code above.)

I've attempted reading through the EZAudio code, but I am unable to find how it reads in samples of audio in real time. Any help is greatly appreciated.

Answer 1:

Here's how it is done in AudioKit, using EZAudio's FFT tools:

Create a class for your FFT that will hold the data:

    @objc public class AKFFT: NSObject, EZAudioFFTDelegate {

        internal let bufferSize: UInt32 = 512
        internal var fft: EZAudioFFT?

        /// Array of FFT data
        public var fftData = [Double](count: 512, repeatedValue: 0.0)

    ...
    }

Initialize the class and set up the FFT. Also install the tap on the appropriate node.

    public init(_ input: AKNode) {
        super.init()
        fft = EZAudioFFT.fftWithMaximumBufferSize(vDSP_Length(bufferSize), sampleRate: 44100.0, delegate: self)
        // The tap hands over each buffer of audio as it passes through the node
        input.avAudioNode.installTapOnBus(0, bufferSize: bufferSize, format: AKManager.format) { [weak self] (buffer, time) -> Void in
            if let strongSelf = self {
                // Analyze only the last bufferSize frames of the buffer's capacity
                buffer.frameLength = strongSelf.bufferSize
                let offset: Int = Int(buffer.frameCapacity - buffer.frameLength)
                let tail = buffer.floatChannelData[0]
                strongSelf.fft!.computeFFTWithBuffer(&tail[offset], withBufferSize: strongSelf.bufferSize)
            }
        }
    }
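To label or group the bars by frequency, each FFT bin can be mapped back to Hz. The following is a minimal sketch that assumes the 512-sample buffer and 44100 Hz sample rate used above; the helper name `frequency(ofBin:bufferSize:sampleRate:)` is illustrative and is not part of EZAudio or AudioKit. If the tapped format uses a different sample rate, pass that value instead.

```swift
/// Frequency in Hz represented by FFT bin `bin`, for an FFT over
/// `bufferSize` samples recorded at `sampleRate` (hypothetical helper).
func frequency(ofBin bin: Int, bufferSize: Int, sampleRate: Double) -> Double {
    return Double(bin) * sampleRate / Double(bufferSize)
}

// Spacing between neighbouring bins for the values assumed above (about 86.13 Hz).
let binWidth = frequency(ofBin: 1, bufferSize: 512, sampleRate: 44100.0)

// For real-valued input, only bins below bufferSize / 2 carry distinct
// information; bin bufferSize / 2 sits at the Nyquist frequency.
let nyquist = frequency(ofBin: 256, bufferSize: 512, sampleRate: 44100.0)
```

A larger buffer gives finer frequency resolution, but each analysis then covers a longer stretch of audio, so the bars react more slowly.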

Then implement the callback to load your internal fftData array:

    @objc public func fft(fft: EZAudioFFT!, updatedWithFFTData fftData: UnsafeMutablePointer<Float>, bufferSize: vDSP_Length) {
        // Copy the results on the main queue so that UI code can read fftData
        dispatch_async(dispatch_get_main_queue()) { () -> Void in
            for i in 0...511 {
                self.fftData[i] = Double(fftData[i])
            }
        }
    }
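As for turning the values in fftData into bar heights, one common approach is to average neighbouring bins into a small number of bands and put the result on a decibel scale. The sketch below is only one way to do it and makes several assumptions: the function name `barHeights`, the -60 dB display floor, and the idea that a magnitude of 1.0 counts as full scale are all illustrative choices, not something EZAudio or AudioKit prescribes. It also assumes linear magnitudes; for squared magnitudes, such as the output of vDSP_zvmags, use 10 * log10 instead of 20 * log10.

```swift
import Foundation

/// Reduces FFT magnitudes to `barCount` bar heights in the range 0...1.
/// `floorDB` is an assumed display floor: anything quieter maps to 0.
func barHeights(magnitudes: [Double], barCount: Int, floorDB: Double = -60.0) -> [Double] {
    guard barCount > 0, !magnitudes.isEmpty else { return [] }
    // Equal-width bands; leftover bins at the top end are ignored.
    let binsPerBar = max(1, magnitudes.count / barCount)
    var heights = [Double]()
    heights.reserveCapacity(barCount)
    for bar in 0..<barCount {
        let start = bar * binsPerBar
        // More bars than bins: the remaining bars stay empty.
        guard start < magnitudes.count else { heights.append(0); continue }
        let end = min(start + binsPerBar, magnitudes.count)
        let mean = magnitudes[start..<end].reduce(0, +) / Double(end - start)
        // Convert to decibels, clamping to avoid log10(0).
        let db = 20 * log10(max(mean, 1e-12))
        // Map floorDB...0 dB onto 0...1 and clamp anything outside.
        let normalized = (db - floorDB) / -floorDB
        heights.append(min(max(normalized, 0), 1))
    }
    return heights
}
```

Calling `barHeights(magnitudes: fftData, barCount: 32)` on the 512-element array would average 16 bins per bar; multiplying each result by the height of the view gives the bar height in points.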

AudioKit's implementation may change so you should check https://github.com/audiokit/AudioKit/ to see if any improvements were made. EZAudio is at https://github.com/syedhali/EZAudio

Source: https://stackoverflow.com/questions/34712707/perform-audio-analysis-with-fft

Tags

ios

swift

fft