OS X / iOS - Sample rate conversion for a buffer using AudioConverterFillComplexBuffer

Asked by 佛祖请我去吃肉, 2021-02-01 10:40

I'm writing a CoreAudio backend for an audio library called XAL. Input buffers can be of various sample rates. I'm using a single audio unit for output. The idea is to convert the …

1 Answer
    一向 (original poster)
    2021-02-01 11:08

    Working code for Core Audio sample rate conversion and channel count conversion, using Audio Converter Services (now available as a part of the BSD-licensed XAL audio library):

    void CoreAudio_AudioManager::_convertStream(Buffer* buffer, unsigned char** stream, int *streamSize)
    {
        if (buffer->getBitsPerSample() != unitDescription.mBitsPerChannel || 
            buffer->getChannels() != unitDescription.mChannelsPerFrame || 
            buffer->getSamplingRate() != unitDescription.mSampleRate)
        {
            // describe the input format
            AudioStreamBasicDescription inputDescription;
            memset(&inputDescription, 0, sizeof(inputDescription));
            inputDescription.mFormatID = kAudioFormatLinearPCM;
            inputDescription.mFormatFlags = kLinearPCMFormatFlagIsPacked | kLinearPCMFormatFlagIsSignedInteger;
            inputDescription.mChannelsPerFrame = buffer->getChannels();
            inputDescription.mSampleRate = buffer->getSamplingRate();
            inputDescription.mBitsPerChannel = buffer->getBitsPerSample();
            inputDescription.mBytesPerFrame = (inputDescription.mBitsPerChannel * inputDescription.mChannelsPerFrame) / 8;
            inputDescription.mFramesPerPacket = 1; //*streamSize / inputDescription.mBytesPerFrame;
            inputDescription.mBytesPerPacket = inputDescription.mBytesPerFrame * inputDescription.mFramesPerPacket;
    
            // copy conversion output format's description from the
            // output audio unit's description.
            // then adjust framesPerPacket to match the input we'll be passing.
    
            // framecount of our input stream is based on the input bytecount.
            // output stream will have same number of frames, but different
            // number of bytes.
            AudioStreamBasicDescription outputDescription = unitDescription;
            outputDescription.mFramesPerPacket = 1; //inputDescription.mFramesPerPacket;
            outputDescription.mBytesPerPacket = outputDescription.mBytesPerFrame * outputDescription.mFramesPerPacket;
    
            // create an audio converter
            AudioConverterRef audioConverter;
            OSStatus acCreationResult = AudioConverterNew(&inputDescription, &outputDescription, &audioConverter);
            if (acCreationResult != noErr || !audioConverter)
            {
                // bail out
                free(*stream);
                *streamSize = 0;
                *stream = (unsigned char*)malloc(0);
                return;
            }
    
            // calculate number of bytes required for output of input stream.
            // allocate buffer of adequate size.
            UInt32 outputBytes = outputDescription.mBytesPerPacket * (*streamSize / inputDescription.mBytesPerPacket); // outputDescription.mFramesPerPacket * outputDescription.mBytesPerFrame;
            unsigned char *outputBuffer = (unsigned char*)malloc(outputBytes);
            memset(outputBuffer, 0, outputBytes);
    
            // describe input data we'll pass into converter
            AudioBuffer inputBuffer;
            inputBuffer.mNumberChannels = inputDescription.mChannelsPerFrame;
            inputBuffer.mDataByteSize = *streamSize;
            inputBuffer.mData = *stream;
    
            // describe output data buffers into which we can receive data.
            AudioBufferList outputBufferList;
            outputBufferList.mNumberBuffers = 1;
            outputBufferList.mBuffers[0].mNumberChannels = outputDescription.mChannelsPerFrame;
            outputBufferList.mBuffers[0].mDataByteSize = outputBytes;
            outputBufferList.mBuffers[0].mData = outputBuffer;
    
            // set output data packet size
            UInt32 outputDataPacketSize = outputBytes / outputDescription.mBytesPerPacket;
    
            // fill class members with data that we'll pass into
            // the InputDataProc
            _converter_currentBuffer = &inputBuffer;
            _converter_currentInputDescription = inputDescription;
    
            // convert
            OSStatus result = AudioConverterFillComplexBuffer(audioConverter, /* AudioConverterRef inAudioConverter */
                                                              CoreAudio_AudioManager::_converterComplexInputDataProc, /* AudioConverterComplexInputDataProc inInputDataProc */
                                                              this, /* void *inInputDataProcUserData */
                                                              &outputDataPacketSize, /* UInt32 *ioOutputDataPacketSize */
                                                              &outputBufferList, /* AudioBufferList *outOutputData */
                                                              NULL /* AudioStreamPacketDescription *outPacketDescription */
                                                              );
    
            // change "stream" to describe our output buffer.
            // even if an error occurred, we'd rather have silence than unconverted audio.
            free(*stream);
            *stream = outputBuffer;
            *streamSize = outputBytes;
    
            // dispose of the audio converter
            AudioConverterDispose(audioConverter);
        }
    }
    
    
    OSStatus CoreAudio_AudioManager::_converterComplexInputDataProc(AudioConverterRef inAudioConverter,
                                                                    UInt32* ioNumberDataPackets,
                                                                    AudioBufferList* ioData,
                                                                    AudioStreamPacketDescription** ioDataPacketDescription,
                                                                    void* inUserData)
    {
        if(ioDataPacketDescription)
        {
            xal::log("_converterComplexInputDataProc cannot provide input data; it doesn't know how to provide packet descriptions");
            *ioDataPacketDescription = NULL;
            *ioNumberDataPackets = 0;
            ioData->mNumberBuffers = 0;
            return 501;
        }
    
        CoreAudio_AudioManager *self = (CoreAudio_AudioManager*)inUserData;
    
        ioData->mNumberBuffers = 1;
        ioData->mBuffers[0] = *(self->_converter_currentBuffer);
    
        *ioNumberDataPackets = ioData->mBuffers[0].mDataByteSize / self->_converter_currentInputDescription.mBytesPerPacket;
        return 0;
    }
    

    In the header, as part of the CoreAudio_AudioManager class, here are relevant instance variables:

        AudioStreamBasicDescription unitDescription;
        AudioBuffer *_converter_currentBuffer;
        AudioStreamBasicDescription _converter_currentInputDescription;
    

    A few months later, looking back at this, I realized I never documented the changes.

    If you are interested in what the changes were:

    • look at the callback function CoreAudio_AudioManager::_converterComplexInputDataProc
    • the callback has to report, via ioNumberDataPackets, how many packets of input data it is supplying
    • this required new instance variables to hold both the buffer (previously passed via inUserData) and the input description (used to calculate the number of packets fed into Core Audio's converter)
    • that packet count is computed from the amount of data the callback received and the input format's bytes-per-packet value

    Hopefully this edit will help a future reader (myself included)!
