Sound capture with OpenAL on iOS

Submitted by 痴心易碎 on 2019-12-21 18:29:28

Question


I am trying to capture sound on iOS using OpenAL (I am writing a cross-platform library, which is why I avoid iOS-specific ways to record sound). Out of the box, OpenAL capture does not work, but there is a known workaround: open an output context before starting capture. This solution worked for me on iOS 5.0.

However, on iOS 5.1.1, the workaround only helps for the first sample I try to record. (I switch my AudioSession to PlayAndRecord before starting capture and open the default output device. After recording my sample, I close the device and switch the session back to whatever it was.) For the second sample, re-opening the output context does not help and no sound is captured.

Is there a known way to deal with this problem?

// Here's what I do before starting the recording
AVAudioSession *audioSession = [AVAudioSession sharedInstance];
oldAudioSessionCategory = [audioSession category];
[audioSession setCategory:AVAudioSessionCategoryPlayAndRecord error:nil];
[audioSession setActive:YES error:nil];
// We need to have an active context. If there is none, create one.
if (!alcGetCurrentContext()) {
    outputDevice = alcOpenDevice(NULL);
    outputContext = alcCreateContext(outputDevice, NULL);
    alcMakeContextCurrent(outputContext);
}

// Capture itself
inputDevice = alcCaptureOpenDevice(NULL, frequency, FORMAT, bufferSize);
....
alcCaptureCloseDevice(inputDevice);

// Restoring the audio state to whatever it had been before capture
if (outputContext) {
    alcDestroyContext(outputContext);
    alcCloseDevice(outputDevice);
}
[[AVAudioSession sharedInstance] setCategory:oldAudioSessionCategory
                                       error:nil];

Answer 1:


Here is the code I use to emulate the capture extension. Some comments:

  1. In the project as a whole, OpenKD is used for, e.g., threading primitives. You'll probably need to replace these calls.
  2. I had to fight latency when starting the capture. As a result, I keep reading sound input constantly and throw it away when it is not needed. (Such a solution has been proposed elsewhere.) This, in turn, requires catching onResignActive notifications in order to release control of the mic. You may or may not want to use such a kludge.
  3. Instead of alcGetIntegerv(device, ALC_CAPTURE_SAMPLES, 1, &res), I have to define a separate function, alcGetAvailableSamples.

In short, this code is unlikely to be usable in your project as-is, but hopefully you can tweak it to your needs.

#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>   // malloc, free
#include <string.h>   // memcpy, memmove
#include <KD/kd.h>
#include <AL/al.h>
#include <AL/alc.h>

#include <AudioToolbox/AudioToolbox.h>
#import <Foundation/Foundation.h>
#import <UIKit/UIKit.h>

#include "KD/kdext.h"

struct InputDeviceData {
    int id;
    KDThreadMutex *mutex;
    AudioUnit audioUnit;
    int nChannels;
    int frequency;
    ALCenum format;
    int sampleSize;
    uint8_t *buf;
    size_t bufSize;    // in bytes
    size_t bufFilledBytes;  // in bytes
    bool started;
};

static struct InputDeviceData *cachedInData = NULL;

static OSStatus renderCallback (void                        *inRefCon,
                                AudioUnitRenderActionFlags  *ioActionFlags,
                                const AudioTimeStamp        *inTimeStamp,
                                UInt32                      inBusNumber,
                                UInt32                      inNumberFrames,
                                AudioBufferList             *ioData);
static AudioUnit getAudioUnit();
static void setupNotifications();
static void destroyCachedInData();
static struct InputDeviceData *setupCachedInData(AudioUnit audioUnit, ALCuint frequency, ALCenum format, ALCsizei bufferSizeInSamples);
static struct InputDeviceData *getInputDeviceData(AudioUnit audioUnit, ALCuint frequency, ALCenum format, ALCsizei bufferSizeInSamples);

/** I only have to use NSNotificationCenter instead of CFNotificationCenter
 *  because there is no published name for WillResignActive/WillBecomeActive
 *  notifications in CoreFoundation.
 */
@interface ALCNotificationObserver : NSObject
- (void)onResignActive;
@end
@implementation ALCNotificationObserver
- (void)onResignActive {
    destroyCachedInData();
}
@end

static void setupNotifications() {
    static ALCNotificationObserver *observer = NULL;
    if (!observer) {
        observer = [[ALCNotificationObserver alloc] init];
        [[NSNotificationCenter defaultCenter] addObserver:observer selector:@selector(onResignActive) name:UIApplicationWillResignActiveNotification object:nil];
    }
}

static OSStatus renderCallback (void                        *inRefCon,
                                AudioUnitRenderActionFlags  *ioActionFlags,
                                const AudioTimeStamp        *inTimeStamp,
                                UInt32                      inBusNumber,
                                UInt32                      inNumberFrames,
                                AudioBufferList             *ioData) {
    struct InputDeviceData *inData = (struct InputDeviceData*)inRefCon;

    kdThreadMutexLock(inData->mutex);
    size_t bytesToRender = inNumberFrames * inData->sampleSize;
    if (bytesToRender + inData->bufFilledBytes <= inData->bufSize) {
        OSStatus status;
        struct AudioBufferList audioBufferList; // 1 buffer is declared inside the structure itself.
        audioBufferList.mNumberBuffers = 1;
        audioBufferList.mBuffers[0].mNumberChannels = inData->nChannels;
        audioBufferList.mBuffers[0].mDataByteSize = bytesToRender;
        audioBufferList.mBuffers[0].mData = inData->buf + inData->bufFilledBytes;
        status = AudioUnitRender(inData->audioUnit, 
                                 ioActionFlags, 
                                 inTimeStamp, 
                                 inBusNumber, 
                                 inNumberFrames, 
                                 &audioBufferList);
        if (inData->started) {
            inData->bufFilledBytes += bytesToRender;
        }
    } else {
        kdLogFormatMessage("%s: buffer overflow", __FUNCTION__);
    }
    kdThreadMutexUnlock(inData->mutex);

    return noErr;
}

static AudioUnit getAudioUnit() {
    static AudioUnit audioUnit = NULL;

    if (!audioUnit) {
        AudioComponentDescription ioUnitDescription;

        ioUnitDescription.componentType          = kAudioUnitType_Output;
        ioUnitDescription.componentSubType       = kAudioUnitSubType_VoiceProcessingIO;
        ioUnitDescription.componentManufacturer  = kAudioUnitManufacturer_Apple;
        ioUnitDescription.componentFlags         = 0;
        ioUnitDescription.componentFlagsMask     = 0;

        AudioComponent foundIoUnitReference = AudioComponentFindNext(NULL,
                                                                     &ioUnitDescription);
        AudioComponentInstanceNew(foundIoUnitReference,
                                  &audioUnit);

        if (audioUnit == NULL) {
            kdLogMessage("Could not obtain AudioUnit");
        }
    }

    return audioUnit;
}

static void destroyCachedInData() {
    OSStatus status;
    if (cachedInData) {
        status = AudioOutputUnitStop(cachedInData->audioUnit);
        status = AudioUnitUninitialize(cachedInData->audioUnit);
        free(cachedInData->buf);
        kdThreadMutexFree(cachedInData->mutex);
        free(cachedInData);
        cachedInData = NULL;
    }
}

static struct InputDeviceData *setupCachedInData(AudioUnit audioUnit, ALCuint frequency, ALCenum format, ALCsizei bufferSizeInSamples) {
    static int idCount = 0;
    OSStatus status;
    int bytesPerFrame = (format == AL_FORMAT_MONO8) ? 1 :
                        (format == AL_FORMAT_MONO16) ? 2 :
                        (format == AL_FORMAT_STEREO8) ? 2 :
                        (format == AL_FORMAT_STEREO16) ? 4 : -1;
    int channelsPerFrame = (format == AL_FORMAT_MONO8) ? 1 :
                           (format == AL_FORMAT_MONO16) ? 1 :
                           (format == AL_FORMAT_STEREO8) ? 2 :
                           (format == AL_FORMAT_STEREO16) ? 2 : -1;
    int bitsPerChannel = (format == AL_FORMAT_MONO8) ? 8 :
                         (format == AL_FORMAT_MONO16) ? 16 :
                         (format == AL_FORMAT_STEREO8) ? 8 :
                         (format == AL_FORMAT_STEREO16) ? 16 : -1;

    cachedInData = malloc(sizeof(struct InputDeviceData));
    cachedInData->id = ++idCount;
    cachedInData->format = format;
    cachedInData->frequency = frequency;
    cachedInData->mutex = kdThreadMutexCreate(NULL);
    cachedInData->audioUnit = audioUnit;
    cachedInData->nChannels = channelsPerFrame;
    cachedInData->sampleSize = bytesPerFrame;
    cachedInData->bufSize = bufferSizeInSamples * bytesPerFrame;
    cachedInData->buf = malloc(cachedInData->bufSize);
    cachedInData->bufFilledBytes = 0;
    cachedInData->started = FALSE;

    UInt32 enableInput = 1;    // enable recording on the input bus
    status = AudioUnitSetProperty(audioUnit,
                                  kAudioOutputUnitProperty_EnableIO,
                                  kAudioUnitScope_Input,
                                  1,                             // element 1 is the input bus
                                  &enableInput, sizeof(enableInput));

    struct AudioStreamBasicDescription basicDescription;
    basicDescription.mSampleRate = (Float64)frequency;
    basicDescription.mFormatID = kAudioFormatLinearPCM;
    basicDescription.mFormatFlags = kAudioFormatFlagIsSignedInteger | kAudioFormatFlagIsPacked;
    basicDescription.mBytesPerPacket = bytesPerFrame;
    basicDescription.mFramesPerPacket = 1;
    basicDescription.mBytesPerFrame = bytesPerFrame;
    basicDescription.mChannelsPerFrame = channelsPerFrame;
    basicDescription.mBitsPerChannel = bitsPerChannel;
    basicDescription.mReserved = 0;

    status = AudioUnitSetProperty(audioUnit,
                                  kAudioUnitProperty_StreamFormat, // property key
                                  kAudioUnitScope_Output,          // output side of the input bus
                                  1,                               // element 1 is the input bus
                                  &basicDescription, sizeof(basicDescription));

    AURenderCallbackStruct renderCallbackStruct;
    renderCallbackStruct.inputProc = renderCallback;
    renderCallbackStruct.inputProcRefCon = cachedInData;
    status = AudioUnitSetProperty(audioUnit,
                                  kAudioOutputUnitProperty_SetInputCallback, // property key
                                  kAudioUnitScope_Output,          // output side of the input bus
                                  1,                               // element 1 is the input bus
                                  &renderCallbackStruct, sizeof(renderCallbackStruct));

    status = AudioUnitInitialize(cachedInData->audioUnit);  // paired with AudioUnitUninitialize in destroyCachedInData
    status = AudioOutputUnitStart(cachedInData->audioUnit);

    return cachedInData;
}

static struct InputDeviceData *getInputDeviceData(AudioUnit audioUnit, ALCuint frequency, ALCenum format, ALCsizei bufferSizeInSamples) {
    if (cachedInData &&
        (cachedInData->frequency != frequency ||
         cachedInData->format != format ||
         cachedInData->bufSize / cachedInData->sampleSize != (size_t)bufferSizeInSamples)) {
        kdAssert(!cachedInData->started);
        destroyCachedInData();
    }
    if (!cachedInData) {
        setupCachedInData(audioUnit, frequency, format, bufferSizeInSamples);
        setupNotifications();
    }

    return cachedInData;
}


ALC_API ALCdevice* ALC_APIENTRY alcCaptureOpenDevice(const ALCchar *devicename, ALCuint frequency, ALCenum format, ALCsizei buffersizeInSamples) {
    kdAssert(devicename == NULL);    

    AudioUnit audioUnit = getAudioUnit();
    struct InputDeviceData *res = getInputDeviceData(audioUnit, frequency, format, buffersizeInSamples);
    return (ALCdevice*)(intptr_t)res->id;
}

ALC_API ALCboolean ALC_APIENTRY alcCaptureCloseDevice(ALCdevice *device) {
    alcCaptureStop(device);
    return true;
}

ALC_API void ALC_APIENTRY alcCaptureStart(ALCdevice *device) {
    if (!cachedInData || (int)(intptr_t)device != cachedInData->id) {
        // may happen after the app loses and regains active status.
        kdLogFormatMessage("Attempt to start a stale AL capture device");
        return;
    }
    cachedInData->started = TRUE;
}

ALC_API void ALC_APIENTRY alcCaptureStop(ALCdevice *device) {
    if (!cachedInData || (int)(intptr_t)device != cachedInData->id) {
        // may happen after the app loses and regains active status.
        kdLogFormatMessage("Attempt to stop a stale AL capture device");
        return;
    }
    cachedInData->started = FALSE;
}

ALC_API ALCint ALC_APIENTRY alcGetAvailableSamples(ALCdevice *device) {
    if (!cachedInData || (int)(intptr_t)device != cachedInData->id) {
        // may happen after the app loses and regains active status.
        kdLogFormatMessage("Attempt to get sample count from a stale AL capture device");
        return 0;
    }
    ALCint res;
    kdThreadMutexLock(cachedInData->mutex);
    res = cachedInData->bufFilledBytes / cachedInData->sampleSize;
    kdThreadMutexUnlock(cachedInData->mutex);
    return res;
}

ALC_API void ALC_APIENTRY alcCaptureSamples(ALCdevice *device, ALCvoid *buffer, ALCsizei samples) {    
    if (!cachedInData || (int)(intptr_t)device != cachedInData->id) {
        // may happen after the app loses and regains active status.
        kdLogFormatMessage("Attempt to get samples from a stale AL capture device");
        return;
    }
    size_t bytesToCapture = samples * cachedInData->sampleSize;
    kdAssert(cachedInData->started);
    kdAssert(bytesToCapture <= cachedInData->bufFilledBytes);

    kdThreadMutexLock(cachedInData->mutex);
    memcpy(buffer, cachedInData->buf, bytesToCapture);
    memmove(cachedInData->buf, cachedInData->buf + bytesToCapture, cachedInData->bufFilledBytes - bytesToCapture);
    cachedInData->bufFilledBytes -= bytesToCapture;
    kdThreadMutexUnlock(cachedInData->mutex);
}



Answer 2:


I found a way to make Apple's OpenAL work. In my original code snippet, you need to call alcMakeContextCurrent(NULL) before alcDestroyContext(outputContext).
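Applied to the snippet from the question, the teardown becomes the fragment below (same variables as in the question's code; only the alcMakeContextCurrent(NULL) call is new):

```c
// Restoring the audio state to whatever it had been before capture.
// The context must be detached before it is destroyed, otherwise the
// next capture attempt on iOS 5.1.1 silently records nothing.
if (outputContext) {
    alcMakeContextCurrent(NULL);   // detach first -- this is the fix
    alcDestroyContext(outputContext);
    alcCloseDevice(outputDevice);
}
```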



Source: https://stackoverflow.com/questions/10756972/sound-capture-with-openal-on-ios
