Question
Currently, I'm doing a little test project to see if I can get samples from an AVAssetReader to play back using an AudioQueue on iOS.
I've read this: (Play raw uncompressed sound with AudioQueue, no sound) and this: (How to correctly read decoded PCM samples on iOS using AVAssetReader -- currently incorrect decoding),
which both actually did help. Before reading them, I was getting no sound at all. Now I'm getting sound, but the audio is playing SUPER fast. This is my first foray into audio programming, so any help is greatly appreciated.
I initialize the reader thusly:
NSDictionary * outputSettings = [NSDictionary dictionaryWithObjectsAndKeys:
[NSNumber numberWithInt:kAudioFormatLinearPCM], AVFormatIDKey,
[NSNumber numberWithFloat:44100.0], AVSampleRateKey,
[NSNumber numberWithInt:2], AVNumberOfChannelsKey,
[NSNumber numberWithInt:16], AVLinearPCMBitDepthKey,
[NSNumber numberWithBool:NO], AVLinearPCMIsNonInterleaved,
[NSNumber numberWithBool:NO], AVLinearPCMIsFloatKey,
[NSNumber numberWithBool:NO], AVLinearPCMIsBigEndianKey,
nil];
output = [[AVAssetReaderAudioMixOutput alloc] initWithAudioTracks:uasset.tracks audioSettings:outputSettings];
[reader addOutput:output];
...
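For context, the reader creation (which happens before the code above) and the kickoff are not shown; a rough sketch of that part, with error handling trimmed:
// rough sketch of the elided setup: uasset is the asset being read,
// reader is the AVAssetReader the output above gets added to
NSError *error = nil;
reader = [[AVAssetReader alloc] initWithAsset:uasset error:&error];
// ... build outputSettings and add the output as shown above ...
[reader startReading];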
And I grab the data thusly:
CMSampleBufferRef ref= [output copyNextSampleBuffer];
// NSLog(@"%@",ref);
if(ref==NULL)
return;
//copy data to file
//read next one
AudioBufferList audioBufferList;
NSMutableData *data = [NSMutableData data];
CMBlockBufferRef blockBuffer;
CMSampleBufferGetAudioBufferListWithRetainedBlockBuffer(ref, NULL, &audioBufferList, sizeof(audioBufferList), NULL, NULL, 0, &blockBuffer);
// NSLog(@"%@",blockBuffer);
if(blockBuffer==NULL)
{
//ref was obtained via copyNextSampleBuffer, so release it before bailing
CFRelease(ref);
return;
}
//audioBufferList is a stack struct, so &audioBufferList can never be NULL;
//check whether any buffers actually came back instead
if(audioBufferList.mNumberBuffers==0)
{
CFRelease(blockBuffer);
CFRelease(ref);
return;
}
//stash data in same object
for( int y=0; y<audioBufferList.mNumberBuffers; y++ )
{
// NSData* throwData;
AudioBuffer audioBuffer = audioBufferList.mBuffers[y];
[self.delegate streamer:self didGetAudioBuffer:audioBuffer];
/*
Float32 *frame = (Float32*)audioBuffer.mData;
throwData = [NSData dataWithBytes:audioBuffer.mData length:audioBuffer.mDataByteSize];
[self.delegate streamer:self didGetAudioBuffer:throwData];
[data appendBytes:audioBuffer.mData length:audioBuffer.mDataByteSize];
*/
}
//ref and blockBuffer come back retained ("copy"/"Retained" in the function names),
//so release them once the delegate has copied what it needs (assumed synchronous),
//otherwise this leaks a sample buffer and a block buffer on every pass
CFRelease(blockBuffer);
CFRelease(ref);
which eventually brings us to the audio queue, set up in this way:
//Apple's own code for canonical PCM
audioDesc.mSampleRate = 44100.0;
audioDesc.mFormatID = kAudioFormatLinearPCM;
audioDesc.mFormatFlags = kAudioFormatFlagsAudioUnitCanonical;
audioDesc.mBytesPerPacket = 2 * sizeof (AudioUnitSampleType); // 8
audioDesc.mFramesPerPacket = 1;
audioDesc.mBytesPerFrame = 1 * sizeof (AudioUnitSampleType); // 4
audioDesc.mChannelsPerFrame = 2;
audioDesc.mBitsPerChannel = 8 * sizeof (AudioUnitSampleType); // 32
err = AudioQueueNewOutput(&audioDesc, handler_OSStreamingAudio_queueOutput, self, NULL, NULL, 0, &audioQueue);
if(err){
#pragma warning handle error
//never errs, am using breakpoint to check
return;
}
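The buffer allocation and queue start are not shown above; roughly, that part looks like this (kNumAQBufs is just a placeholder name for however many buffers I actually use):
// rough sketch of the part not shown: allocate the queue buffers and start the queue
for (int i = 0; i < kNumAQBufs; i++) {
    err = AudioQueueAllocateBuffer(audioQueue, kAQDefaultBufSize, &audioQueueBuffer[i]);
    if (err) { /* handle error */ }
}
err = AudioQueueStart(audioQueue, NULL);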
and we enqueue thusly:
while (inNumberBytes)
{
size_t bufSpaceRemaining = kAQDefaultBufSize - bytesFilled;
if (bufSpaceRemaining < inNumberBytes)
{
AudioQueueBufferRef fillBuf = audioQueueBuffer[fillBufferIndex];
fillBuf->mAudioDataByteSize = bytesFilled;
err = AudioQueueEnqueueBuffer(audioQueue, fillBuf, 0, NULL);
}
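// NB: nothing here resets bytesFilled or advances fillBufferIndex after the
// enqueue, so the memcpy below keeps writing into the same, already-full buffer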
bufSpaceRemaining = kAQDefaultBufSize - bytesFilled;
size_t copySize;
if (bufSpaceRemaining < inNumberBytes)
{
copySize = bufSpaceRemaining;
}
else
{
copySize = inNumberBytes;
}
if (bytesFilled > packetBufferSize)
{
return;
}
AudioQueueBufferRef fillBuf = audioQueueBuffer[fillBufferIndex];
memcpy((char*)fillBuf->mAudioData + bytesFilled, (const char*)(inInputData + offset), copySize);
bytesFilled += copySize;
packetsFilled = 0;
inNumberBytes -= copySize;
offset += copySize;
}
I tried to be as code inclusive as possible so as to make it easy for everyone to point out where I'm being a moron. That being said, I have a feeling my problem occurs either in the output settings declaration of the track reader or in the actual declaration of the AudioQueue (where I describe to the queue what kind of audio I'm going to be sending it). The fact of the matter is, I don't really know mathematically how to actually generate those numbers (bytes per packet, frames per packet, what have you). An explanation of that would be greatly appreciated, and thanks for the help in advance.
Answer 1:
Not sure how much of an answer this is, but there is too much text and too many links for a comment, and hopefully it will help (maybe guide you toward your answer).
First off, I know from my current project that adjusting the sample rate affects the speed of the sound, so you can try playing with those settings. But 44.1 kHz is what I see in most default implementations, including Apple's SpeakHere example. However, I would spend some time comparing your code to that example, because there are quite a few differences, like checking before enqueueing.
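To make the speed connection concrete (a rough back-of-the-envelope, and assuming the reader really is delivering the 16-bit interleaved stereo requested in the output settings): 16-bit stereo is 4 bytes per frame, while two 32-bit canonical (8.24 fixed-point) samples are 8 bytes, so the same byte count holds only half as many frames. Whenever the queue's idea of a frame is bigger than the frames actually in the data, it burns through a buffer too quickly, which is one way a format mismatch can come out sounding like fast (and usually garbled) playback.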
First, check out this posting: https://stackoverflow.com/a/4299665/530933 (it talks about how you need to know the audio format, specifically how many bytes are in a frame, and how to cast appropriately).
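If it helps, here is a rough, untested sketch of pulling the ASBD off one of the sample buffers you already get from copyNextSampleBuffer, so you can see exactly what the reader hands back (the identities in the comments are the general linear-PCM bookkeeping rules, not anything specific to your project):
#import <CoreMedia/CoreMedia.h>

// rough sketch (untested): ref is the CMSampleBufferRef from copyNextSampleBuffer
CMFormatDescriptionRef fmt = CMSampleBufferGetFormatDescription(ref);
const AudioStreamBasicDescription *asbd = CMAudioFormatDescriptionGetStreamBasicDescription(fmt);
if (asbd != NULL) {
    // for interleaved linear PCM these should hold:
    //   mBytesPerFrame  == mChannelsPerFrame * (mBitsPerChannel / 8)
    //   mBytesPerPacket == mFramesPerPacket  * mBytesPerFrame
    NSLog(@"rate %.0f, channels %u, bits %u, bytes/frame %u, bytes/packet %u",
          asbd->mSampleRate,
          (unsigned)asbd->mChannelsPerFrame,
          (unsigned)asbd->mBitsPerChannel,
          (unsigned)asbd->mBytesPerFrame,
          (unsigned)asbd->mBytesPerPacket);
}
The idea is just to line those numbers up against the audioDesc you pass to AudioQueueNewOutput.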
Also, good luck. I have posted quite a few questions here, on the Apple forums, and on the iOS forum (not the official one), with very few responses and very little help. To get where I am today (audio recording and streaming in ulaw) I ended up having to open an Apple Developer Support ticket, which, prior to tackling audio, I never knew existed. One good thing is that if you have a valid dev account you get two incidents for free! CoreAudio is not fun. The documentation is sparse, and besides SpeakHere there are not many examples. One thing I did find is that the framework headers have some good info, and so does this book. Unfortunately I have only just started the book, otherwise I might be able to help you further.
You can also check some of my own postings, which I have tried to answer to the best of my ability. This is my main audio question, which I have spent a lot of time on, compiling all the pertinent links and code:
using AQRecorder (audioqueue recorder example) in an objective c class
trying to use AVAssetWriter for ulaw audio (2)
Answer 2:
For some reason, even though every example I've seen of the audio queue using LPCM had
ASBD.mBitsPerChannel = 8* sizeof (AudioUnitSampleType);
For me it turns out I needed
ASBD.mBitsPerChannel = 2*bytesPerSample;
as part of this description:
ASBD.mFormatID = kAudioFormatLinearPCM;
ASBD.mFormatFlags = kAudioFormatFlagsAudioUnitCanonical;
ASBD.mBytesPerPacket = bytesPerSample;
ASBD.mBytesPerFrame = bytesPerSample;
ASBD.mFramesPerPacket = 1;
ASBD.mBitsPerChannel = 2*bytesPerSample;
ASBD.mChannelsPerFrame = 2;
ASBD.mSampleRate = 48000;
I have no idea why this works, which bothers me a great deal... but hopefully I can figure it all out eventually.
If anyone can explain to me why this works, I'd be very thankful.
Source: https://stackoverflow.com/questions/11398997/avassetreader-to-audioqueuebuffer