I\'m trying to record segments of audio and recombine them without producing a gap in audio.
The eventual goal is to also have video, but I\'ve found that audio itself c
With packet formats like AAC you have silent priming frames (a.k.a encoder delay) at the beginning and remainder frames at the end (when your audio length is not a multiple of the packet size). In your case it's 2112 of them at the beginning of every file. Priming and remainder frames break the possibility of concatenating the files without transcoding them, so you can't really blame ffmpeg -c copy
for not producing seamless output.
I'm not sure where this leaves you with video - obviously audio is synced to the video, even in the presence of priming frames.
It all depends on how you intend to concatenate the final audio (and eventually video). If you're doing it yourself using AVFoundation
, then you can detect and account for priming/remainder frames using
CMGetAttachment(buffer, kCMSampleBufferAttachmentKey_TrimDurationAtStart, NULL)
CMGetAttachment(audioBuffer, kCMSampleBufferAttachmentKey_TrimDurationAtEnd, NULL)
As a short term solution, you can switch to a non "packetised" to get gapless, concatenatable (with ffmpeg) files.
e.g.
AVFormatIDKey: kAudioFormatAppleIMA4
, fileType: AVFileTypeAIFC
, suffix ".aifc" or
AVFormatIDKey: kAudioFormatLinearPCM
, fileType: AVFileTypeWAVE
, suffix ".wav"
p.s. you can see priming & remainder frames and packet sizes using the ubiquitous afinfo
tool.
afinfo chunk.mp4
Data format: 2 ch, 44100 Hz, 'aac ' (0x00000000) 0 bits/channel, 0 bytes/packet, 1024 frames/packet, 0 bytes/frame
...
audio 39596 valid frames + 2112 priming + 276 remainder = 41984
...