Seamless audio recording while flipping camera, using AVCaptureSession & AVAssetWriter

I’m looking for a way to maintain a seamless audio track while flipping between the front and back cameras. Many apps on the market can do this; Snapchat is one example.

Solutions should use AVCaptureSession and AVAssetWriter. They should explicitly not use AVMutableComposition, since there is currently a bug between AVMutableComposition and AVCaptureSession. I also can't afford any post-processing time.

Currently, when I change the video input, the audio recording skips and goes out of sync.

I’m including the code that could be relevant.

Flip Camera

-(void) updateCameraDirection:(CamDirection)vCameraDirection {
    if(session) {
        AVCaptureDeviceInput* currentInput = nil;
        AVCaptureDeviceInput* newInput = nil;
        BOOL videoMirrored = NO;
        switch (vCameraDirection) {
            case CamDirection_Front:
                currentInput = input_Back;
                newInput = input_Front;
                videoMirrored = NO;
                break;
            case CamDirection_Back:
                currentInput = input_Front;
                newInput = input_Back;
                videoMirrored = YES;
                break;
            default:
                //unknown direction, leave the session untouched
                return;
        }

        //batch the input swap into a single configuration change
        [session beginConfiguration];
        //disconnect old input
        [session removeInput:currentInput];
        //connect new input
        [session addInput:newInput];
        //get new data connection and config
        dataOutputVideoConnection = [dataOutputVideo connectionWithMediaType:AVMediaTypeVideo];
        dataOutputVideoConnection.videoOrientation = AVCaptureVideoOrientationPortrait;
        dataOutputVideoConnection.videoMirrored = videoMirrored;
        [session commitConfiguration];
    }
}

Sample Buffer

- (void)captureOutput:(AVCaptureOutput *)captureOutput didOutputSampleBuffer:(CMSampleBufferRef)sampleBuffer fromConnection:(AVCaptureConnection *)connection {
    //not active

    //start session if not started
    if(!startedSession) {
        startedSession = YES;
        [assetWriter startSessionAtSourceTime:CMSampleBufferGetPresentationTimeStamp(sampleBuffer)];
    }

    //Process sample buffers
    if (connection == dataOutputAudioConnection) {
        if([assetWriterInputAudio isReadyForMoreMediaData]) {
            BOOL success = [assetWriterInputAudio appendSampleBuffer:sampleBuffer];
            if(!success) NSLog(@"Audio append failed: %@", assetWriter.error);
        }
    } else if (connection == dataOutputVideoConnection) {
        if([assetWriterInputVideo isReadyForMoreMediaData]) {
            BOOL success = [assetWriterInputVideo appendSampleBuffer:sampleBuffer];
            if(!success) NSLog(@"Video append failed: %@", assetWriter.error);
        }
    }
}

Perhaps I need to adjust the audio sample timestamps?


Hey, I was facing the same issue and discovered that after switching cameras the next frame was pushed far out of place. This seemed to shift every frame after it, causing the video and audio to go out of sync. My solution was to shift every misplaced frame to its correct position after switching cameras.

Sorry, my answer is in Swift 4.2.

You'll have to use AVAssetWriterInputPixelBufferAdaptor in order to append the sample buffers at a specific presentation timestamp.

previousPresentationTimeStamp is the presentation timestamp of the previous frame and currentPresentationTimestamp is, as you guessed, the presentation timestamp of the current one. maxFrameDistance is the largest gap, measured in frames, that is tolerated between two consecutive frames before the current frame is moved; 1.1 worked very well in testing, but you can change it to your liking.

// Express both timestamps as (fractional) frame indices
let currentFramePosition = (Double(self.frameRate) * Double(currentPresentationTimestamp.value)) / Double(currentPresentationTimestamp.timescale)
let previousFramePosition = (Double(self.frameRate) * Double(previousPresentationTimeStamp.value)) / Double(previousPresentationTimeStamp.timescale)
var presentationTimeStamp = currentPresentationTimestamp
let maxFrameDistance = 1.1
let frameDistance = currentFramePosition - previousFramePosition
if frameDistance > maxFrameDistance {
    // The frame landed too far ahead: move it to one frame after the previous one
    let expectedFramePosition = previousFramePosition + 1.0
    //print("[mwCamera]: Frame at incorrect position moving from \(currentFramePosition) to \(expectedFramePosition)")

    // Convert the expected frame index back to a value in the original timescale
    let newFramePosition = ((expectedFramePosition) * Double(currentPresentationTimestamp.timescale)) / Double(self.frameRate)

    let newPresentationTimeStamp = CMTime.init(value: CMTimeValue(newFramePosition), timescale: currentPresentationTimestamp.timescale)

    presentationTimeStamp = newPresentationTimeStamp
}

let success = assetWriterInputPixelBufferAdator.append(pixelBuffer, withPresentationTime: presentationTimeStamp)
if !success, let error = assetWriter.error {
    print(error)
}
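To make the arithmetic easier to follow, here is a minimal, framework-free sketch of the same retiming step with a worked example. The `Timestamp` struct is a hypothetical stand-in for CMTime and `correctedTimestamp` is a name made up for this illustration; unlike the snippet above, it rounds to the nearest tick instead of truncating. How the previous timestamp is carried from one call to the next is not shown in the answer, so it is left to the caller here.

```swift
import Foundation

/// Hypothetical stand-in for CMTime so the arithmetic runs without CoreMedia.
struct Timestamp: Equatable {
    var value: Int64
    var timescale: Int32
}

/// Returns `current` unchanged, or a timestamp exactly one frame after
/// `previous` when `current` is more than `maxFrameDistance` frames later.
func correctedTimestamp(current: Timestamp,
                        previous: Timestamp,
                        frameRate: Double,
                        maxFrameDistance: Double = 1.1) -> Timestamp {
    // Express both timestamps as fractional frame indices.
    let currentPosition = frameRate * Double(current.value) / Double(current.timescale)
    let previousPosition = frameRate * Double(previous.value) / Double(previous.timescale)

    guard currentPosition - previousPosition > maxFrameDistance else {
        return current
    }

    // Snap to the slot right after the previous frame, then convert the
    // frame index back to ticks in the current timescale.
    let expectedPosition = previousPosition + 1.0
    let ticks = (expectedPosition * Double(current.timescale) / frameRate).rounded()
    return Timestamp(value: Int64(ticks), timescale: current.timescale)
}

// Worked example at 30 fps with a timescale of 600 (20 ticks per frame).
let previous = Timestamp(value: 6000, timescale: 600)  // frame 300
let delayed = Timestamp(value: 6400, timescale: 600)   // frame 320, 20 frames late
let onTime = Timestamp(value: 6020, timescale: 600)    // frame 301

let fixedDelayed = correctedTimestamp(current: delayed, previous: previous, frameRate: 30)
let fixedOnTime = correctedTimestamp(current: onTime, previous: previous, frameRate: 30)
print(fixedDelayed.value, fixedOnTime.value)  // prints "6020 6020"
```

The delayed frame is pulled back to tick 6020, the slot right after the previous frame, while the on-time frame passes through untouched.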

Also, please note that this worked because I kept the frame rate constant, so make sure you have full control of the capture device's frame rate throughout the process.
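One way to keep the frame rate fixed is to set the device's minimum and maximum frame durations to the same value. The following is only a sketch of that idea, not the code from the repo: `lockFrameRate` and `FrameRateError` are hypothetical names, and the desired rate is assumed to be supported by the device's active format.

```swift
import AVFoundation

/// Hypothetical error type for this sketch.
enum FrameRateError: Error {
    case unsupported(Int32)
}

/// Pins `device` to a single frame rate by making the minimum and maximum
/// frame durations identical.
func lockFrameRate(of device: AVCaptureDevice, to framesPerSecond: Int32) throws {
    // Refuse rates the active format cannot deliver; setting an
    // unsupported frame duration raises an exception.
    let requested = Double(framesPerSecond)
    let isSupported = device.activeFormat.videoSupportedFrameRateRanges.contains { range in
        range.minFrameRate <= requested && requested <= range.maxFrameRate
    }
    guard isSupported else { throw FrameRateError.unsupported(framesPerSecond) }

    try device.lockForConfiguration()
    defer { device.unlockForConfiguration() }

    // One frame lasts 1/framesPerSecond seconds.
    let frameDuration = CMTime(value: 1, timescale: framesPerSecond)
    device.activeVideoMinFrameDuration = frameDuration
    device.activeVideoMaxFrameDuration = frameDuration
}
```

Both the front and the back device would need the same rate. Since the frame durations can be reset when the active format or session preset changes, it may be worth applying this again after each camera flip.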

I have a repo using this logic here


I managed to find an intermediate solution for the sync problem I ran into with Woody Jean-louis's solution, using his repo.

The results are similar to what Instagram does, but it seems to work a little better. Basically, I prevent assetWriterAudioInput from appending new samples while the cameras are switching. There is no way to know exactly when the switch happens, but I observed that before and after it the captureOutput method delivers a video sample roughly every 0.02 seconds (0.04 seconds at most).

Knowing this, I created a self.lastVideoSampleDate property that is updated every time a video sample is appended to assetWriterInputPixelBufferAdator, and I only allow an audio sample to be appended to assetWriterAudioInput if less than 0.05 seconds have passed since that date.

// Audio samples: append only if a video sample arrived within the last 0.05 s
if let assetWriterAudioInput = self.assetWriterAudioInput,
    output == self.audioOutput, assetWriterAudioInput.isReadyForMoreMediaData {

    let since = Date().timeIntervalSince(self.lastVideoSampleDate)
    if since < 0.05 {
        let success = assetWriterAudioInput.append(sampleBuffer)
        if !success, let error = assetWriter.error {
            print(error)
        }
    }
}

// Video samples: append, then record when the last video sample was written
let success = assetWriterInputPixelBufferAdator.append(pixelBuffer, withPresentationTime: presentationTimeStamp)
if !success, let error = assetWriter.error {
    print(error)
}
self.lastVideoSampleDate = Date()
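The gating rule can also be isolated into a small value type, which makes the threshold easy to tune and test. This is a sketch under my own assumptions rather than code from the repo: `AudioGate` and its members are hypothetical names, times are passed in as plain seconds instead of being read from Date(), and audio is dropped until the first video sample has been seen.

```swift
import Foundation

/// Hypothetical helper: lets audio through only while video samples keep arriving.
struct AudioGate {
    /// Longest tolerated time since the last video sample, in seconds.
    let maxVideoGap: TimeInterval
    private(set) var lastVideoSampleTime: TimeInterval?

    init(maxVideoGap: TimeInterval = 0.05) {
        self.maxVideoGap = maxVideoGap
    }

    /// Call right after a video sample has been appended.
    mutating func videoSampleAppended(at time: TimeInterval) {
        lastVideoSampleTime = time
    }

    /// True when a video sample was appended less than `maxVideoGap` seconds ago.
    func shouldAppendAudio(at time: TimeInterval) -> Bool {
        guard let last = lastVideoSampleTime else { return false }
        return time - last < maxVideoGap
    }
}

var gate = AudioGate()
let audioBeforeAnyVideo = gate.shouldAppendAudio(at: 0.99)
gate.videoSampleAppended(at: 1.00)
let audioDuringCapture = gate.shouldAppendAudio(at: 1.02)  // 0.02 s after video
let audioDuringFlip = gate.shouldAppendAudio(at: 1.20)     // 0.20 s after video
print(audioBeforeAnyVideo, audioDuringCapture, audioDuringFlip)  // prints "false true false"
```

In the capture callback, the times could come from Date().timeIntervalSinceReferenceDate or from a monotonic source such as ProcessInfo.processInfo.systemUptime, which is not affected by changes to the wall clock.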

