I am trying to mux H264 encoded data and G711 PCM data into a mov multimedia container. I am creating AVPacket from encoded data and initially the PTS an
Timestamps (such as dts) should be in AVStream.time_base units. You're requesting a video timebase of 1/90000 and a default audio timebase (1/9000), but you're using a timebase of 1/100000 to write dts values. Also, the timebases you request are not guaranteed to survive header writing: the muxer may change them, and it then expects you to use the new values, so read AVStream.time_base again after the header has been written.
So code like this:
    int64_t dts = av_gettime();
    dts = av_rescale_q(dts, (AVRational){1, 1000000}, (AVRational){1, 90000});
    int duration = AUDIO_STREAM_DURATION; // 20
    if(m_prevAudioDts > 0LL) {
        duration = dts - m_prevAudioDts;
    }
won't work. Change that to something that uses the audio stream's time_base, and don't set the duration unless you know what you're doing. (The same goes for video.)
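To make the unit conversion concrete, here is a minimal sketch of turning a microsecond capture time into ticks of a stream's time base. It deliberately avoids linking against FFmpeg: `Rational` and `us_to_time_base` are hypothetical stand-ins, not library names. In real code you would more likely call `av_rescale_q(us, (AVRational){1, 1000000}, stream->time_base)`, with `stream->time_base` read after the header has been written.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for AVRational so the sketch compiles without libavutil. */
typedef struct { int num; int den; } Rational;

/*
 * Convert a non-negative timestamp in microseconds to ticks of time base tb,
 * rounding to nearest. Splitting the input into quotient and remainder keeps
 * the intermediate products small, so wall-clock values (around 1e15 us)
 * do not overflow a 64-bit integer.
 */
static int64_t us_to_time_base(int64_t us, Rational tb)
{
    assert(us >= 0 && tb.num > 0 && tb.den > 0);
    const int64_t n = tb.den;                     /* ticks per tb.num seconds */
    const int64_t d = (int64_t)tb.num * 1000000;  /* microseconds per tb.num seconds */
    return (us / d) * n + ((us % d) * n + d / 2) / d;
}
```

For example, a 20 ms audio frame is 160 ticks in a 1/8000 time base but 1800 ticks in a 1/90000 time base, which is why the target time base has to come from the stream the packet is written to. The 1/8000 value is only an assumption about what a G711 stream might end up with.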
    m_prevAudioDts = dts;
    pkt.pts = AV_NOPTS_VALUE;
    pkt.dts = m_currAudioDts;
    m_currAudioDts += duration;
    pkt.duration = duration;
This looks suspicious, especially combined with the similar code for video. The problem is that the first packet of each stream gets a timestamp of zero, regardless of how much later one stream actually started than the other. You need one parent currDts shared between all streams, otherwise your streams will be perpetually out of sync.
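One way to get a shared origin is sketched below. This is an illustration under assumptions, not your muxing code: `SharedClock`, `StreamState`, `next_dts` and `us_to_ticks` are hypothetical names, capture times are assumed to be in microseconds (as from `av_gettime()`), and the clamp at the end assumes the muxer wants strictly increasing dts per stream.

```c
#include <stdint.h>

typedef struct { int num; int den; } Rational;   /* stand-in for AVRational */

typedef struct {
    int64_t origin_us;   /* capture time of the first packet of ANY stream */
    int     has_origin;
} SharedClock;

typedef struct {
    Rational tb;         /* the stream's time base, as reported by the muxer */
    int64_t  last_dts;
    int      has_last;
} StreamState;

/* Microseconds to ticks of tb, rounded to nearest (non-negative input). */
static int64_t us_to_ticks(int64_t us, Rational tb)
{
    const int64_t n = tb.den;
    const int64_t d = (int64_t)tb.num * 1000000;
    return (us / d) * n + ((us % d) * n + d / 2) / d;
}

/* Compute the dts for a packet captured at capture_us on stream st. */
static int64_t next_dts(SharedClock *clk, StreamState *st, int64_t capture_us)
{
    if (!clk->has_origin) {          /* the first packet overall defines time zero */
        clk->origin_us = capture_us;
        clk->has_origin = 1;
    }
    int64_t elapsed = capture_us - clk->origin_us;
    if (elapsed < 0)
        elapsed = 0;                 /* packet captured before the origin */

    int64_t dts = us_to_ticks(elapsed, st->tb);
    if (st->has_last && dts <= st->last_dts)
        dts = st->last_dts + 1;      /* keep dts strictly increasing per stream */
    st->last_dts = dts;
    st->has_last = 1;
    return dts;
}
```

With this, if video starts at t = 5.00 s and the first audio packet arrives at t = 5.04 s, video starts at dts 0 while audio starts at 320 (in a 1/8000 time base) instead of 0, so the 40 ms offset between the streams is preserved in the file.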
[edit]
So, regarding your edit, if you have audio gaps, I think you need to insert silence (zeroed audio sample data) for the duration of the gap.
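A rough sketch of generating that silence follows. It assumes 8000 Hz mono G711 with one byte per sample, and that the bytes you write are already G711-encoded. In that case "zeroed" means zero amplitude rather than zero bytes: the zero-level code word is 0xFF for mu-law and 0xD5 for A-law, whereas plain linear PCM would use literal zeros. `fill_g711_silence` and `G711Law` are hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum G711Law { G711_MULAW, G711_ALAW };

/*
 * Fill buf with silence covering gap_us microseconds of audio and return
 * the number of bytes (= samples) written, never more than cap.
 */
static size_t fill_g711_silence(uint8_t *buf, size_t cap,
                                int64_t gap_us, enum G711Law law)
{
    if (gap_us <= 0)
        return 0;

    /* 8000 samples per second, rounded to the nearest sample. */
    int64_t samples = (gap_us * 8000 + 500000) / 1000000;
    size_t n = (uint64_t)samples < cap ? (size_t)samples : cap;

    /* Zero-amplitude code word: 0xFF for mu-law, 0xD5 for A-law. */
    memset(buf, law == G711_MULAW ? 0xFF : 0xD5, n);
    return n;
}
```

A 20 ms gap then yields 160 bytes of silence, which you would wrap in a packet and write with a dts that continues from the last real audio packet.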