I am receiving H.264-encoded video data and G.711 PCM-encoded audio data from two different threads, and I need to mux/write them into a MOV multimedia container.
The writer function signatures are like:
bool WriteAudio(const unsigned char *pEncodedData, size_t iLength);
bool WriteVideo(const unsigned char *pEncodedData, size_t iLength, bool const bIFrame);
And the function for adding audio and video streams looks like:
AVStream* AudioVideoRecorder::AddMediaStream(enum AVCodecID codecID) {
    Log("Adding stream: %s.", avcodec_get_name(codecID));
    AVCodecContext* pCodecCtx;
    AVStream* pStream;

    /* find the encoder */
    AVCodec* codec = avcodec_find_encoder(codecID);
    if (!codec) {
        LogErr("Could not find encoder for %s", avcodec_get_name(codecID));
        return NULL;
    }

    pStream = avformat_new_stream(m_pFormatCtx, codec);
    if (!pStream) {
        LogErr("Could not allocate stream.");
        return NULL;
    }
    pStream->id = m_pFormatCtx->nb_streams - 1;
    pStream->time_base = (AVRational){1, VIDEO_FRAME_RATE};
    pCodecCtx = pStream->codec;

    switch (codec->type) {
    case AVMEDIA_TYPE_VIDEO:
        pCodecCtx->codec_id = codecID;
        pCodecCtx->bit_rate = VIDEO_BIT_RATE;
        pCodecCtx->width = PICTURE_WIDTH;
        pCodecCtx->height = PICTURE_HEIGHT;
        pCodecCtx->gop_size = VIDEO_FRAME_RATE;
        pCodecCtx->pix_fmt = PIX_FMT_YUV420P;
        m_pVideoStream = pStream;
        break;
    case AVMEDIA_TYPE_AUDIO:
        pCodecCtx->codec_id = codecID;
        pCodecCtx->sample_fmt = AV_SAMPLE_FMT_S16;
        pCodecCtx->bit_rate = 64000;
        pCodecCtx->sample_rate = 8000;
        pCodecCtx->channels = 1;
        m_pAudioStream = pStream;
        break;
    default:
        break;
    }

    /* Some formats want stream headers to be separate. */
    /* CODEC_FLAG_GLOBAL_HEADER is a codec flag, so it is set on the codec context. */
    if (m_pOutputFmt->flags & AVFMT_GLOBALHEADER)
        pCodecCtx->flags |= CODEC_FLAG_GLOBAL_HEADER;

    return pStream;
}
Inside the WriteAudio(..) and WriteVideo(..) functions, I create an AVPacket using av_init_packet(...) and set pEncodedData and iLength as packet.data and packet.size. I printed packet.pts and packet.dts, and both are equal to AV_NOPTS_VALUE.
Now, how do I calculate the PTS, DTS, and packet duration (packet.pts, packet.dts, and packet.duration) correctly for both audio and video data so that audio and video stay in sync and play back properly? I have seen many examples on the internet, but none of them make sense to me. I am new to ffmpeg, and my understanding may not be correct in some respects. I want to do this the right way.
Thanks in advance!
EDIT: There are no B-frames in my video stream, so I think PTS and DTS can be kept the same here.
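The point in the edit can be sketched as follows: with no B-frames, packets are decoded in the order they are displayed, so each packet can carry the same value for PTS and DTS. This is only a minimal sketch under stated assumptions: a constant video frame rate with a video time base of 1/VIDEO_FRAME_RATE, an audio time base of 1/8000, and mono G.711 audio (one byte per sample). The Timing struct and the VideoClock and AudioClock classes are hypothetical names for illustration, not FFmpeg API. Counter-based stamps like these can drift against a live source, so they are best treated as a fallback when no capture timestamps are available.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical timing record; in real code these values would be copied into
// AVPacket.pts, AVPacket.dts and AVPacket.duration.
struct Timing {
    int64_t pts;
    int64_t dts;
    int64_t duration;
};

// Video: time base assumed to be 1/frame_rate, so each frame lasts one tick.
class VideoClock {
public:
    Timing Next() {
        const int64_t ts = m_frameIndex++;
        return Timing{ts, ts, 1};  // no B-frames, so pts == dts
    }

private:
    int64_t m_frameIndex = 0;
};

// Audio: time base assumed to be 1/8000 (one tick per sample). G.711 stores one
// byte per sample, so for mono audio the packet length equals its sample count.
class AudioClock {
public:
    Timing Next(std::size_t lengthBytes) {
        const int64_t samples = static_cast<int64_t>(lengthBytes);
        const int64_t ts = m_samplesWritten;
        m_samplesWritten += samples;
        return Timing{ts, ts, samples};
    }

private:
    int64_t m_samplesWritten = 0;
};
```

With these helpers, WriteVideo(..) would call VideoClock::Next() once per frame and WriteAudio(..) would call AudioClock::Next(iLength) once per packet, each on its own clock instance.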
PTS/DTS are timestamps; they should be set to the timestamps of the input data. I don't know where your data comes from, but any input has some form of timestamp associated with it: typically the timestamps of the input media file, or a system-clock-derived value if you're recording from your sound card and webcam, and so on. You should convert these numbers into the form the muxer expects (units of the stream's time base) and then assign them to AVPacket.pts/dts.
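As a rough illustration of the conversion step, the sketch below rescales a capture timestamp into a stream's time base. It assumes each packet arrives with a capture time in microseconds taken from one clock shared by the audio and video threads; Rational, Rescale, PacketTimes, and StampPacket are hypothetical names for this example, not FFmpeg API. In real code FFmpeg's av_rescale_q() performs this conversion (with proper rounding and overflow handling), and the time base to use is the one found in AVStream.time_base after avformat_write_header() has run, since the muxer may change the value you requested.

```cpp
#include <cstdint>

// Hypothetical stand-in for FFmpeg's AVRational: one tick lasts num/den seconds.
struct Rational {
    int64_t num;
    int64_t den;
};

// Convert `value` from time base `from` to time base `to`, rounding to nearest.
// Same idea as av_rescale_q(), but without its overflow protection, so it
// assumes a non-negative and reasonably small `value`.
inline int64_t Rescale(int64_t value, Rational from, Rational to) {
    const int64_t numerator = value * from.num * to.den;
    const int64_t denominator = from.den * to.num;
    return (numerator + denominator / 2) / denominator;
}

struct PacketTimes {
    int64_t pts;
    int64_t dts;
};

// captureUs: capture time of this packet in microseconds.
// firstUs:   capture time of the very first packet of the recording (shared by
//            both streams), subtracted so timestamps start near zero.
inline PacketTimes StampPacket(int64_t captureUs, int64_t firstUs,
                               Rational streamTimeBase) {
    const Rational microseconds{1, 1000000};
    const int64_t ts = Rescale(captureUs - firstUs, microseconds, streamTimeBase);
    return PacketTimes{ts, ts};  // no B-frames in this stream, so pts == dts
}
```

Using the same firstUs for both streams is what keeps audio and video aligned: a video frame and an audio packet captured at the same instant map to the same point in time, even though their tick values differ because the two time bases differ.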
Source: https://stackoverflow.com/questions/31917032/pts-and-dts-calculation-for-video-and-audio-frames