Calculate PTS before frame encoding in FFmpeg

悲&欢浪女 2020-12-24 09:05

How do I calculate the correct PTS value for a frame before encoding with the FFmpeg C API?

For encoding I'm using the function avcodec_encode_video2 and then writing i…

3 Answers
  • 2020-12-24 09:34

    It's better to think about PTS more abstractly before trying code.

    What you're doing is meshing three "time sets" together. The first is the time we're used to, based on 1000 ms per second, 60 seconds per minute, and so on. The second is the time base of the particular codec you are using. Each codec has a certain way it wants to represent time, usually in a 1/number format, meaning that every second is divided into "number" ticks. The third works the same way, except that it is the time base of the container you are using.

    Some people prefer to start with actual time, others with a frame count; neither is "wrong".

    Starting with a frame count, you first need to convert it based on your frame rate. Note that all the conversions I mention use av_rescale_q(...). The purpose of this conversion is to turn a counter into time, so you rescale with your frame rate (usually the video stream's time base). Then you have to convert that into the time_base of your video codec before encoding.
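
    For example, a minimal sketch of that frame-count conversion (frameCounter and the 30 fps rate are placeholder assumptions, not from the question):

    // frameCounter counts frames handed to the encoder.
    // Express the counter in a 1/fps time base, then rescale it into the
    // encoder's time_base before passing the frame to avcodec_encode_video2.
    frame->pts = av_rescale_q(frameCounter++, (AVRational){1, 30},
                              videoCodecCtx->time_base);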

    Similarly, with real time, your first conversion needs to be from (current_time - start_time), rescaled to your video codec's time_base.
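
    A minimal sketch of that real-time case (start_time is assumed to hold an av_gettime() value captured when encoding began):

    // av_gettime() returns wall-clock time in microseconds, i.e. a 1/1000000 time base.
    int64_t elapsed_us = av_gettime() - start_time;
    frame->pts = av_rescale_q(elapsed_us, (AVRational){1, 1000000},
                              videoCodecCtx->time_base);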

    Anyone using only a frame counter is probably working with a codec whose time_base equals their frame rate. Most codecs do not work like this, and that hack is not portable. Example:

    frame->pts = videoCodecCtx->frame_number;  // BAD
    

    Additionally, anyone using hardcoded numbers in their av_rescale_q calls is relying on knowing what their time_base is, and this should be avoided; the code isn't portable to other video formats. Instead, use video_st->time_base, video_st->codec->time_base, and output_ctx->time_base to figure things out.
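
    For instance, a sketch of the stream-side conversion at muxing time (assuming pkt is the AVPacket filled by avcodec_encode_video2 and output_ctx is the output AVFormatContext):

    // pkt's timestamps are still in the encoder's time_base here.
    // av_packet_rescale_ts converts pts, dts, and duration into the stream's time_base.
    av_packet_rescale_ts(&pkt, videoCodecCtx->time_base, video_st->time_base);
    pkt.stream_index = video_st->index;
    av_interleaved_write_frame(output_ctx, &pkt);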

    I hope understanding it from a higher level will help you see which of those are "correct" and which are "bad practice". There is no single answer, but maybe now you can decide which approach is best for you.

  • 2020-12-24 09:36

    There's also the option of setting it like frame->pts = av_frame_get_best_effort_timestamp(frame), but I'm not sure this is the correct approach either.
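
    If you go down that route in a transcoding pipeline, one hedged sketch (in_stream and encCtx are placeholder names for the input stream and the encoder context) is to take the decoder's best-effort timestamp and rescale it from the input stream's time_base into the encoder's time_base:

    // best_effort_timestamp is expressed in the input stream's time_base.
    frame->pts = av_rescale_q(frame->best_effort_timestamp,
                              in_stream->time_base, encCtx->time_base);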

  • 2020-12-24 09:43

    Time is not measured in seconds, milliseconds, or any other standard unit. Instead, it is measured in the AVCodecContext's time_base.

    So if you set codecContext->time_base to 1/1, it means time is measured in seconds:

    cctx->time_base = (AVRational){1, 1};
    

    Assume you want to encode at a steady 30 fps. Then the time at which a frame is presented is framenumber * (1.0 / fps) seconds.

    But once again, the PTS is not measured in seconds or any standard unit either. It is measured in the AVStream's time_base.

    In the question, the author mentioned 90k as the standard resolution for PTS, but you will see that this is not always true. The exact resolution is stored in the AVStream; you can read it back like this:

        if ((err = avformat_write_header(ofctx, NULL)) < 0) {
            std::cout << "Failed to write header" << err << std::endl;
            return -1;
        }
    
        av_dump_format(ofctx, 0, "test.webm", 1);
        std::cout << stream->time_base.den  << " " << stream->time_base.num << std::endl;
    

    The value of stream->time_base is only populated after calling avformat_write_header.

    Therefore, the right formula for calculating PTS is:

    //The following assumes that codecContext->time_base = (AVRational){1, 1};
    videoFrame->pts = frameduration * (frameCounter++) * stream->time_base.den / (stream->time_base.num * fps);
    

    So really there are 3 components in the formula,

    1. fps
    2. codecContext->time_base
    3. stream->time_base

    so, per frame: pts = framenumber * codecContext->time_base / (fps * stream->time_base)
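
    As a cross-check, the same arithmetic can be written with av_rescale_q instead of the hand-written division (this sketch assumes 30 fps and sets the frame's timestamp directly in stream units, as the formula above does):

    // The frame index, expressed in a 1/fps time base, rescaled into the stream's time base.
    videoFrame->pts = av_rescale_q(frameCounter++, (AVRational){1, 30},
                                   stream->time_base);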

    I have detailed my discovery here
