Broken output from libavcodec/swscale, depending on resolution

Question

I am writing video conferencing software. I have an H.264 stream that is decoded with libavcodec into IYUV and then rendered into a window with VMR9 in windowless mode. I use a DirectShow graph to do so.

To avoid an unnecessary conversion into RGB and back (see link), I convert the IYUV video into YUY2 with libswscale before passing it to VMR9.

I noticed that at a video resolution of 848x480 the output video is broken, so I investigated further and found that for some resolutions the video is always broken. To rule out libswscale, I added support for a plain conversion from IYUV with padding to IYUV without padding, and it worked at all resolutions.

Still, I wanted to avoid the slow IYUV path, so I implemented support for NV12 (with libswscale) and YV12 (manually, essentially the same as IYUV). After running some tests on two different computers, I got strange results.

Resolution  YUY2    NV12    IYUV    YV12
PC 1 (my laptop)
640x360     ok      broken  ok      broken
848x480     broken  broken  ok      broken
960x540     broken  broken  ok      broken
1024x576    ok      ok      ok      ok
1280x720    ok      ok      ok      broken
1920x1080   ok      broken  ok      broken

PC 2
640x360     ok      ok      ok      ok
848x480     ok      broken  ok      broken
960x540     ok      ok      ok      ok
1024x576    ok      ok      ok      ok
1280x720    ok      broken  ok      ok
1920x1080   ok      ok      ok      ok

To rule out a fault in VMR9, I replaced it with EVR, but got the same results.

I know that padding is needed for memory alignment, and that the size of the padding depends on the CPU used (libavcodec doc). That may explain the difference between the two computers (the first has an Intel i7-3820QM, the second an Intel Core 2 Quad Q6600). I suppose it has something to do with padding, because the images are corrupted in a certain way.

You can see my blue t-shirt in the lower part of the image, and my face in the upper part.
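As a rough illustration of the padding assumption, the sketch below rounds a row width up to an alignment boundary. The helper names alignUp and rowPadding are hypothetical, and the alignment values used with them are only examples; the alignment libavcodec actually picks on either machine is not established here.

```cpp
// Round `width` up to the next multiple of `alignment`.
// Assumption: `alignment` is a power of two (e.g. 16, 32, 64); the real
// value depends on the libavcodec build and the CPU and is not known here.
constexpr int alignUp(int width, int alignment) {
    return (width + alignment - 1) & ~(alignment - 1);
}

// Number of padding bytes per row for a plane with one byte per pixel
// (such as the Y plane) under the assumed alignment.
constexpr int rowPadding(int width, int alignment) {
    return alignUp(width, alignment) - width;
}
```

With an assumed alignment of 64, a width of 848 would get 48 bytes of padding per row, while a width of 1024 would get none.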

The conversion code follows. The NV12 and YUY2 conversions are performed with libswscale, while the IYUV and YV12 conversions are done manually.

int pixels = _outputFrame->width * _outputFrame->height;
if (_outputFormat == "YUY2") {
    int stride = _outputFrame->width * 2;
    sws_scale(_convertCtx, _outputFrame->data, _outputFrame->linesize, 0, _outputFrame->height, &out, &stride);
}
else if (_outputFormat == "NV12") {
    int stride[] = { _outputFrame->width, _outputFrame->width };
    uint8_t * dst[] = { out, out + pixels };
    sws_scale(_convertCtx, _outputFrame->data, _outputFrame->linesize, 0, _outputFrame->height, dst, stride);
}
else if (_outputFormat == "IYUV") { // clean ffmpeg padding
    for (int i = 0; i < _outputFrame->height; i++) // copy Y
        memcpy(out + i * _outputFrame->width, _outputFrame->data[0] + i * _outputFrame->linesize[0] , _outputFrame->width);
    for (int i = 0; i < _outputFrame->height / 2; i++) // copy U
        memcpy(out + pixels + i * _outputFrame->width / 2, _outputFrame->data[1] + i * _outputFrame->linesize[1] , _outputFrame->width / 2);            
    for (int i = 0; i < _outputFrame->height / 2; i++) // copy V
        memcpy(out + pixels + pixels/4 + i * _outputFrame->width / 2, _outputFrame->data[2] + i * _outputFrame->linesize[2] , _outputFrame->width / 2);
} 
else if (_outputFormat == "YV12") { // like IYUV, but with the U and V planes swapped
    for (int i = 0; i < _outputFrame->height; i++) // copy Y
        memcpy(out + i * _outputFrame->width, _outputFrame->data[0] + i * _outputFrame->linesize[0], _outputFrame->width);
    for (int i = 0; i < _outputFrame->height / 2; i++) // copy V
        memcpy(out + pixels + i * _outputFrame->width / 2, _outputFrame->data[2] + i * _outputFrame->linesize[2], _outputFrame->width / 2);
    for (int i = 0; i < _outputFrame->height / 2; i++) // copy U
        memcpy(out + pixels + pixels / 4 + i * _outputFrame->width / 2, _outputFrame->data[1] + i * _outputFrame->linesize[1], _outputFrame->width / 2);
}
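The manual copies above hard-code the destination row pitch as the frame width. A minimal sketch of the same row-by-row copy with an explicit stride on both sides is shown below. copyPlane and its parameter names are hypothetical and are not part of libavcodec or libswscale.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Copy `rows` rows of `rowBytes` bytes each from `src` to `dst`.
// `srcStride` and `dstStride` are the distances in bytes between the starts
// of consecutive rows; each may be larger than `rowBytes` if rows are padded.
// Hypothetical helper, shown only to make the stride handling explicit.
inline void copyPlane(std::uint8_t* dst, int dstStride,
                      const std::uint8_t* src, int srcStride,
                      int rowBytes, int rows) {
    for (int y = 0; y < rows; ++y) {
        std::memcpy(dst + static_cast<std::ptrdiff_t>(y) * dstStride,
                    src + static_cast<std::ptrdiff_t>(y) * srcStride,
                    static_cast<std::size_t>(rowBytes));
    }
}
```

With such a helper, the Y copy would read copyPlane(out, dstStride, _outputFrame->data[0], _outputFrame->linesize[0], _outputFrame->width, _outputFrame->height), where dstStride is whatever row pitch the consumer of out expects. In the code above that pitch is assumed to equal the width.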

out is the output buffer. _outputFrame is the AVFrame produced by libavcodec. _convertCtx is initialized as follows.

if (_outputFormat == "YUY2")
    _convertCtx = sws_getContext(_width, _height, AV_PIX_FMT_YUV420P,
                                 _width, _height, AV_PIX_FMT_YUYV422, SWS_FAST_BILINEAR, nullptr, nullptr, nullptr);
else if (_outputFormat == "NV12")
    _convertCtx = sws_getContext(_width, _height, AV_PIX_FMT_YUV420P,
                                 _width, _height, AV_PIX_FMT_NV12, SWS_FAST_BILINEAR, nullptr, nullptr, nullptr);
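For reference, the plane offsets and buffer sizes that the conversion code assumes for out can be written down explicitly. This is a sketch under the assumptions that width and height are even and that the destination rows carry no padding. PackedLayout, packed420Layout and packedYuy2Bytes are hypothetical names.

```cpp
#include <cstddef>

// Offsets into a tightly packed 4:2:0 buffer (12 bits per pixel).
// For IYUV the chroma planes are U then V, for YV12 they are V then U.
// For NV12 a single interleaved UV plane starts at firstChromaOffset.
struct PackedLayout {
    std::size_t yOffset;
    std::size_t firstChromaOffset;
    std::size_t secondChromaOffset;  // unused for NV12
    std::size_t totalBytes;
};

// Assumes even dimensions and no row padding in the destination buffer.
inline PackedLayout packed420Layout(int width, int height) {
    const std::size_t pixels =
        static_cast<std::size_t>(width) * static_cast<std::size_t>(height);
    return PackedLayout{0, pixels, pixels + pixels / 4, pixels + pixels / 2};
}

// YUY2 is a packed 4:2:2 format with 2 bytes per pixel and a single plane.
inline std::size_t packedYuy2Bytes(int width, int height) {
    return static_cast<std::size_t>(width) * static_cast<std::size_t>(height) * 2;
}
```

For 848x480 this gives a Y plane of 407040 bytes, chroma planes starting at offsets 407040 and 508800, and a total of 610560 bytes for the 4:2:0 formats, against 814080 bytes for YUY2.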

Questions:

Are manual conversions correct?
Are my assumptions correct?
If the previous two answers are positive, where is the problem? And especially...
Why does it show up only at some resolutions and not others?
What additional info can I provide?

Source: https://stackoverflow.com/questions/24023636/broken-output-from-libavcodec-swscale-depending-on-resolution

Tags

c++

video

ffmpeg

libavcodec