Question
I am writing video conferencing software. I have an H.264 stream that is decoded with libavcodec into IYUV and then rendered into a window with VMR9 in windowless mode, using a DirectShow graph.
To avoid an unnecessary conversion to RGB and back (see link), I convert the IYUV video to YUY2 with libswscale before passing it to VMR9.
I noticed that at a resolution of 848x480 the output video is broken, so I investigated further and found that for some resolutions the video is always broken. To rule out libswscale, I added a manual conversion from IYUV with padding to tightly packed IYUV, and that worked at all resolutions.
Still, I wanted to avoid the slow IYUV path, so I implemented support for NV12 (with libswscale) and YV12 (manually, essentially the same as IYUV). Tests on two different computers gave strange results.
Resolution   YUY2     NV12     IYUV   YV12

PC 1 (my laptop)
640x360      ok       broken   ok     broken
848x480      broken   broken   ok     broken
960x540      broken   broken   ok     broken
1024x576     ok       ok       ok     ok
1280x720     ok       ok       ok     broken
1920x1080    ok       broken   ok     broken

PC 2
640x360      ok       ok       ok     ok
848x480      ok       broken   ok     broken
960x540      ok       ok       ok     ok
1024x576     ok       ok       ok     ok
1280x720     ok       broken   ok     ok
1920x1080    ok       ok       ok     ok
To rule out a fault in VMR9, I replaced it with EVR, but got the same results.
I know that padding is needed for memory alignment and that the amount of padding depends on the CPU (libavcodec doc), which may explain the difference between the two computers (the first has an Intel i7-3820QM, the second an Intel Core 2 Quad Q6600). I suspect the problem is related to padding, because the images are corrupted in a particular way.
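To relate the padding hypothesis to the tested resolutions, a small sketch can show how each width sits relative to power-of-two boundaries. The helper names largest_pow2_divisor and align_up are hypothetical, and the alignment values tried with them are only examples; nothing in the question establishes which alignment libavcodec or the renderer actually uses.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not part of FFmpeg): largest power of two dividing `width`.
inline std::uint32_t largest_pow2_divisor(std::uint32_t width) {
    assert(width > 0);
    return width & (0u - width); // isolates the lowest set bit
}

// Hypothetical helper: row length after rounding `width` up to a multiple of
// `align`. `align` must be a power of two; the value to use is an assumption.
inline std::uint32_t align_up(std::uint32_t width, std::uint32_t align) {
    assert(align != 0 && (align & (align - 1)) == 0);
    return (width + align - 1) & ~(align - 1);
}
```

For instance, 848 is divisible by 16 but not by 32, so align_up(848, 32) gives 864, whereas 1024 is left unchanged by a 32, 64 or 128 byte alignment.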
In the corrupted image, my blue t-shirt appears in the lower part and my face in the upper part.
The conversion code follows. The NV12 and YUY2 conversions are performed with libswscale, while the IYUV and YV12 conversions are done manually.
int pixels = _outputFrame->width * _outputFrame->height;
if (_outputFormat == "YUY2") {
    int stride = _outputFrame->width * 2;
    sws_scale(_convertCtx, _outputFrame->data, _outputFrame->linesize, 0, _outputFrame->height, &out, &stride);
}
else if (_outputFormat == "NV12") {
    int stride[] = { _outputFrame->width, _outputFrame->width };
    uint8_t * dst[] = { out, out + pixels };
    sws_scale(_convertCtx, _outputFrame->data, _outputFrame->linesize, 0, _outputFrame->height, dst, stride);
}
else if (_outputFormat == "IYUV") { // clean ffmpeg padding
    for (int i = 0; i < _outputFrame->height; i++) // copy Y
        memcpy(out + i * _outputFrame->width, _outputFrame->data[0] + i * _outputFrame->linesize[0], _outputFrame->width);
    for (int i = 0; i < _outputFrame->height / 2; i++) // copy U
        memcpy(out + pixels + i * _outputFrame->width / 2, _outputFrame->data[1] + i * _outputFrame->linesize[1], _outputFrame->width / 2);
    for (int i = 0; i < _outputFrame->height / 2; i++) // copy V
        memcpy(out + pixels + pixels / 4 + i * _outputFrame->width / 2, _outputFrame->data[2] + i * _outputFrame->linesize[2], _outputFrame->width / 2);
}
else if (_outputFormat == "YV12") { // like IYUV, but the U and V planes are swapped
    for (int i = 0; i < _outputFrame->height; i++) // copy Y
        memcpy(out + i * _outputFrame->width, _outputFrame->data[0] + i * _outputFrame->linesize[0], _outputFrame->width);
    for (int i = 0; i < _outputFrame->height / 2; i++) // copy V
        memcpy(out + pixels + i * _outputFrame->width / 2, _outputFrame->data[2] + i * _outputFrame->linesize[2], _outputFrame->width / 2);
    for (int i = 0; i < _outputFrame->height / 2; i++) // copy U
        memcpy(out + pixels + pixels / 4 + i * _outputFrame->width / 2, _outputFrame->data[1] + i * _outputFrame->linesize[1], _outputFrame->width / 2);
}
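The offsets used by the manual copies (pixels, then pixels plus pixels / 4) can be summarized as a small layout helper, which makes it easier to check that the output buffer is large enough. This is only a sketch for tightly packed buffers with even dimensions; PackedLayout420, packed_layout_420 and packed_size_yuy2 are hypothetical names, not part of libavcodec or DirectShow.

```cpp
#include <cassert>
#include <cstddef>

// Byte offsets of the planes in a tightly packed 4:2:0 frame (no row padding).
struct PackedLayout420 {
    std::size_t y_offset;      // start of the luma plane
    std::size_t first_chroma;  // U for IYUV, V for YV12, interleaved UV for NV12
    std::size_t second_chroma; // V for IYUV, U for YV12, unused for NV12
    std::size_t total_bytes;   // size of the whole frame
};

// Assumes even width and height, as in every resolution listed in the table.
inline PackedLayout420 packed_layout_420(int width, int height) {
    assert(width > 0 && height > 0 && width % 2 == 0 && height % 2 == 0);
    const std::size_t pixels = static_cast<std::size_t>(width) * height;
    return { 0, pixels, pixels + pixels / 4, pixels + pixels / 2 };
}

// A tightly packed YUY2 frame uses two bytes per pixel.
inline std::size_t packed_size_yuy2(int width, int height) {
    assert(width > 0 && height > 0);
    return static_cast<std::size_t>(width) * height * 2;
}
```

For 848x480 this gives a 610560-byte frame for IYUV, YV12 or NV12 and an 814080-byte frame for YUY2, provided the receiving buffer really is tightly packed.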
In the conversion code, out is the output buffer, _outputFrame is the AVFrame produced by libavcodec, and _convertCtx is initialized as follows.
if (_outputFormat == "YUY2")
    _convertCtx = sws_getContext(_width, _height, AV_PIX_FMT_YUV420P,
        _width, _height, AV_PIX_FMT_YUYV422, SWS_FAST_BILINEAR, nullptr, nullptr, nullptr);
else if (_outputFormat == "NV12")
    _convertCtx = sws_getContext(_width, _height, AV_PIX_FMT_YUV420P,
        _width, _height, AV_PIX_FMT_NV12, SWS_FAST_BILINEAR, nullptr, nullptr, nullptr);
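All four conversion paths write rows that are exactly width bytes (or width * 2 for YUY2) apart in the output buffer. One way to test the padding hypothesis is a plane copy that takes separate source and destination strides, so that a destination pitch larger than the width can be tried. This is a minimal sketch: copy_plane and its dst_stride parameter are hypothetical, and the question does not confirm that the renderer's buffer has a pitch different from the width.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Copies one image plane row by row. Each row is `row_bytes` long; consecutive
// rows are `src_stride` bytes apart in the source and `dst_stride` bytes apart
// in the destination. Padding bytes in the destination are left untouched.
inline void copy_plane(std::uint8_t* dst, int dst_stride,
                       const std::uint8_t* src, int src_stride,
                       int row_bytes, int rows) {
    assert(dst_stride >= row_bytes && src_stride >= row_bytes);
    for (int y = 0; y < rows; ++y) {
        std::memcpy(dst + static_cast<std::ptrdiff_t>(y) * dst_stride,
                    src + static_cast<std::ptrdiff_t>(y) * src_stride,
                    static_cast<std::size_t>(row_bytes));
    }
}
```

With dst_stride equal to the width this behaves like the manual Y copy in the IYUV branch; passing a larger value is the experiment, and the right value would have to come from whatever the renderer reports.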
Questions:
- Are the manual conversions correct?
- Are my assumptions correct?
- If the answers to the previous two questions are yes, where is the problem? And especially...
- Why does it occur only at some resolutions and not others?
- What additional information can I provide?
Source: https://stackoverflow.com/questions/24023636/broken-output-from-libavcodec-swscale-depending-on-resolution