Media Foundation webcam video H264 encode/decode produces artifacts when played back

Anonymous (unverified), submitted 2019-12-03 03:08:02

Question:

I have a solution in which I encode video samples (YUY2) from a webcam with Media Foundation's H.264 encoder. I then send the stream over a TCP connection to another application, which decodes it back to YUY2 with Media Foundation's H.264 decoder. After decoding, the video frames are presented on screen using DirectX.

The problem is that the image accumulates more and more artifacts between key frames. The artifacts disappear when a key frame is received.

To rule out the network, I removed the TCP connection and decoded each sample immediately after encoding it, but the artifacts are still there.

Here's the callback method that receives the samples from the webcam:

//-------------------------------------------------------------------
// OnReadSample
//
// Called when the IMFMediaSource::ReadSample method completes.
//-------------------------------------------------------------------

HRESULT CPreview::OnReadSample(
    HRESULT hrStatus,
    DWORD /* dwStreamIndex */,
    DWORD dwStreamFlags,
    LONGLONG llTimestamp,
    IMFSample *pSample      // Can be NULL
    )
{
    HRESULT hr = S_OK;
    IMFMediaBuffer *pBuffer = NULL;

    EnterCriticalSection(&m_critsec);

    if (FAILED(hrStatus))
    {
        hr = hrStatus;
    }

    if (SUCCEEDED(hr))
    {
        if (pSample)
        {
            IMFSample *pEncodedSample = NULL;
            hr = m_pCodec->EncodeSample(pSample, &pEncodedSample);
            if (hr == MF_E_TRANSFORM_NEED_MORE_INPUT || pEncodedSample == NULL)
            {
                hr = m_pReader->ReadSample((DWORD)MF_SOURCE_READER_FIRST_VIDEO_STREAM, 0, NULL, NULL, NULL, NULL);
                LeaveCriticalSection(&m_critsec);
                return S_OK;
            }

            LONGLONG llEncodedSampleTimeStamp = 0;
            LONGLONG llEncodedSampleDuration = 0;
            pEncodedSample->GetSampleTime(&llEncodedSampleTimeStamp);
            pEncodedSample->GetSampleDuration(&llEncodedSampleDuration);

            pBuffer = NULL;
            hr = pEncodedSample->GetBufferByIndex(0, &pBuffer);
            if (hr != S_OK)
            {
                hr = m_pReader->ReadSample((DWORD)MF_SOURCE_READER_FIRST_VIDEO_STREAM, 0, NULL, NULL, NULL, NULL);
                LeaveCriticalSection(&m_critsec);
                return hr;
            }

            BYTE *pOutBuffer = NULL;
            DWORD dwMaxLength, dwCurrentLength;
            hr = pBuffer->Lock(&pOutBuffer, &dwMaxLength, &dwCurrentLength);
            if (hr != S_OK)
            {
                hr = m_pReader->ReadSample((DWORD)MF_SOURCE_READER_FIRST_VIDEO_STREAM, 0, NULL, NULL, NULL, NULL);
                LeaveCriticalSection(&m_critsec);
                return hr;
            }
            // Send encoded webcam data to connected clients
            //SendData(pOutBuffer, dwCurrentLength, llEncodedSampleTimeStamp, llEncodedSampleDuration);

            pBuffer->Unlock();
            SafeRelease(&pBuffer);

            IMFSample *pDecodedSample = NULL;
            m_pCodec->DecodeSample(pEncodedSample, &pDecodedSample);
            if (pDecodedSample != NULL)
            {
                pDecodedSample->SetSampleTime(llTimestamp);
                pDecodedSample->SetSampleTime(llTimestamp - llLastSampleTimeStamp);
                llLastSampleTimeStamp = llTimestamp;
                hr = pDecodedSample->GetBufferByIndex(0, &pBuffer);
                //hr = pSample->GetBufferByIndex(0, &pBuffer);

                // Draw the frame.
                if (SUCCEEDED(hr))
                {
                    hr = m_draw.DrawFrame(pBuffer);
                }
                SafeRelease(&pDecodedSample);
            }

            SafeRelease(&pBuffer);
            SafeRelease(&pEncodedSample);
        }
    }

    // Request the next frame.
    if (SUCCEEDED(hr))
    {
        hr = m_pReader->ReadSample(
            (DWORD)MF_SOURCE_READER_FIRST_VIDEO_STREAM,
            0,
            NULL,   // actual
            NULL,   // flags
            NULL,   // timestamp
            NULL    // sample
            );
    }

    if (FAILED(hr))
    {
        NotifyError(hr);
    }
    SafeRelease(&pBuffer);

    LeaveCriticalSection(&m_critsec);
    return hr;
}

And here's the encoder/decoder initialization code:

    HRESULT Codec::InitializeEncoder()
    {
        IMFMediaType *pMFTInputMediaType = NULL, *pMFTOutputMediaType = NULL;
        IUnknown *spTransformUnk = NULL;
        DWORD mftStatus = 0;
        UINT8 blob[] = { 0x00, 0x00, 0x00, 0x01, 0x67, 0x42, 0xc0, 0x1e, 0x96, 0x54, 0x05, 0x01,
            0xe9, 0x80, 0x80, 0x40, 0x00, 0x00, 0x00, 0x01, 0x68, 0xce, 0x3c, 0x80 };

        CoInitializeEx(NULL, COINIT_APARTMENTTHREADED | COINIT_DISABLE_OLE1DDE);
        MFStartup(MF_VERSION);

        // Create H.264 encoder.
        CHECK_HR(CoCreateInstance(CLSID_CMSH264EncoderMFT, NULL, CLSCTX_INPROC_SERVER, IID_IUnknown, (void**)&spTransformUnk), "Failed to create H264 encoder MFT.\n");

        CHECK_HR(spTransformUnk->QueryInterface(IID_PPV_ARGS(&pEncoderTransform)), "Failed to get IMFTransform interface from H264 encoder MFT object.\n");

        // Transform output type
        MFCreateMediaType(&pMFTOutputMediaType);
        pMFTOutputMediaType->SetGUID(MF_MT_MAJOR_TYPE, MFMediaType_Video);
        pMFTOutputMediaType->SetGUID(MF_MT_SUBTYPE, MFVideoFormat_H264);
        pMFTOutputMediaType->SetUINT32(MF_MT_AVG_BITRATE, 500000);
        CHECK_HR(MFSetAttributeSize(pMFTOutputMediaType, MF_MT_FRAME_SIZE, 640, 480), "Failed to set frame size on H264 MFT out type.\n");
        CHECK_HR(MFSetAttributeRatio(pMFTOutputMediaType, MF_MT_FRAME_RATE, 30, 1), "Failed to set frame rate on H264 MFT out type.\n");
        CHECK_HR(MFSetAttributeRatio(pMFTOutputMediaType, MF_MT_PIXEL_ASPECT_RATIO, 1, 1), "Failed to set aspect ratio on H264 MFT out type.\n");
        pMFTOutputMediaType->SetUINT32(MF_MT_INTERLACE_MODE, MFVideoInterlace_MixedInterlaceOrProgressive);
        pMFTOutputMediaType->SetUINT32(MF_MT_ALL_SAMPLES_INDEPENDENT, TRUE);

        // Special attributes for H264 transform, if needed
        /*CHECK_HR(pMFTOutputMediaType->SetUINT32(MF_MT_MPEG2_PROFILE, eAVEncH264VProfile_Base), "Failed to set profile on H264 MFT out type.\n");
        CHECK_HR(pMFTOutputMediaType->SetUINT32(MF_MT_MPEG2_LEVEL, eAVEncH264VLevel4), "Failed to set level on H264 MFT out type.\n");
        CHECK_HR(pMFTOutputMediaType->SetUINT32(MF_MT_MAX_KEYFRAME_SPACING, 10), "Failed to set key frame interval on H264 MFT out type.\n");
        CHECK_HR(pMFTOutputMediaType->SetUINT32(CODECAPI_AVEncCommonQuality, 100), "Failed to set H264 codec qulaity.\n");
        CHECK_HR(pMFTOutputMediaType->SetUINT32(CODECAPI_AVEncMPVGOPSize, 1), "Failed to set CODECAPI_AVEncMPVGOPSize = 1\n");*/
        CHECK_HR(pEncoderTransform->SetOutputType(0, pMFTOutputMediaType, 0), "Failed to set output media type on H.264 encoder MFT.\n");

        // Transform input type
        MFCreateMediaType(&pMFTInputMediaType);
        pMFTInputMediaType->SetGUID(MF_MT_MAJOR_TYPE, MFMediaType_Video);
        pMFTInputMediaType->SetGUID(MF_MT_SUBTYPE, MFVideoFormat_YUY2);
        CHECK_HR(MFSetAttributeSize(pMFTInputMediaType, MF_MT_FRAME_SIZE, 640, 480), "Failed to set frame size on H264 MFT out type.\n");
        CHECK_HR(MFSetAttributeRatio(pMFTInputMediaType, MF_MT_FRAME_RATE, 30, 1), "Failed to set frame rate on H264 MFT out type.\n");
        CHECK_HR(MFSetAttributeRatio(pMFTInputMediaType, MF_MT_PIXEL_ASPECT_RATIO, 1, 1), "Failed to set aspect ratio on H264 MFT out type.\n");
        CHECK_HR(pEncoderTransform->SetInputType(0, pMFTInputMediaType, 0), "Failed to set input media type on H.264 encoder MFT.\n");

        CHECK_HR(pEncoderTransform->GetInputStatus(0, &mftStatus), "Failed to get input status from H.264 MFT.\n");
        if (MFT_INPUT_STATUS_ACCEPT_DATA != mftStatus)
        {
            printf("E: pEncoderTransform->GetInputStatus() not accept data.\n");
            goto done;
        }

        CHECK_HR(pEncoderTransform->ProcessMessage(MFT_MESSAGE_COMMAND_FLUSH, NULL), "Failed to process FLUSH command on H.264 MFT.\n");
        CHECK_HR(pEncoderTransform->ProcessMessage(MFT_MESSAGE_NOTIFY_BEGIN_STREAMING, NULL), "Failed to process BEGIN_STREAMING command on H.264 MFT.\n");
        CHECK_HR(pEncoderTransform->ProcessMessage(MFT_MESSAGE_NOTIFY_START_OF_STREAM, NULL), "Failed to process START_OF_STREAM command on H.264 MFT.\n");

        return S_OK;

    done:

        SafeRelease(&pMFTInputMediaType);
        SafeRelease(&pMFTOutputMediaType);

        return S_FALSE;
    }

    HRESULT Codec::InitializeDecoder()
    {
        IUnknown *spTransformUnk = NULL;
        IMFMediaType *pMFTOutputMediaType = NULL;
        IMFMediaType *pMFTInputMediaType = NULL;
        DWORD mftStatus = 0;

        // Create H.264 decoder.
        CHECK_HR(CoCreateInstance(CLSID_CMSH264DecoderMFT, NULL, CLSCTX_INPROC_SERVER, IID_IUnknown, (void**)&spTransformUnk), "Failed to create H264 decoder MFT.\n");

        // Query for the IMFTransform interface
        CHECK_HR(spTransformUnk->QueryInterface(IID_PPV_ARGS(&pDecoderTransform)), "Failed to get IMFTransform interface from H264 decoder MFT object.\n");

        // Create input mediatype for the decoder
        MFCreateMediaType(&pMFTInputMediaType);
        pMFTInputMediaType->SetGUID(MF_MT_MAJOR_TYPE, MFMediaType_Video);
        pMFTInputMediaType->SetGUID(MF_MT_SUBTYPE, MFVideoFormat_H264);
        CHECK_HR(MFSetAttributeSize(pMFTInputMediaType, MF_MT_FRAME_SIZE, 640, 480), "Failed to set frame size on H264 MFT out type.\n");
        CHECK_HR(MFSetAttributeRatio(pMFTInputMediaType, MF_MT_FRAME_RATE, 30, 1), "Failed to set frame rate on H264 MFT out type.\n");
        CHECK_HR(MFSetAttributeRatio(pMFTInputMediaType, MF_MT_PIXEL_ASPECT_RATIO, 1, 1), "Failed to set aspect ratio on H264 MFT out type.\n");
        pMFTInputMediaType->SetUINT32(MF_MT_INTERLACE_MODE, MFVideoInterlace_MixedInterlaceOrProgressive);
        pMFTInputMediaType->SetUINT32(MF_MT_ALL_SAMPLES_INDEPENDENT, TRUE);
        CHECK_HR(pDecoderTransform->SetInputType(0, pMFTInputMediaType, 0), "Failed to set input media type on H.264 encoder MFT.\n");

        CHECK_HR(pDecoderTransform->GetInputStatus(0, &mftStatus), "Failed to get input status from H.264 MFT.\n");
        if (MFT_INPUT_STATUS_ACCEPT_DATA != mftStatus)
        {
            printf("E: pDecoderTransform->GetInputStatus() not accept data.\n");
            goto done;
        }

        // Create outmedia type for the decoder
        MFCreateMediaType(&pMFTOutputMediaType);
        pMFTOutputMediaType->SetGUID(MF_MT_MAJOR_TYPE, MFMediaType_Video);
        pMFTOutputMediaType->SetGUID(MF_MT_SUBTYPE, MFVideoFormat_YUY2);
        CHECK_HR(MFSetAttributeSize(pMFTOutputMediaType, MF_MT_FRAME_SIZE, 640, 480), "Failed to set frame size on H264 MFT out type.\n");
        CHECK_HR(MFSetAttributeRatio(pMFTOutputMediaType, MF_MT_FRAME_RATE, 30, 1), "Failed to set frame rate on H264 MFT out type.\n");
        CHECK_HR(MFSetAttributeRatio(pMFTOutputMediaType, MF_MT_PIXEL_ASPECT_RATIO, 1, 1), "Failed to set aspect ratio on H264 MFT out type.\n");
        CHECK_HR(pDecoderTransform->SetOutputType(0, pMFTOutputMediaType, 0), "Failed to set output media type on H.264 decoder MFT.\n");

        CHECK_HR(pDecoderTransform->ProcessMessage(MFT_MESSAGE_COMMAND_FLUSH, NULL), "Failed to process FLUSH command on H.264 MFT.\n");
        CHECK_HR(pDecoderTransform->ProcessMessage(MFT_MESSAGE_NOTIFY_BEGIN_STREAMING, NULL), "Failed to process BEGIN_STREAMING command on H.264 MFT.\n");
        CHECK_HR(pDecoderTransform->ProcessMessage(MFT_MESSAGE_NOTIFY_START_OF_STREAM, NULL), "Failed to process START_OF_STREAM command on H.264 MFT.\n");

        return S_OK;

    done:

        SafeRelease(&pMFTInputMediaType);
        SafeRelease(&pMFTOutputMediaType);

        return S_FALSE;
    }

Here's the actual encode/decode part:

HRESULT Codec::EncodeSample(IMFSample *pSample, IMFSample **ppEncodedSample)
{
    return TransformSample(pEncoderTransform, pSample, ppEncodedSample);
}

HRESULT Codec::DecodeSample(IMFSample *pSample, IMFSample **ppEncodedSample)
{
    return TransformSample(pDecoderTransform, pSample, ppEncodedSample);
}

HRESULT Codec::TransformSample(IMFTransform *pTransform, IMFSample *pSample, IMFSample **ppSampleOut)
{
    IMFSample *pOutSample = NULL;
    IMFMediaBuffer *pBuffer = NULL;
    DWORD mftOutFlags;
    pTransform->ProcessInput(0, pSample, 0);
    CHECK_HR(pTransform->GetOutputStatus(&mftOutFlags), "H264 MFT GetOutputStatus failed.\n");

    // Note: Decoder does not return MFT flag MFT_OUTPUT_STATUS_SAMPLE_READY, so we just need to rely on S_OK return
    if (pTransform == pEncoderTransform && mftOutFlags == S_OK)
    {
        return S_OK;
    }
    else if (pTransform == pEncoderTransform && mftOutFlags == MFT_OUTPUT_STATUS_SAMPLE_READY ||
        pTransform == pDecoderTransform && mftOutFlags == S_OK)
    {
        DWORD processOutputStatus = 0;
        MFT_OUTPUT_DATA_BUFFER outputDataBuffer;
        MFT_OUTPUT_STREAM_INFO StreamInfo;
        pTransform->GetOutputStreamInfo(0, &StreamInfo);

        CHECK_HR(MFCreateSample(&pOutSample), "Failed to create MF sample.\n");
        CHECK_HR(MFCreateMemoryBuffer(StreamInfo.cbSize, &pBuffer), "Failed to create memory buffer.\n");
        if (pTransform == pEncoderTransform)
            CHECK_HR(pBuffer->SetCurrentLength(StreamInfo.cbSize), "Failed SetCurrentLength.\n");
        CHECK_HR(pOutSample->AddBuffer(pBuffer), "Failed to add sample to buffer.\n");
        outputDataBuffer.dwStreamID = 0;
        outputDataBuffer.dwStatus = 0;
        outputDataBuffer.pEvents = NULL;
        outputDataBuffer.pSample = pOutSample;

        HRESULT hr = pTransform->ProcessOutput(0, 1, &outputDataBuffer, &processOutputStatus);
        if (hr == MF_E_TRANSFORM_NEED_MORE_INPUT)
        {
            SafeRelease(&pBuffer);
            SafeRelease(&pOutSample);
            return hr;
        }

        LONGLONG llVideoTimeStamp, llSampleDuration;
        pSample->GetSampleTime(&llVideoTimeStamp);
        pSample->GetSampleDuration(&llSampleDuration);
        CHECK_HR(outputDataBuffer.pSample->SetSampleTime(llVideoTimeStamp), "Error setting MFT sample time.\n");
        CHECK_HR(outputDataBuffer.pSample->SetSampleDuration(llSampleDuration), "Error setting MFT sample duration.\n");
        if (pTransform == pEncoderTransform)
        {
            IMFMediaBuffer *pMediaBuffer = NULL;
            DWORD dwBufLength;
            CHECK_HR(pOutSample->ConvertToContiguousBuffer(&pMediaBuffer), "ConvertToContiguousBuffer failed.\n");
            CHECK_HR(pMediaBuffer->GetCurrentLength(&dwBufLength), "Get buffer length failed.\n");

            WCHAR *strDebug = new WCHAR[256];
            wsprintf(strDebug, L"Encoded sample ready: time %I64d, sample duration %I64d, sample size %i.\n", llVideoTimeStamp, llSampleDuration, dwBufLength);
            OutputDebugString(strDebug);
            SafeRelease(&pMediaBuffer);
        }
        else if (pTransform == pDecoderTransform)
        {
            IMFMediaBuffer *pMediaBuffer = NULL;
            DWORD dwBufLength;
            CHECK_HR(pOutSample->ConvertToContiguousBuffer(&pMediaBuffer), "ConvertToContiguousBuffer failed.\n");
            CHECK_HR(pMediaBuffer->GetCurrentLength(&dwBufLength), "Get buffer length failed.\n");

            WCHAR *strDebug = new WCHAR[256];
            wsprintf(strDebug, L"Decoded sample ready: time %I64d, sample duration %I64d, sample size %i.\n", llVideoTimeStamp, llSampleDuration, dwBufLength);
            OutputDebugString(strDebug);
            SafeRelease(&pMediaBuffer);
        }

        // Decoded sample out
        *ppSampleOut = pOutSample;

        //SafeRelease(&pMediaBuffer);
        SafeRelease(&pBuffer);

        return S_OK;
    }

done:
    SafeRelease(&pBuffer);
    SafeRelease(&pOutSample);
    return S_FALSE;
}

I've been searching for a solution for quite a while now. I found one question that describes an issue very similar to mine, but it concerned a different API, so it was no help to me: "FFMPEG decoding artifacts between keyframes".

Best Regards, Toni Riikonen

Answer 1:

I'm a little late to the game here but I can confirm that the answer from Homepage is the correct solution. I also encountered this same problem but I was only using the decoder portion of this sample code. I was reading in an MP4 file and saw the increasing artifacts between key frames. As soon as I received a key frame, the image looked good and then progressively got worse. Here is the code I added within the Codec::InitializeDecoder():

// Set CODECAPI_AVLowLatencyMode
ICodecAPI *mpCodecAPI = NULL;
hr = pDecoderTransform->QueryInterface(IID_PPV_ARGS(&mpCodecAPI));
CHECK_HR(hr, "Failed to get ICodecAPI.\n");

VARIANT var;
var.vt = VT_BOOL;
var.boolVal = VARIANT_TRUE;
hr = mpCodecAPI->SetValue(&CODECAPI_AVLowLatencyMode, &var);
CHECK_HR(hr, "Failed to enable low latency mode.\n");

After I added these changes the program worked much better! Thanks to this software on GitHub for giving me the necessary code: https://github.com/GameTechDev/ChatHeads/blob/master/VideoStreaming/EncodeTransform.cpp



Answer 2:

It sounds like a quality/bitrate problem.

pMFTOutputMediaType->SetUINT32(MF_MT_AVG_BITRATE, 500000);  

500 kbps is too low a bitrate for 640x480 video at 30 fps. Try something larger, such as 5, 10 or 20 Mbps.
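To put that number in perspective, here is a small back-of-the-envelope helper. `BitsPerPixel` is a hypothetical name used only for illustration, not a Media Foundation call; it divides the configured bitrate by the number of pixels delivered per second, using the 640x480 at 30 fps values from the question.

```cpp
#include <cassert>
#include <cstdint>

// Average number of compressed bits available per pixel at a given bitrate.
// Hypothetical helper for illustration only.
double BitsPerPixel(std::uint32_t bitrateBps, std::uint32_t width,
                    std::uint32_t height, std::uint32_t fps)
{
    assert(width > 0 && height > 0 && fps > 0);
    const double pixelsPerSecond = static_cast<double>(width) * height * fps;
    return bitrateBps / pixelsPerSecond;
}
```

With the settings from the question, `BitsPerPixel(500000, 640, 480, 30)` comes out at roughly 0.054 bits per pixel, while 5 Mbps gives roughly 0.54. How many bits per pixel are "enough" depends on the content and the encoder, so treat this only as a rough way to compare settings.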

I can suggest:

  1. Since you are creating the H264 encoder yourself, you can query it for ICodecAPI and try different settings. Namely, CODECAPI_AVEncCommonRateControlMode, CODECAPI_AVEncCommonQuality, CODECAPI_AVEncAdaptiveMode, CODECAPI_AVEncCommonQualityVsSpeed, CODECAPI_AVEncVideoEncodeQP.

  2. You might also try creating a hardware H264 encoder and use IMFDXGIDeviceManager with it (Windows 8 and above?)



Answer 3:

This question already seems to have an answer, but I still want to share my experience in the hope that it helps anyone who runs into a similar issue.

I also ran into similar artifacts while decoding H.264. In my case, however, the stream came from a video capture device, and the artifacts did not disappear even 30-60 seconds after the start of the stream.

My guess is that the decoder, with its default settings, cannot cope with a live stream because of the low-latency requirement. So I tried enabling CODECAPI_AVLowLatencyMode, which puts the decoder/encoder into low-latency mode for real-time communications or live capture. (For more details, see the following Microsoft page: https://msdn.microsoft.com/zh-tw/library/windows/desktop/hh447590(v=vs.85).aspx ) Fortunately, that solved the problem and the decoder now works normally.

Although our problems differ slightly, you can try enabling or disabling CODECAPI_AVLowLatencyMode in your case, and I hope it works for you too.
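To illustrate why latency matters to code that calls the decoder once per input sample, here is a toy model in portable C++. It is not Media Foundation code: `BufferingDecoder` and `FeedAndDrain` are hypothetical names, and the fixed `delay` is only an assumption standing in for whatever internal buffering a real decoder may do when low-latency mode is off.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <vector>

// Toy stand-in for a decoder that holds back `delay` frames before it
// emits anything. Hypothetical class, not a Media Foundation interface.
class BufferingDecoder {
public:
    explicit BufferingDecoder(std::size_t delay) : delay_(delay) {}

    void ProcessInput(int frame) { pending_.push_back(frame); }

    // Returns false when more input is needed, loosely analogous to
    // ProcessOutput returning MF_E_TRANSFORM_NEED_MORE_INPUT.
    bool ProcessOutput(int* frame) {
        assert(frame != nullptr);
        if (pending_.size() <= delay_) return false;
        *frame = pending_.front();
        pending_.pop_front();
        return true;
    }

private:
    std::size_t delay_;
    std::deque<int> pending_;
};

// Feed one input, then collect every output that is ready.
std::vector<int> FeedAndDrain(BufferingDecoder& dec, int frame) {
    dec.ProcessInput(frame);
    std::vector<int> ready;
    int out = 0;
    while (dec.ProcessOutput(&out)) ready.push_back(out);
    return ready;
}
```

In this model, a decoder with `delay` 2 returns nothing for the first two inputs, and the frame it returns for the third input is the first one that went in. Code like the question's `TransformSample`, which copies the timestamp of the current input onto whatever comes out, would then stamp the output with the wrong time. With `delay` 0, which is how low-latency mode is pictured here, every input yields its own frame immediately.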



Answer 4:

It sounds like an I/P/(B) frame ordering problem.

The order in which frames are encoded is not necessarily the order in which they are displayed. I have not tested your code, but I think the encoder delivers frames in encoded order and that you need to reorder them before rendering.
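If reordering does turn out to be needed, one common approach is a small buffer that always releases the frame with the lowest presentation timestamp. The sketch below is portable C++ with hypothetical names (`Frame`, `ReorderBuffer`); the buffer depth is an assumption you would have to match to the stream, and whether the Media Foundation decoder already reorders frames internally is not established here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <queue>
#include <vector>

struct Frame {
    std::int64_t pts;  // presentation timestamp
    int id;            // stand-in for the decoded image
};

// Holds up to `depth` frames and releases the one with the smallest pts.
// Hypothetical helper for illustration only.
class ReorderBuffer {
public:
    explicit ReorderBuffer(std::size_t depth) : depth_(depth) {}

    // Adds a frame. Returns true and fills `out` once more than
    // `depth` frames are buffered.
    bool Push(const Frame& in, Frame* out) {
        assert(out != nullptr);
        heap_.push(in);
        if (heap_.size() <= depth_) return false;
        *out = heap_.top();
        heap_.pop();
        return true;
    }

    // Releases the remaining frames in pts order at end of stream.
    bool Flush(Frame* out) {
        assert(out != nullptr);
        if (heap_.empty()) return false;
        *out = heap_.top();
        heap_.pop();
        return true;
    }

private:
    // Orders the priority queue so that the smallest pts is on top.
    struct LaterPts {
        bool operator()(const Frame& a, const Frame& b) const {
            return a.pts > b.pts;
        }
    };
    std::size_t depth_;
    std::priority_queue<Frame, std::vector<Frame>, LaterPts> heap_;
};
```

For example, with a depth of 2 and frames arriving with timestamps 0, 30, 10, 20 (an I, P, B, B pattern in decode order), the buffer releases them as 0, 10, 20, 30. The cost is added latency of `depth` frames, which matters for a live webcam stream.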


