I implemented an encoder in 2 ways.
1) based on the SDK Transcoder example, which uses topology and transcoding profile
2) based on IMFSourceReader and IMFSi
I initially ran into the same issue. You don't mention how you configured the output type of the source reader (and the input type of the sink), but I found that if you let the system handle the conversion (by setting the reader's output type to RGB32), performance is horrible and entirely CPU bound. (Error checking omitted for brevity.)
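// ask the reader to decode and convert to RGB32 itself (the slow path)
Microsoft::WRL::ComPtr<IMFMediaType> videoMediaType;
hr = ::MFCreateMediaType(videoMediaType.GetAddressOf());
hr = videoMediaType->SetGUID(MF_MT_MAJOR_TYPE, MFMediaType_Video);
hr = videoMediaType->SetGUID(MF_MT_SUBTYPE, MFVideoFormat_RGB32);
hr = reader->SetCurrentMediaType((DWORD)MF_SOURCE_READER_FIRST_VIDEO_STREAM, nullptr, videoMediaType.Get());
hr = reader->SetStreamSelection((DWORD)MF_SOURCE_READER_FIRST_VIDEO_STREAM, TRUE);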
The documentation agrees, indicating that this configuration is useful for getting a single snapshot from the video. If you instead configure the reader to deliver its native media type, performance is excellent, but you now have to convert the format yourself.
hr = reader->GetNativeMediaType((DWORD)MF_SOURCE_READER_FIRST_VIDEO_STREAM, videoMode->GetIndex(), videoMediaType.ReleaseAndGetAddressOf());
From here, if you are dealing with simple color conversion (like YUY2 or YUV from a webcam), there are a few options. I originally tried writing my own converter and pushing it off to the GPU using HLSL with DirectCompute. This works very well, but in your case the format isn't as trivial.
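To illustrate what a "simple" conversion like this involves, here is a minimal CPU sketch of YUY2 to RGB32. It is not my HLSL converter, just a reference version of the same arithmetic. The function names are hypothetical, and it assumes BT.601 limited-range video, an even frame width, and tightly packed rows with no stride padding.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Clamp an intermediate value to the 0..255 range of one color channel.
static std::uint8_t ClampToByte(int v) {
    return static_cast<std::uint8_t>(std::min(255, std::max(0, v)));
}

// Convert one Y/U/V triple to a B,G,R,X pixel (the in-memory byte order of RGB32).
// Assumes BT.601 limited-range video: Y in 16..235, U and V centered on 128.
static void WritePixel(int y, int u, int v, std::uint8_t* dst) {
    const int c = y - 16;
    const int d = u - 128;
    const int e = v - 128;
    dst[0] = ClampToByte((298 * c + 516 * d + 128) >> 8);            // blue
    dst[1] = ClampToByte((298 * c - 100 * d - 208 * e + 128) >> 8);  // green
    dst[2] = ClampToByte((298 * c + 409 * e + 128) >> 8);            // red
    dst[3] = 0xFF;                                                   // unused byte
}

// Hypothetical helper: convert a tightly packed YUY2 frame to RGB32.
// YUY2 stores two pixels in four bytes (Y0 U Y1 V); both pixels share U and V.
std::vector<std::uint8_t> ConvertYuy2ToRgb32(const std::uint8_t* src,
                                             std::size_t width,
                                             std::size_t height) {
    std::vector<std::uint8_t> dst(width * height * 4);
    const std::size_t pairCount = (width * height) / 2;
    for (std::size_t i = 0; i < pairCount; ++i) {
        const std::uint8_t* in = src + i * 4;
        std::uint8_t* out = dst.data() + i * 8;
        WritePixel(in[0], in[1], in[3], out);      // first pixel of the pair
        WritePixel(in[2], in[1], in[3], out + 4);  // second pixel of the pair
    }
    return dst;
}
```

Every output pixel depends only on its own four input bytes, which is why this kind of conversion maps so naturally onto a GPU compute shader.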
Ultimately, creating and configuring an instance of the color converter (as an IMFTransform) works perfectly.
Microsoft::WRL::ComPtr<IMFTransform> mediaTransform;
hr = ::CoCreateInstance(CLSID_CColorConvertDMO, nullptr, CLSCTX_INPROC_SERVER, __uuidof(IMFTransform), reinterpret_cast<void**>(mediaTransform.GetAddressOf()));
// set the input type of the transform to the NATIVE output type of the reader
hr = mediaTransform->SetInputType(0u, videoMediaType.Get(), 0u);
// the transform's output type (RGB32 here) must also be set with SetOutputType before processing
Create and configure a separate sample and buffer.
IMFSample* transformSample = nullptr;
IMFMediaBuffer* transformBuffer = nullptr;
hr = ::MFCreateSample(&transformSample);
hr = ::MFCreateMemoryBuffer(RGB_MFT_OUTPUT_BUFFER_SIZE, &transformBuffer);
hr = transformSample->AddBuffer(transformBuffer);
// describe the output sample that ProcessOutput will fill
MFT_OUTPUT_DATA_BUFFER* transformDataBuffer = new MFT_OUTPUT_DATA_BUFFER();
transformDataBuffer->pSample = transformSample;
transformDataBuffer->dwStreamID = 0u;
transformDataBuffer->dwStatus = 0u;
transformDataBuffer->pEvents = nullptr;
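RGB_MFT_OUTPUT_BUFFER_SIZE above is just a constant from my code. As a rough sketch of how such a size could be derived, the helper below (a hypothetical name, not part of Media Foundation) assumes uncompressed RGB32 at 4 bytes per pixel with no row padding. In a real application it is probably safer to ask the transform itself through IMFTransform::GetOutputStreamInfo, or to use MFCalculateImageSize.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper: bytes needed for one tightly packed RGB32 frame.
// Assumes 4 bytes per pixel and no row padding (stride == width * 4).
constexpr std::size_t Rgb32FrameSize(std::uint32_t width, std::uint32_t height) {
    return static_cast<std::size_t>(width) * height * 4u;
}

// Compile-time check: a 640x480 frame needs 1,228,800 bytes.
static_assert(Rgb32FrameSize(640, 480) == 1228800, "unexpected RGB32 frame size");
```

A buffer that is too small will make ProcessOutput fail, so if the reader can switch resolutions, size the buffer for the largest frame you expect.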
When receiving samples from the source, hand them off to the transform to be converted.
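// sample is the IMFSample delivered by the source reader
DWORD outStatus = 0u;
IMFMediaBuffer* mediaBuffer = nullptr;
hr = mediaTransform->ProcessInput(0u, sample, 0u);
hr = mediaTransform->ProcessOutput(0u, 1u, transformDataBuffer, &outStatus);
hr = transformDataBuffer->pSample->GetBufferByIndex(0, &mediaBuffer);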
Then, of course, hand off the transformed sample to the sink just as you do today. I am confident that this will work, and you will be a very happy person. For me, CPU utilization went from 20% (original implementation) down to 2% (and I am concurrently displaying the video). Good luck. I hope you enjoy your project.