How to properly use a hardware accelerated Media Foundation Source Reader to decode a video?

暗喜 2021-02-09 15:41

I'm in the process of writing a hardware-accelerated H.264 decoder using Media Foundation's Source Reader, but have encountered a problem. I followed this tutorial and supporte

2 Answers
  • 2021-02-09 16:16

    The output types of the H.264 video decoder are listed here: https://msdn.microsoft.com/en-us/library/windows/desktop/dd797815(v=vs.85).aspx. RGB32 is not one of them. In this case your app relies on the Video Processor MFT to convert from one of MFVideoFormat_I420, MFVideoFormat_IYUV, MFVideoFormat_NV12, MFVideoFormat_YUY2, or MFVideoFormat_YV12 to RGB32. I suspect that it is the Video Processor MFT that behaves strangely and causes your program to misbehave. If you set NV12 as the output subtype for the decoder, the Video Processor MFT is no longer needed, and the following lines of code become unnecessary as well:

    handle_result(attributes->SetUINT32(MF_SOURCE_READER_ENABLE_ADVANCED_VIDEO_PROCESSING, TRUE));
    

    and

    handle_result(attributes->SetUINT32(MF_SOURCE_READER_ENABLE_VIDEO_PROCESSING, TRUE));
    

    Moreover, as you noticed, NV12 is the only format that works properly. I think the reason is that it is the only one used in the hardware-accelerated scenarios by the D3D and DXGI device manager.
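
    Once the decoder outputs NV12 directly, the application has to interpret that layout itself. Below is a minimal, portable sketch of how an NV12 frame is laid out and how one pixel could be converted to RGB on the CPU. It assumes a tightly packed frame (stride equal to width, even dimensions, no row padding), which a real decoder buffer may not satisfy, and it uses the common BT.601 limited-range integer approximation. The names `Rgb`, `nv12_size` and `nv12_pixel_to_rgb` are hypothetical and not part of Media Foundation.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helpers for illustration only; not Media Foundation APIs.
struct Rgb { std::uint8_t r, g, b; };

// NV12 = full-resolution Y plane followed by one half-resolution plane of
// interleaved U,V byte pairs. Assumes no stride padding (an assumption).
std::size_t nv12_size(std::size_t width, std::size_t height) {
    return width * height + width * height / 2;
}

// Clamp an intermediate result into the 0..255 range of one colour channel.
static std::uint8_t clamp8(int v) {
    return static_cast<std::uint8_t>(std::min(255, std::max(0, v)));
}

// Convert the pixel at (x, y) using BT.601 limited-range integer math.
Rgb nv12_pixel_to_rgb(const std::uint8_t* frame, std::size_t width,
                      std::size_t height, std::size_t x, std::size_t y) {
    const std::uint8_t* y_plane = frame;
    const std::uint8_t* uv_plane = frame + width * height;
    // Each 2x2 block of pixels shares one U,V pair; a UV row is `width` bytes.
    const std::uint8_t* uv = uv_plane + (y / 2) * width + (x / 2) * 2;

    const int c = y_plane[y * width + x] - 16;  // luma offset
    const int d = uv[0] - 128;                  // U (Cb)
    const int e = uv[1] - 128;                  // V (Cr)

    return Rgb{
        clamp8((298 * c + 409 * e + 128) / 256),
        clamp8((298 * c - 100 * d - 208 * e + 128) / 256),
        clamp8((298 * c + 516 * d + 128) / 256),
    };
}
```

    For a 1920x1080 frame under these assumptions, `nv12_size` gives 3110400 bytes. A per-pixel CPU conversion like this is only meant for inspecting the data, not for a fast rendering path.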

  • 2021-02-09 16:22

    Your code is conceptually correct, with one remark that is not quite obvious: the Media Foundation decoder is multithreaded, and you are feeding it a single-threaded Direct3D device. You have to work around this, or you get what you are currently getting: access violations and freezes, that is, undefined behavior.

        // NOTE: No single threading
        handle_result(D3D11CreateDevice(nullptr, D3D_DRIVER_TYPE_HARDWARE, nullptr, 
            (0 * D3D11_CREATE_DEVICE_SINGLETHREADED) | D3D11_CREATE_DEVICE_VIDEO_SUPPORT,
            levels, ARRAYSIZE(levels), D3D11_SDK_VERSION, &device, nullptr, nullptr));
    
        // NOTE: Getting ready for multi-threaded operation
        // (CComQIPtr comes from <atlbase.h>; ID3D11Multithread is declared in <d3d11_4.h>)
        const CComQIPtr<ID3D11Multithread> pMultithread = device;
        pMultithread->SetMultithreadProtected(TRUE);
    

    Also note that this straightforward code sample has a performance bottleneck around the lines you added to get a contiguous buffer. Apparently you do this to access the data. By design, however, the decoded data is already in video memory, and transferring it to system memory is an expensive operation. That is, you added a severe performance hit to the loop. The copy is useful for checking the validity of the data, but for performance benchmarking you should comment it out.
