Webcam MJPG capture streams are unavailable on Windows 10

春和景丽 2021-01-15 04:48

On Windows 10 build 10.1607.14393.10 (aka Anniversary edition) I am unable to get MJPG capture streams anymore. Used to be both MJPG and YUY2 resolutions, now I am getting o

2 answers
  • 2021-01-15 05:12

    As explained by Mike M from Microsoft,

    So yes, MJPEG and H.264 being decoded / filtered out is the result of a set of features we needed to implement, and this behavior was planned, designed, tested, and flighted out to our partners and Windows Insiders around the end of January of this year. We worked with partners to make sure their applications continued to function throughout this change, but we have done a poor job communicating this change out to you guys. We dropped the ball on that front, so I’d like to offer my apologies to you all.

    In the Windows 10 Anniversary Update, MJPG video from a webcam is captured by a new helper service, "Windows Camera Frame Server", which describes itself as "Enables multiple clients to access video frames from camera devices." Mike M mentions the same.

    I, for one, was unable to see multiple clients sharing a camera: a second instance of TopoEdit gave me the typical error "Error starting playback. Hardware MFT failed to start streaming due to lack of hardware resources."

    The MJPG and H264 media types, however, are indeed filtered out: the platform update now claims responsibility for avoiding scenarios where multiple clients access the same camera simultaneously and each one decodes on its own, duplicating the effort.

    One of the main reasons that Windows is decoding MJPEG for your applications is because of performance. With the Anniversary Update to Windows 10, it is now possible for multiple applications to access the camera in ways that weren’t possible before. It was important for us to enable concurrent camera access, so Windows Hello, Microsoft Hololens and other products and features could reliably assume that the camera would be available at any given time, regardless of what other applications may be accessing it. One of the reasons this led to the MJPEG decoding is because we wanted to prevent multiple applications from decoding the same stream at the same time, which would be a duplicated effort and thus an unnecessary performance hit.

    Apparently this "improvement" caught many by surprise.

    UPDATE. It turns out that the new Frame Server behavior can be disabled system-wide by creating the registry value defined below. Once the Media Foundation API sees this value, it takes the original code path and talks to the "hardware" (KS proxy) directly, bypassing Frame Server.

    • Key Name:
      • HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows Media Foundation\Platform (64-bit apps; 32-bit apps in 32-bit OS)
      • HKEY_LOCAL_MACHINE\SOFTWARE\WOW6432Node\Microsoft\Windows Media Foundation\Platform (32-bit apps in 64-bit OS)
    • Value Name: "EnableFrameServerMode" REG_DWORD
    • Value: 0
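
    A minimal sketch of how the value above could be applied to both keys at once: it generates the text of a `.reg` file that can be imported with regedit. The function name `build_reg_file` and the output file name are made up for illustration. Importing into HKEY_LOCAL_MACHINE requires administrator rights, and running applications will likely need a restart before they notice the change (an assumption, not something stated above).

```python
# Sketch: build the text of a .reg file that sets EnableFrameServerMode under
# both registry keys listed in the answer. Names here are illustrative only.

KEYS = [
    # 64-bit apps; 32-bit apps on a 32-bit OS
    r"HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows Media Foundation\Platform",
    # 32-bit apps on a 64-bit OS
    r"HKEY_LOCAL_MACHINE\SOFTWARE\WOW6432Node\Microsoft\Windows Media Foundation\Platform",
]


def build_reg_file(value: int = 0) -> str:
    """Return .reg file text that writes the REG_DWORD under every key in KEYS."""
    lines = ["Windows Registry Editor Version 5.00", ""]
    for key in KEYS:
        lines.append(f"[{key}]")
        # REG_DWORD values are written as eight hexadecimal digits.
        lines.append(f'"EnableFrameServerMode"=dword:{value:08x}')
        lines.append("")
    return "\r\n".join(lines)


if __name__ == "__main__":
    print(build_reg_file(0))
```

    To use it, save the printed text as, say, `disable_frame_server.reg` and import it from an elevated prompt. Deleting the value afterwards should restore the default Frame Server behavior.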
  • 2021-01-15 05:20

    Since this is the answer, let me first state that several workarounds exist, ranging from hacks to expensive development:

    1. (hack) Take the old mfcore.dll from the original Windows 10, delay-link with it, and force-load your local copy. This is a hack: don't try it at home, and don't ship it.
    2. Use the poorly documented ksproxy.ax, or its modern replacement mfkproxy, to implement your own layer that talks to the cameras.
    3. Switch the cameras to WinUSB, use libusb/libuvc (the code is not that fast and not that mature on Windows), and implement your own camera interface.

    Now to the proper design of "frame server":

    We also have a frame server in our zSpace system design that serves decompressed images, compressed camera feeds (four of them, at almost 1 Gpixel per second in total), blob detection information, and 3D pose triangulation results to multiple clients (applications, including remote ones) at the same time. The whole thing, using shared memory and/or sockets, is just a few hundred lines of straight C code. I've implemented it, and it works on Windows and Linux.

    The deficiency of the Microsoft "improvement" lies in ignoring the clients' needs, and I believe it is easy to fix.

    For the sake of argument, let's assume the camera streams a compressed format (it could be MJPG, H.26x, HEVC, or something new and better).

    Let's say there are several possible classes of clients:

    1. A client that streams the raw compressed data over the network to remote hosts (do we want transcoding there?).
    2. A client that stores the raw compressed data in local or remote persistent storage (hard drive, SSD) (do we want transcoding there?).
    3. A client that analyzes the raw compressed data stream (from trivial to complex analysis) but does not need pixels.
    4. A client that actually manipulates the compressed data and passes it upstream. Remember that one can, e.g., crop or rotate a JPEG without complete decompression.
    5. A client that needs the uncompressed data in HSI format.
    6. A client that needs the uncompressed data in RGB format with some filters (e.g. gamma, color profile) applied during decompression.

    Enough? Today they all get NV12 (which actually constitutes an even bigger data loss: the U (Cb) and V (Cr) samples get half the bandwidth).
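
    To put a number on the chroma loss, here is a small worked calculation. It assumes the camera's stream carries 4:2:2 chroma (as YUY2 does, and as MJPG webcam streams often do, though that is not guaranteed) and compares it with NV12, which is 4:2:0. The helper names and the 1280x720 frame size are illustrative.

```python
# Sketch: count samples per 8-bit frame for 4:2:2 (e.g. YUY2) versus 4:2:0 (NV12).
# chroma_w_div / chroma_h_div are the horizontal / vertical chroma subsampling factors.


def samples_per_frame(width, height, chroma_w_div, chroma_h_div):
    """Return (luma samples, samples in EACH of the U and V channels)."""
    luma = width * height
    chroma_each = (width // chroma_w_div) * (height // chroma_h_div)
    return luma, chroma_each


def bytes_per_frame_8bit(width, height, chroma_w_div, chroma_h_div):
    """Total bytes for one frame with 8 bits per sample."""
    luma, chroma_each = samples_per_frame(width, height, chroma_w_div, chroma_h_div)
    return luma + 2 * chroma_each


W, H = 1280, 720
_, chroma_422 = samples_per_frame(W, H, 2, 1)  # 4:2:2, half horizontal resolution
_, chroma_420 = samples_per_frame(W, H, 2, 2)  # 4:2:0, half in both directions

if __name__ == "__main__":
    print("4:2:2 U (or V) samples:", chroma_422)
    print("4:2:0 U (or V) samples:", chroma_420)
    print("chroma kept by NV12:", chroma_420 / chroma_422)
    print("bytes 4:2:2:", bytes_per_frame_8bit(W, H, 2, 1))
    print("bytes NV12 :", bytes_per_frame_8bit(W, H, 2, 2))
```

    Under that assumption, each chroma channel in NV12 keeps half of the samples a 4:2:2 source had, which is the "half bandwidth" referred to above.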

    Now, since Microsoft is implementing a frame server, they have to decompress the data one way or another, and perhaps in multiple ways. For that, the uncompressed data has to land in memory, and it can be (conditionally) kept there in case some client can benefit from using it. The initial media graph design allowed splitters, and anybody with a little coding proficiency can implement a conditional splitter that only pushes data to the pins that have clients (sinks) attached.

    Actually, a correct implementation should take the clients' needs into account (this information is already present and readily available from all the clients by way of media type negotiation and the attributes that control graph behavior). It should then apply decompressors and other filters only when needed, paying close attention to CPU cache locality, and serve the requested data to the appropriate clients from the appropriate memory through the appropriate mechanisms. That would allow all kinds of optimizations across the potential permutations of the aforementioned client mix and beyond.
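
    The idea can be sketched in a few lines. This is a toy model, not Media Foundation code: `FrameServer`, `PASSTHROUGH`, and the fake decoders are hypothetical names, and real media type negotiation is reduced to a format string. It shows a conditional splitter that hands the compressed frame through untouched to clients that ask for it, and decodes at most once per frame for each format that has at least one client attached.

```python
from typing import Callable, Dict, List

Sink = Callable[[bytes], None]
Decoder = Callable[[bytes], bytes]

PASSTHROUGH = "compressed"  # clients asking for this get the camera's bytes as-is


class FrameServer:
    """Toy conditional splitter: decode only formats that someone requested."""

    def __init__(self, decoders: Dict[str, Decoder]):
        self._decoders = decoders
        self._clients: Dict[str, List[Sink]] = {}
        self.decode_calls = 0  # instrumentation for the demo below

    def attach(self, fmt: str, sink: Sink) -> None:
        """Stand-in for media type negotiation: the client names its format."""
        if fmt != PASSTHROUGH and fmt not in self._decoders:
            raise ValueError(f"no decoder for format {fmt!r}")
        self._clients.setdefault(fmt, []).append(sink)

    def push(self, compressed: bytes) -> None:
        """Deliver one camera frame to every attached client."""
        for fmt, sinks in self._clients.items():
            if fmt == PASSTHROUGH:
                data = compressed  # no transcoding for raw-stream clients
            else:
                data = self._decoders[fmt](compressed)  # once per format
                self.decode_calls += 1
            for sink in sinks:
                sink(data)  # all clients of a format share the same result


if __name__ == "__main__":
    # Fake decoders that just tag the payload so the data flow is visible.
    server = FrameServer({"rgb": lambda b: b"rgb:" + b, "nv12": lambda b: b"nv12:" + b})
    recorder, viewer_a, viewer_b = [], [], []
    server.attach(PASSTHROUGH, recorder.append)
    server.attach("rgb", viewer_a.append)
    server.attach("rgb", viewer_b.append)
    server.push(b"jpeg-frame")
    print(recorder, viewer_a, viewer_b, server.decode_calls)
```

    In the demo, the recorder receives the original compressed bytes, the two RGB viewers share a single decode, and the NV12 decoder never runs because nobody asked for NV12.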

    If Microsoft needs help designing and implementing a frame server that satisfies this simple (not to say trivial) set of requirements, all it has to do is ask, instead of breaking a huge class of applications and services.

    I wonder how Microsoft is planning to network-stream HoloLens input? Via NV12? Or via yet another hack?

    "Developers, Developers, Developers..." :(
