Matching Kinect Audio with Video

刺人心 2021-02-01 16:46

I have a project dealing with video conferencing using the Kinect (or, more likely, four of them). Right now, my company uses these stupidly expensive cameras for our VTC rooms.

6 Answers
  • The API provided by Microsoft Research doesn't actually provide this capability. A Kinect is essentially several cameras plus a microphone array, and each sensor has its own driver stack, so the API exposes no link back to the physical hardware device. The best way to achieve this is to use the Windows API instead, by way of WMI: take the device IDs you get for the NUI camera and the microphones, and use WMI to find which USB bus each one is attached to (each Kinect sensor has to be on its own bus). Then you'll know which device matches which. This will be an expensive operation, so I would recommend doing it at start-up or when the devices are detected, and keeping the information persisted until you know the hardware configuration has changed or the application is reset. Using WMI through .NET is pretty well documented, but here is one article that specifically talks about USB devices through WMI/.NET: http://www.developerfusion.com/article/84338/making-usb-c-friendly/.
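
As a rough illustration of the matching step only (not the WMI query itself), here is a minimal Python sketch. It assumes the lookup has already produced one record per device with the controller it hangs off; the record layout, the function name `pair_by_controller`, and all IDs are made up for the example.

```python
from collections import defaultdict


def pair_by_controller(devices):
    """Pair cameras and microphones that share a USB controller.

    devices: iterable of (kind, device_id, controller_id) tuples, where kind
    is "camera" or "microphone". Returns {camera_id: microphone_id} for every
    controller that has exactly one camera and one microphone.
    """
    by_controller = defaultdict(lambda: {"camera": [], "microphone": []})
    for kind, device_id, controller_id in devices:
        by_controller[controller_id][kind].append(device_id)

    pairs = {}
    for found in by_controller.values():
        cams, mics = found["camera"], found["microphone"]
        if len(cams) == 1 and len(mics) == 1:
            pairs[cams[0]] = mics[0]
    return pairs


# Hypothetical records, as they might look after the (expensive) lookup.
devices = [
    ("camera", "cam-A", "controller-1"),
    ("microphone", "mic-A", "controller-1"),
    ("camera", "cam-B", "controller-2"),
    ("microphone", "mic-B", "controller-2"),
]

# Compute once at start-up and reuse until the hardware changes.
PAIRS = pair_by_controller(devices)
```

Controllers with an ambiguous device count are skipped rather than guessed at, which is the point of requiring one Kinect per bus.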

  • 2021-02-01 16:52

    Mannimarco,

    the only link I see is that a camera's UniqueDeviceName property equals its 'device instance path'.

    Doing a little research in the Device Manager on my computer, I can tell that the last two numbers at the end of the camera's UniqueDeviceName (0&3, 0&4) are incrementing values (based on controller + port?).

    My suggestion is that you sort your list of cameras by those last digits, and sort your audio devices by their DeviceID property. That way, I suppose, when you iterate over your camera list you can use the corresponding index in the audio device list to match the two together.
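
A minimal Python sketch of that sort-and-pair idea, for illustration only. The name strings are invented, and whether the two sort orders really line up on actual hardware is exactly the untested assumption in this suggestion.

```python
import re


def port_key(unique_device_name):
    """Return the last two '&'-separated numbers of the name as a sortable tuple."""
    match = re.search(r"(\d+)&(\d+)$", unique_device_name)
    if match is None:
        raise ValueError(f"no trailing port numbers in {unique_device_name!r}")
    return int(match.group(1)), int(match.group(2))


def pair_by_order(camera_names, audio_device_ids):
    """Sort both lists and pair them index by index."""
    if len(camera_names) != len(audio_device_ids):
        raise ValueError("camera and audio device counts differ")
    cameras = sorted(camera_names, key=port_key)
    audio = sorted(audio_device_ids)
    return list(zip(cameras, audio))


# Hypothetical values; real instance paths and DeviceIDs will look different.
cameras = ["USB\\HYPOTHETICAL\\0&4", "USB\\HYPOTHETICAL\\0&3"]
audio = ["{hypothetical-id-2}", "{hypothetical-id-1}"]
pairs = pair_by_order(cameras, audio)
```

The trailing numbers are compared as integers so that, for example, port 10 sorts after port 9.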

    Btw, this is my first post so please be gentle if I'm wrong...

  • 2021-02-01 16:54

    I would get the audio stream from all of them and then compare volume levels. Once you have that, you can determine the "object" or person in the Kinect's 3D space that is actually speaking.

    From there you need to determine which cameras this object / person is visible in ...
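
The volume comparison could be sketched like this in Python. It assumes each source delivers a list of PCM samples covering the same time window; the source names and sample values are invented, and capturing the audio is left out entirely.

```python
import math


def rms(samples):
    """Root-mean-square level of a sequence of PCM samples (0.0 for empty input)."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def loudest_source(frames_by_source):
    """Return the name of the source whose samples have the highest RMS level.

    frames_by_source: {source_name: list of samples from the same time window}.
    """
    return max(frames_by_source, key=lambda name: rms(frames_by_source[name]))


# Hypothetical sample windows from two microphone arrays.
frames = {
    "kinect-0": [100, -120, 90],
    "kinect-1": [2000, -1800, 2100],
}
speaker_near = loudest_source(frames)
```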

    Yeah, this is one complex project... Kinect is pretty awesome though... I don't know much about the API, but doesn't it give you distances and such for people?

    good luck with it :)

  • 2021-02-01 16:57

    I have had a look at the SDK documentation and, in all honesty, it is not great. Furthermore, I do not have any Kinect devices to test this on.

    The first thing I would do, though, is create an output list of all useful property values for each device. Then I would look for matches across the two lists that could serve as links, and test each one I find to see whether it does the job.

    So I would have a simple console application to output the following property values:

    For Each AudioDeviceInfo

    • DeviceID = X
    • DeviceIndex = X
    • DeviceName = X

    For Each KinectAudioSource

    • MicrophoneIndex = X

    For Each Runtime

    • InstanceIndex = X

    Then look for any matches in the values. Nothing else in the SDK seems really useful, but there must be some internal logic to the SDK when it returns arrays of AudioDeviceInfo and Runtime.
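
Once the values are dumped, the search for matches could be automated along these lines. This is a Python sketch over plain dictionaries; the property values are invented, and a shared value is only a candidate link that still has to be tested on real hardware.

```python
def shared_values(audio_props, runtime_props):
    """List every property value that an audio device and a runtime have in common.

    Both arguments are lists of {property_name: value} dictionaries.
    Returns tuples of (audio_index, audio_key, runtime_index, runtime_key, value).
    """
    matches = []
    for a_idx, audio in enumerate(audio_props):
        for r_idx, runtime in enumerate(runtime_props):
            for a_key, a_val in audio.items():
                for r_key, r_val in runtime.items():
                    if a_val == r_val:
                        matches.append((a_idx, a_key, r_idx, r_key, a_val))
    return matches


# Hypothetical dumps; the real values come from the console application.
audio_dump = [
    {"DeviceID": "{hypothetical-id-1}", "DeviceIndex": 0},
    {"DeviceID": "{hypothetical-id-2}", "DeviceIndex": 1},
]
runtime_dump = [{"InstanceIndex": 0}, {"InstanceIndex": 1}]
candidates = shared_values(audio_dump, runtime_dump)
```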

    Anyway, I hope you get it right somehow

  • 2021-02-01 16:59

    I would just calibrate the Kinects one by one, writing the unique device identifier pairs (camera ID, microphone ID) to a file. Your application can then read that file at start-up to match microphone instances with camera instances (i.e. build a table that relates one camera instance to one microphone instance). The camera and the microphone inside the Kinect probably each have their own USB interface IC (connected via an internal USB hub), so the two device identifiers are probably completely unrelated and there is technically no way to relate them without prior calibration. You might also want to put labels on the Kinect units and reference those labels in your initialization file.
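
A calibration file like that could be as simple as a JSON list. The Python sketch below is one possible layout, not a prescribed format: the field names (`label`, `camera_id`, `microphone_id`) and all the IDs are assumptions made for the example.

```python
import json

# Hypothetical calibration table: one entry per labelled Kinect unit.
EXAMPLE = [
    {"label": "Kinect-1", "camera_id": "camera-id-1", "microphone_id": "microphone-id-1"},
    {"label": "Kinect-2", "camera_id": "camera-id-2", "microphone_id": "microphone-id-2"},
]


def save_calibration(path, entries):
    """Write the calibration table to a JSON file."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(entries, f, indent=2)


def load_calibration(path):
    """Read the calibration table back at application start-up."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def microphone_for(entries, camera_id):
    """Look up the microphone that was calibrated together with a camera."""
    for entry in entries:
        if entry["camera_id"] == camera_id:
            return entry["microphone_id"]
    raise KeyError(f"camera {camera_id!r} is not in the calibration table")
```

Keeping the physical label in each entry makes it easy to spot which unit needs recalibrating when a lookup fails.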

  • 2021-02-01 17:12

    Sounds interesting, maybe you need some "automatic calibration".

    Maybe with some "remote power switches for each USB connection" (an I/O card connected to the USB power lines). That way you could power on one Kinect after the other automatically, and then you know which microphone belongs to which camera.
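
The power-cycling idea boils down to diffing the device lists before and after each unit comes up. Here is a Python sketch with the hardware replaced by a fake; `power_on`, `list_cameras` and `list_microphones` are hypothetical callbacks you would have to implement yourself, and `power_on` is assumed to block until the newly powered devices have enumerated.

```python
def calibrate_by_power_cycling(ports, power_on, list_cameras, list_microphones):
    """Power ports on one at a time and record which devices newly appear.

    Returns {port: (camera_id, microphone_id)}.
    """
    seen_cams = set(list_cameras())
    seen_mics = set(list_microphones())
    pairs = {}
    for port in ports:
        power_on(port)  # hypothetical: switches the USB power line for this port
        cams = set(list_cameras())
        mics = set(list_microphones())
        new_cams = cams - seen_cams
        new_mics = mics - seen_mics
        if len(new_cams) != 1 or len(new_mics) != 1:
            raise RuntimeError(f"expected one new camera and microphone on port {port!r}")
        pairs[port] = (new_cams.pop(), new_mics.pop())
        seen_cams, seen_mics = cams, mics
    return pairs


class FakeBench:
    """Stand-in for the real hardware so the sketch can run anywhere."""

    def __init__(self, units):
        self.units = units  # {port: (camera_id, microphone_id)}
        self.powered = []

    def power_on(self, port):
        self.powered.append(port)

    def list_cameras(self):
        return [self.units[p][0] for p in self.powered]

    def list_microphones(self):
        return [self.units[p][1] for p in self.powered]


bench = FakeBench({1: ("cam-A", "mic-A"), 2: ("cam-B", "mic-B")})
result = calibrate_by_power_cycling(
    [1, 2], bench.power_on, bench.list_cameras, bench.list_microphones
)
```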

    Or something like that...

    Regards! Stefan
