We need some pointers for a hobby web project. At this stage we want to detect the client's sound card and send whatever comes from the sound card to a server to process the audio. And lo
You can use the Web Audio API along with getUserMedia (generally considered part of the WebRTC feature set) to capture audio (and video if you want it) from the user. Here is a code example from the excellent HTML5Rocks.com tutorial on getUserMedia:
window.AudioContext = window.AudioContext ||
                      window.webkitAudioContext;
var context = new AudioContext();
navigator.getUserMedia({audio: true}, function(stream) {
  var microphone = context.createMediaStreamSource(stream);
  var filter = context.createBiquadFilter();
  // microphone -> filter -> destination.
  microphone.connect(filter);
  filter.connect(context.destination);
}, errorCallback);
In this example, we ask the user for access to their "microphone". If they grant it, we create a regular AudioNode using AudioContext::createMediaStreamSource(). Once we have this node, it can be connected to other nodes in the chain, allowing you to do whatever you want with it.
This requires a modern browser. Only Chrome and Firefox currently support both of these features together. (Check the updated compatibility for both Web Audio and getUserMedia, but note that a browser supporting both doesn't guarantee it will work.) Additionally, there are browser quirks you may have to work around. The one that bugs me the most is a Chrome issue where your nodes are garbage collected while they are still in use. Firefox also has problems with the ChannelSplitterNode where not all the channels work. Detecting browser support isn't always straightforward either, as some browsers (notably Chrome on Android) claim to have support and even run your AudioContext, but simply return null samples from the getUserMedia stream. In short, test everywhere, and if you run into an issue don't expect to get much help from the browser developers. It's not an adopted standard yet (but is quite mature).
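Because a browser can claim support and still deliver nothing useful, a support check can be paired with a look at the samples themselves. The following is only a minimal sketch: hasAudioCaptureApis and isSilent are hypothetical helper names, the threshold is an arbitrary assumption, and a muted or very quiet input would also be reported as silent.

```javascript
// Hypothetical helper: reports whether the capture APIs appear to exist on the
// given global object (pass `window` in a browser). Existence alone does not
// prove that they work.
function hasAudioCaptureApis(globalObj) {
  var nav = globalObj.navigator || {};
  var hasContext = !!(globalObj.AudioContext || globalObj.webkitAudioContext);
  var hasGetUserMedia = !!(
    nav.getUserMedia || nav.webkitGetUserMedia || nav.mozGetUserMedia ||
    (nav.mediaDevices && nav.mediaDevices.getUserMedia)
  );
  return hasContext && hasGetUserMedia;
}

// Hypothetical helper: true when every sample is at or below the threshold.
// The default threshold is an assumption, not a value taken from any spec.
function isSilent(samples, threshold) {
  var limit = threshold === undefined ? 1e-7 : threshold;
  for (var i = 0; i < samples.length; i++) {
    if (Math.abs(samples[i]) > limit) return false;
  }
  return true;
}
```

In a browser, isSilent could be fed the Float32Array from event.inputBuffer.getChannelData(0) over several buffers before deciding that capture is not really working.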
You have no control over the latency. You have no way of detecting the actual sound device. All you can determine is if the browser gives you access to the audio channels, and how many of them. I've also found that regardless of what is on the other side, Chrome will only open up a stereo channel. (It will also treat mono devices as stereo, copying the first channel to both L and R channels.) Firefox has similar issues.
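If it matters whether a "stereo" input is really a mono device copied to both channels, one option is to compare the two channels directly. This is a rough sketch under that assumption; looksLikeDuplicatedMono is a hypothetical name, and two silent channels will also compare as identical.

```javascript
// Hypothetical helper: true when the left and right buffers hold exactly the
// same samples, which is what a mono source copied to both channels looks like.
function looksLikeDuplicatedMono(left, right) {
  if (left.length !== right.length) return false;
  for (var i = 0; i < left.length; i++) {
    if (left[i] !== right[i]) return false;
  }
  return true;
}

// Possible use inside an onaudioprocess handler (browser only):
//   var buf = event.inputBuffer;
//   if (buf.numberOfChannels >= 2 &&
//       looksLikeDuplicatedMono(buf.getChannelData(0), buf.getChannelData(1))) {
//     // treat the input as mono and send a single channel
//   }
```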
ASIO support is non-existent at the moment. (I've considered writing a browser extension to support it, but this wouldn't fix latency issues unless someone compiled native extensions into the browsers.) The OS API used by the browser is completely out of your control. In Chrome on Windows, you can force it to use exclusive mode, which has been tested to around 4ms (your mileage will certainly vary). You cannot rely on this though.
Finally, the sample rate is completely out of your control. You can determine it though by checking AudioContext.sampleRate. The sample rate is fixed per session. You can also modify the sample rate by adjusting your system settings for the default sample rates. Make sure to set your input and output rates to the same value, or you won't get any audio output (at least on Chrome).
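Since the rate can differ between machines, one approach (an assumption here, not something the API requires) is to read it once from the context instance and send it to the server ahead of the audio. A minimal sketch, where describeCapture and its field names are purely illustrative:

```javascript
// Hypothetical helper: describes the capture format so the receiving end can
// interpret the raw samples. The field names are illustrative only.
function describeCapture(context, channels) {
  return JSON.stringify({
    sampleRate: context.sampleRate, // read from the AudioContext instance
    channels: channels
  });
}

// Possible use in a browser:
//   var header = describeCapture(context, 1);
//   // send `header` to the server before the first block of samples
```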
You can create a script processor node (ScriptProcessorNode) and then dump its data back to your server over a binary WebSocket. I use BinaryJS for this purpose with Node.js on the backend, as it can multiplex several streams down a single actual WebSocket connection. (I need other streams for control. If you don't, maybe you don't need BinaryJS.)
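The capture side of that idea might look roughly like the sketch below. It uses a plain callback rather than BinaryJS, and streamToServer, its parameters, and the URL in the comments are all hypothetical; only createScriptProcessor, onaudioprocess, and getChannelData are real Web Audio calls.

```javascript
// `context` is an AudioContext, `source` is a node such as the one returned by
// createMediaStreamSource(), and `send` is any function that ships a
// Float32Array to the server. All three are supplied by the caller.
function streamToServer(context, source, send, bufferSize) {
  // One input channel and one output channel; 4096 frames is an assumed default.
  var processor = context.createScriptProcessor(bufferSize || 4096, 1, 1);
  processor.onaudioprocess = function (event) {
    // Copy the samples in case the browser reuses the underlying buffer.
    var samples = new Float32Array(event.inputBuffer.getChannelData(0));
    send(samples);
  };
  source.connect(processor);
  // In some browsers the handler only fires once the node is connected onward.
  processor.connect(context.destination);
  // Return the node so the caller can hold a reference to it.
  return processor;
}

// Possible use in a browser (hypothetical URL):
//   var socket = new WebSocket('wss://example.invalid/audio');
//   socket.binaryType = 'arraybuffer';
//   var node = streamToServer(context, microphone, function (samples) {
//     socket.send(samples.buffer);
//   });
```

Keeping the returned node in a long-lived variable is a cheap precaution against nodes being garbage collected while still in use.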
All Web Audio samples are Float32. I convert these to 16-bit signed integers before sending them over the wire to save bandwidth. If you choose to send the floats over the wire as-is, note that, unfortunately, endianness does matter. Convert carefully.
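One common way to do this conversion is sketched below. It is not necessarily the exact code used here: floatTo16BitPCM is a hypothetical name, and the choice of little-endian output is an assumption. Writing through a DataView makes the byte order explicit instead of depending on the client's CPU.

```javascript
// Converts Float32 samples in the range [-1, 1] to 16-bit signed PCM.
// Returns an ArrayBuffer holding little-endian Int16 values.
function floatTo16BitPCM(samples) {
  var buffer = new ArrayBuffer(samples.length * 2);
  var view = new DataView(buffer);
  for (var i = 0; i < samples.length; i++) {
    // Clamp out-of-range samples so they do not wrap around.
    var s = Math.max(-1, Math.min(1, samples[i]));
    // Scale so that -1 maps to -32768 and 1 maps to 32767.
    var v = Math.round(s < 0 ? s * 0x8000 : s * 0x7FFF);
    view.setInt16(i * 2, v, true); // true = little-endian
  }
  return buffer;
}
```

The server then has to read the data as little-endian 16-bit integers. The payload is half the size of the original Float32 data.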
Hopefully this will save you some research time. The Web Audio API is really solid stuff, but the ground is always shifting. Something that was broken last week could be fixed this week, and vice versa. Stay up to date on browsers and get the beta copies as well for testing.