Question
I'm going to create a voice chat. My backend server runs on Node.js, and almost every connection between client and server uses socket.io.
Are WebSockets appropriate for my use case? I prefer client -> server -> clients communication over P2P because I expect as many as 1000 clients connected to one room.
If WebSockets are OK, which method is best for sending an AudioBuffer to the server and playing it back on the other clients? This is what I do:
navigator.getUserMedia({audio: true}, initializeRecorder, errorCallback);

function initializeRecorder(MediaStream) {
    var audioCtx = new window.AudioContext();
    var sourceNode = audioCtx.createMediaStreamSource(MediaStream);
    // 4096-sample buffer, 1 input channel, 1 output channel
    var recorder = audioCtx.createScriptProcessor(4096, 1, 1);
    recorder.onaudioprocess = recorderProcess;
    sourceNode.connect(recorder);
    recorder.connect(audioCtx.destination);
}

function recorderProcess(e) {
    // Float32Array of raw PCM samples for the first (only) channel
    var left = e.inputBuffer.getChannelData(0);
    io.socket.post('url', left);
}
But after receiving the data on the other clients, I don't know how to play back this audio stream from the buffer arrays.
EDIT
1) Why isn't the onaudioprocess handler fired if I don't connect the ScriptProcessor (the recorder variable) to the destination?
The documentation says: "although you don't have to provide a destination if you, say, just want to visualise some audio data" (Web Audio concepts and usage).
2) Why don't I hear anything from my speakers after connecting the recorder variable to the destination, while I do when I connect the sourceNode variable directly to the destination? This happens even if the onaudioprocess handler doesn't do anything.
Can anyone help?
Answer 1:
I think web sockets are appropriate here. Just make sure that you are using binary transfer. (I use BinaryJS for this myself, allowing me to open up arbitrary streams to the server.)
Getting the data from user media capture is pretty straightforward. What you have is a good start. The tricky part is playback. You will have to buffer the data and play it back using your own script processing node.
This isn't too hard if you use PCM everywhere... the raw samples you get from the Web Audio API. The downside of this is that there is a lot of overhead shoving 32-bit floating point PCM around. This uses a ton of bandwidth which isn't needed for speech alone.
I think the easiest thing to do in your case is to reduce the bit depth to an arbitrary bit depth that works well for your application. 8-bit samples are plenty for discernible speech and will take up quite a bit less bandwidth. By using PCM, you avoid having to implement a codec in JS and then having to deal with the buffering and framing of data for that codec.
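As a rough illustration of the bit-depth reduction described above, here is a minimal sketch of the two conversions. The helper names floatToInt8 and int8ToFloat are made up for this example, and the symmetric scale factor of 127 is an assumption, not something the Web Audio API prescribes.

```javascript
// Hypothetical helpers (names are not from any library).

// Convert Web Audio samples (floats nominally in [-1, 1]) to 8-bit signed PCM.
function floatToInt8(float32Samples) {
  const out = new Int8Array(float32Samples.length);
  for (let i = 0; i < float32Samples.length; i++) {
    // Clamp first, because samples can occasionally fall outside [-1, 1].
    const s = Math.max(-1, Math.min(1, float32Samples[i]));
    // Assumed scale: map [-1, 1] onto [-127, 127].
    out[i] = Math.round(s * 127);
  }
  return out;
}

// Convert 8-bit signed PCM back to floats for playback.
function int8ToFloat(int8Samples) {
  const out = new Float32Array(int8Samples.length);
  for (let i = 0; i < int8Samples.length; i++) {
    out[i] = int8Samples[i] / 127;
  }
  return out;
}
```

With the 4096-sample buffer from the question, each chunk shrinks from 16384 bytes (4 bytes per sample) to 4096 bytes (1 byte per sample).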
To summarize, once you have the raw sample data in a typed array in your script processing node, write something to convert those samples from 32-bit float to 8-bit signed integers. Send these buffers to your server over your binary web socket, in chunks of the same size as they arrive from the script processing node. The server will then send them to all the other clients on their binary web sockets. When a client receives audio data, it buffers it for whatever amount of time you choose, to prevent dropping audio. Your client code converts those 8-bit samples back to 32-bit float and puts them in a playback buffer. Your script processing node picks up whatever is in the buffer and starts playback as data becomes available.
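The client-side buffering step could look something like the sketch below. The PlaybackQueue class, its method names, and the minBufferedSamples threshold are all hypothetical; the division by 127 assumes the sender scaled its samples by 127. It waits until enough audio has arrived, then fills each output buffer from the queue and pads with silence when the data runs out.

```javascript
// Hypothetical playback queue (not part of the Web Audio API).
class PlaybackQueue {
  constructor(minBufferedSamples) {
    this.chunks = [];       // Float32Array chunks waiting to be played
    this.offset = 0;        // read position inside chunks[0]
    this.buffered = 0;      // total unread samples across all chunks
    this.minBuffered = minBufferedSamples; // samples required before playback starts
    this.started = false;
  }

  // Called for every 8-bit chunk received from the server.
  push(int8Samples) {
    const chunk = new Float32Array(int8Samples.length);
    for (let i = 0; i < int8Samples.length; i++) {
      chunk[i] = int8Samples[i] / 127; // assumed scale, must match the sender
    }
    this.chunks.push(chunk);
    this.buffered += chunk.length;
    if (this.buffered >= this.minBuffered) this.started = true;
  }

  // Fills `output` (a Float32Array) with queued audio, or silence if none is ready.
  pull(output) {
    let written = 0;
    if (this.started) {
      while (written < output.length && this.chunks.length > 0) {
        const head = this.chunks[0];
        const n = Math.min(head.length - this.offset, output.length - written);
        output.set(head.subarray(this.offset, this.offset + n), written);
        written += n;
        this.offset += n;
        if (this.offset === head.length) {
          this.chunks.shift();
          this.offset = 0;
        }
      }
      this.buffered -= written;
      // Queue ran dry: wait for minBuffered samples again before resuming.
      if (this.buffered === 0) this.started = false;
    }
    output.fill(0, written); // pad the remainder with silence
  }
}
```

In the browser, the onaudioprocess handler of a playback ScriptProcessor would call something like queue.pull(e.outputBuffer.getChannelData(0)), and the socket's message handler would call queue.push(new Int8Array(data)), assuming the data arrives as an ArrayBuffer. A larger threshold adds latency but tolerates more network jitter.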
Source: https://stackoverflow.com/questions/31995677/audio-streaming-by-websockets