Question
I am using the following JavaScript to record audio and send it to a websocket server:
const recordAudio = () =>
  new Promise(async resolve => {
    const constraints = {
      audio: {
        sampleSize: 16,
        channelCount: 1,
        sampleRate: 8000
      },
      video: false
    };
    var mediaRecorder;
    const stream = await navigator.mediaDevices.getUserMedia(constraints);
    var options = {
      audioBitsPerSecond: 128000,
      mimeType: 'audio/webm;codecs=pcm'
    };
    mediaRecorder = new MediaRecorder(stream, options);
    var track = stream.getAudioTracks()[0];
    var constraints2 = track.getConstraints();
    var settings = track.getSettings();
    const audioChunks = [];
    mediaRecorder.addEventListener("dataavailable", event => {
      audioChunks.push(event.data);
      webSocket.send(event.data);
    });
    const start = () => mediaRecorder.start(30);
    const stop = () =>
      new Promise(resolve => {
        mediaRecorder.addEventListener("stop", () => {
          const audioBlob = new Blob(audioChunks);
          const audioUrl = URL.createObjectURL(audioBlob);
          const audio = new Audio(audioUrl);
          const play = () => audio.play();
          resolve({
            audioBlob,
            audioUrl,
            play
          });
        });
        mediaRecorder.stop();
      });
    resolve({
      start,
      stop
    });
  });
This is for realtime STT, and the websocket server refuses to send any response. I checked by debugging that the sampleRate is not changing to 8 kHz. Upon researching, I found out that this is a known bug on both Chrome and Firefox. I found some other resources like stackoverflow1 and IBM_STT, but I have no idea how to adapt them to my code. Those resources refer to a buffer, but all I have in my code is a MediaStream (stream) and event.data (a Blob). I am new to both JavaScript and the Audio API, so please pardon me if I did something wrong.
If it helps, I have equivalent Python code that sends data from the mic to the websocket server, and it works. Library used: PyAudio. Code:
p = pyaudio.PyAudio()
stream = p.open(format=pyaudio.paInt16,  # the constant, not the string "pyaudio.paInt16"
                channels=1,
                rate=8000,
                input=True,
                frames_per_buffer=10)
print("* recording, please speak")

packet_size = int((30 / 1000) * 8000)  # 30 ms at 8 kHz = 240 samples (480 bytes)

frames = []
# while True:
for i in range(0, 1000):
    packet = stream.read(packet_size)
    ws.send(packet, binary=True)
Answer 1:
To do realtime downsampling, follow these steps:
First, get a stream instance:
const stream = await navigator.mediaDevices.getUserMedia(constraints);
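The snippets below assume an AudioContext already exists; the answer never shows its creation, so here is a minimal sketch (the prefixed fallback is for older browsers):

```javascript
// Pick the constructor, falling back to the vendor-prefixed name
// that older Chrome/Safari versions exposed.
const AudioContextClass = window.AudioContext || window.webkitAudioContext;
const audioContext = new AudioContextClass();
```

Note that the context runs at the hardware sample rate (typically 44.1 or 48 kHz), which is exactly why the manual downsampling below is needed.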
Create a media stream source from this stream.
var input = audioContext.createMediaStreamSource(stream);
Create a ScriptProcessorNode so that you can work with raw sample buffers. Here I create a script processor that continuously takes 4096 samples at a time from the stream and has 1 input channel and 1 output channel.
var scriptNode = audioContext.createScriptProcessor(4096, 1, 1);
Connect your input to the scriptNode. You can connect the scriptNode to the destination as your requirements dictate.
input.connect(scriptNode); scriptNode.connect(audioContext.destination);
The scriptProcessor has an onaudioprocess handler where you can do whatever you want with each batch of 4096 samples. downsample will return (1 / sampling ratio) as many samples, and floatTo16BitPCM will convert them to the required format, since the original data is in 32-bit float format.
scriptNode.onaudioprocess = function (audioProcessingEvent) {
  var inputBuffer = audioProcessingEvent.inputBuffer;
  // The output buffer contains the samples that will be modified and played
  var outputBuffer = audioProcessingEvent.outputBuffer;
  // Loop through the output channels (in this case there is only one)
  for (var channel = 0; channel < outputBuffer.numberOfChannels; channel++) {
    var inputData = inputBuffer.getChannelData(channel);
    var outputData = outputBuffer.getChannelData(channel);
    var downsampled = downsample(inputData);
    var sixteenBitBuffer = floatTo16BitPCM(downsampled);
  }
};
Your sixteenBitBuffer will contain the data you require.
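To stream it to the server the way the question's Python client does, the buffer can go straight onto the socket; a sketch, assuming webSocket is an already-open WebSocket:

```javascript
// Inside onaudioprocess, after the 16-bit conversion.
// WebSocket.send accepts an ArrayBuffer directly; the server then
// receives raw little-endian Int16 samples at the target rate.
if (webSocket.readyState === WebSocket.OPEN) {
  webSocket.send(sixteenBitBuffer);
}
```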
The downsample and floatTo16BitPCM functions are explained in this link to the Watson API: IBM Watson Speech to Text Api
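The Watson code isn't reproduced here, but a typical implementation of the two helpers looks like the sketch below. The explicit sample-rate parameters are my assumption (the answer's one-argument downsample(inputData) would capture them from an enclosing scope instead):

```javascript
// Downsample a Float32Array by averaging the source samples that fall
// into each output slot (a simple anti-aliasing measure).
function downsample(buffer, inputSampleRate, targetSampleRate) {
  if (targetSampleRate === inputSampleRate) return buffer;
  const ratio = inputSampleRate / targetSampleRate;
  const result = new Float32Array(Math.round(buffer.length / ratio));
  let offsetBuffer = 0;
  for (let offsetResult = 0; offsetResult < result.length; offsetResult++) {
    const nextOffsetBuffer = Math.round((offsetResult + 1) * ratio);
    let accum = 0, count = 0;
    for (let i = offsetBuffer; i < nextOffsetBuffer && i < buffer.length; i++) {
      accum += buffer[i];
      count++;
    }
    result[offsetResult] = count > 0 ? accum / count : 0;
    offsetBuffer = nextOffsetBuffer;
  }
  return result;
}

// Convert 32-bit float samples in [-1, 1] to 16-bit signed PCM,
// little-endian, returned as an ArrayBuffer ready for webSocket.send.
function floatTo16BitPCM(input) {
  const view = new DataView(new ArrayBuffer(input.length * 2));
  for (let i = 0; i < input.length; i++) {
    const s = Math.max(-1, Math.min(1, input[i])); // clamp to valid range
    view.setInt16(i * 2, s < 0 ? s * 0x8000 : s * 0x7FFF, true);
  }
  return view.buffer;
}
```

For example, downsample(inputData, audioContext.sampleRate, 8000) followed by floatTo16BitPCM gives the 8 kHz / 16-bit mono format the question's server expects.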
You won't need the MediaRecorder instance at all. The Watson API is open source, and you can look at how they implemented this for their use case to find a more streamlined approach. You should be able to salvage the important functions from their code.
Source: https://stackoverflow.com/questions/52817966/how-to-downsample-audio-recorded-from-mic-realtime-in-javascript