I'm trying to use TensorFlow.js speech recognition in offline mode. Online mode using the microphone works fine, but for offline mode I can't find any reliable library.
The only requirement when working with offline recognition is to have an input tensor of shape [null, 43, 232, 1].
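To make that requirement concrete, here is a small sketch of the arithmetic behind the shape. The helper names `valuesPerExample` and `examplesIn` are made up for illustration and are not part of TensorFlow.js; the shape is the one quoted above, with `null` standing for the batch dimension.

```javascript
// Shape quoted in the text; null is the batch dimension.
const inputShape = [null, 43, 232, 1];

// Hypothetical helper: number of values that one example must contain.
function valuesPerExample(shape) {
  return shape.slice(1).reduce((product, dim) => product * dim, 1);
}

// Hypothetical helper: how many complete examples fit in `length` values,
// and how many values are left over.
function examplesIn(length, shape) {
  const size = valuesPerExample(shape);
  return { examples: Math.floor(length / size), remainder: length % size };
}

console.log(valuesPerExample(inputShape)); // 9976
```

A non-zero `remainder` means the data cannot be reshaped to the model input as it stands.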
1 - Read the wav file and get the array of data
var spectrogram = require('spectrogram');
// Call the function under the same name it was required as
var spectro = spectrogram(document.getElementById('canvas'), {
audio: {
enable: false
}
});
var audioContext = new AudioContext();
function readWavFile() {
return new Promise(resolve => {
var request = new XMLHttpRequest();
request.open('GET', 'audio.wav', true);
request.responseType = 'arraybuffer';
request.onload = function() {
audioContext.decodeAudioData(request.response, function(buffer) {
resolve(buffer)
});
};
request.send()
})
}
const buffer = await readWavFile()
The same thing can be done without a third-party library. Two options are possible:
Read the file using <input type="file">. In that case, this answer shows how to get the typed array.
Serve the wav file and read it with an HTTP request:
var req = new XMLHttpRequest();
req.open("GET", "file.wav", true);
req.responseType = "arraybuffer";
req.onload = function () {
var arrayBuffer = req.response;
if (arrayBuffer) {
var byteArray = new Float32Array(arrayBuffer);
}
};
req.send(null);
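One caveat with viewing the raw response as a Float32Array: the bytes of a wav file include the header, and the samples are often stored as 16-bit integers rather than 32-bit floats. The sketch below shows one way the samples could be extracted by hand. It assumes a little-endian RIFF/WAVE file with 16-bit PCM data and reads only the first channel; `wavToFloat32` is a hypothetical helper, not a browser or TensorFlow.js API.

```javascript
// Hypothetical helper: decode a 16-bit PCM wav ArrayBuffer into Float32
// samples in [-1, 1). Assumes RIFF/WAVE layout; only the first channel is read.
function wavToFloat32(arrayBuffer) {
  const view = new DataView(arrayBuffer);
  // Read a 4-character chunk tag at the given byte offset.
  const tag = (offset) => String.fromCharCode(
    view.getUint8(offset), view.getUint8(offset + 1),
    view.getUint8(offset + 2), view.getUint8(offset + 3));

  if (tag(0) !== 'RIFF' || tag(8) !== 'WAVE') {
    throw new Error('Not a RIFF/WAVE file');
  }

  let channels = 1;
  let bitsPerSample = 16;
  let offset = 12; // first chunk starts after the 12-byte RIFF header
  while (offset + 8 <= view.byteLength) {
    const id = tag(offset);
    const size = view.getUint32(offset + 4, true);
    const body = offset + 8;
    if (id === 'fmt ') {
      channels = view.getUint16(body + 2, true);
      bitsPerSample = view.getUint16(body + 14, true);
    } else if (id === 'data') {
      if (bitsPerSample !== 16) {
        throw new Error('Only 16-bit PCM is handled in this sketch');
      }
      const end = Math.min(body + size, view.byteLength);
      const frameBytes = 2 * channels;
      const frames = Math.floor((end - body) / frameBytes);
      const samples = new Float32Array(frames);
      for (let i = 0; i < frames; i++) {
        // Scale the signed 16-bit sample of channel 0 to [-1, 1)
        samples[i] = view.getInt16(body + i * frameBytes, true) / 32768;
      }
      return samples;
    }
    offset = body + size + (size % 2); // chunks are padded to even sizes
  }
  throw new Error('No data chunk found');
}
```

With the request above, this would be called as `wavToFloat32(req.response)` inside `onload`. In a browser, `decodeAudioData` covers more formats and is usually the simpler choice.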
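2 - Convert the buffer to a typed array
// decodeAudioData yields an AudioBuffer; getChannelData returns a Float32Array of one channel's samples
const data = buffer.getChannelData(0)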
3 - Convert the array to a tensor using the input shape of the speech recognition model
const x = tf.tensor(
data).reshape([-1, ...recognizer.modelInputShape().slice(1)]);
If the last command fails, the data does not have the shape the model needs. Either slice the tensor to the appropriate shape, or make the recording match the FFT size and the other parameters the model expects.
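As a rough sketch of the slicing option, the data can be cut (or zero-padded) to a whole number of examples before it is turned into a tensor. `fitToShape` is a hypothetical helper written for this answer, and the `pad` option is an assumption about what might be wanted; neither comes from TensorFlow.js.

```javascript
// Hypothetical helper: cut or zero-pad `data` (a Float32Array) so that its
// length is a whole number of examples for the given model input shape.
function fitToShape(data, shape, { pad = false } = {}) {
  // Values per example: product of every dimension except the batch one.
  const size = shape.slice(1).reduce((product, dim) => product * dim, 1);
  const remainder = data.length % size;
  if (remainder === 0) return data; // already fits
  if (!pad) {
    // Drop the incomplete tail (this is a view on the same memory, not a copy).
    return data.subarray(0, data.length - remainder);
  }
  // Float32Array is zero-filled on creation, so the tail is padded with zeros.
  const padded = new Float32Array(data.length + size - remainder);
  padded.set(data);
  return padded;
}
```

The result can then go through the same `tf.tensor(...).reshape(...)` call as in step 3. An empty result means the recording is shorter than one example. Fitting the length only makes the reshape succeed; whether the values are meaningful to the model still depends on the FFT and other parameters mentioned above.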