Question
I want to use the SpeechRecognition API with an audio file (MP3, WAV, etc.). Is that possible?
Answer 1:
The short answer is no.
The Web Speech API specification does not prohibit this (the browser could allow the end user to choose a file to use as input), but the audio input stream is never provided to the calling JavaScript code (in the current draft version), so you have no way to read or change the audio that is fed to the speech recognition service.
The specification was designed so that JavaScript code only has access to the result text coming from the speech recognition service.
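To make that concrete, here is a minimal sketch of what a page can read from a recognition result event: transcripts, confidence scores, and a finality flag, but no audio samples. The helper name summarizeResultEvent and the stand-in event object are hypothetical, used only so the snippet can run outside a browser; the property names (resultIndex, results, isFinal, transcript, confidence) are the ones the Web Speech API exposes.

```javascript
// Hypothetical helper: lists everything a `result` event exposes to the page.
// Note that there is no audio data anywhere in the event, only text and metadata.
function summarizeResultEvent(event) {
  const summary = [];
  // resultIndex points at the first result that changed in this event.
  for (let i = event.resultIndex; i < event.results.length; i++) {
    const result = event.results[i];
    const best = result[0]; // first (most likely) alternative
    summary.push({
      transcript: best.transcript,
      confidence: best.confidence,
      isFinal: result.isFinal,
    });
  }
  return summary;
}

// Stand-in for a browser event (assumption: same shape as SpeechRecognitionEvent).
const fakeEvent = {
  resultIndex: 0,
  results: [
    Object.assign([{ transcript: "hello world", confidence: 0.9 }], { isFinal: true }),
  ],
};
console.log(summarizeResultEvent(fakeEvent));
```

In a browser, the same function could be called from recognition.onresult.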
Answer 2:
Basically, you can use it only with the default audioinput device, which is chosen at the OS level.
Therefore, you just need to play your file into your default audioinput.
There are 2 possible options:
1
- Install https://www.vb-audio.com/Cable/
- Update system settings to use the VCable device as the default audiooutput and audioinput
- Play your file with any audio player you have
- Recognize it, e.g. using the standard demo UI https://www.google.com/intl/fr/chrome/demos/speech.html
Tested this today, and it works perfectly :-)
2
I have NOT tested this, so I cannot confirm that it works, but you may be able to feed an audio file into Chrome using Selenium, like this:
import org.openqa.selenium.chrome.ChromeDriver;
import org.openqa.selenium.chrome.ChromeOptions;

ChromeOptions options = new ChromeOptions();
options.addArguments(
    "--allow-file-access-from-files",
    "--use-fake-ui-for-media-stream",      // skip the microphone permission prompt
    "--allow-file-access",
    "--use-file-for-fake-audio-capture=D:\\PATH\\TO\\WAV\\xxx.wav", // WAV file used as the fake microphone
    "--use-fake-device-for-media-stream"); // replace real capture devices with fake ones
// Pass the options to the driver directly; no DesiredCapabilities wrapper is needed.
ChromeDriver driver = new ChromeDriver(options);
But I'm not sure whether this stream will replace the default audioinput.
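For anyone driving Chrome from JavaScript instead of Java, the media-related flags can be assembled as in the sketch below. The function name buildFakeAudioArgs and its loop option are hypothetical. The three switches are Chromium command-line flags; the %noloop suffix is, as far as I know, how Chromium disables looping of the fake capture file, so treat that part as an assumption to verify. Like the Java version, this is untested against the speech recognition service itself.

```javascript
// Hypothetical helper: builds the Chrome flags that serve a WAV file as the fake microphone.
// Assumption: appending "%noloop" to the path stops Chromium from looping the file.
function buildFakeAudioArgs(wavPath, { loop = true } = {}) {
  const source = loop ? wavPath : `${wavPath}%noloop`;
  return [
    "--use-fake-ui-for-media-stream",     // auto-accept the permission prompt
    "--use-fake-device-for-media-stream", // swap real capture devices for fake ones
    `--use-file-for-fake-audio-capture=${source}`,
  ];
}

// Same placeholder path as in the Java snippet above.
console.log(buildFakeAudioArgs("D:\\PATH\\TO\\WAV\\xxx.wav", { loop: false }));
```

With the selenium-webdriver package for Node, the returned array could be spread into addArguments(...) on a chrome.Options instance.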
Answer 3:
According to MDN, you CAN'T do that. You can't feed any stream into the recognition service.
That's a big problem... You can't even select the microphone used by SpeechRecognition.
That is done on purpose; Google wants to sell their CLOUD SPEECH API.
You need to use a service like the CLOUD SPEECH API.
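As a rough idea of what that route looks like, the sketch below builds the JSON body for a synchronous recognize request to Google Cloud Speech-to-Text (v1 REST API). The helper name buildRecognizeRequest is hypothetical, and the defaults (LINEAR16, 16000 Hz, en-US) are assumptions that must match your actual audio. Sending the request also needs credentials, which are left out here.

```javascript
// Hypothetical helper: builds the body for POST https://speech.googleapis.com/v1/speech:recognize
// Assumed defaults: 16-bit linear PCM at 16 kHz, US English. Adjust them to your file.
function buildRecognizeRequest(audioBytes, options = {}) {
  const {
    encoding = "LINEAR16",
    sampleRateHertz = 16000,
    languageCode = "en-US",
  } = options;
  return {
    config: { encoding, sampleRateHertz, languageCode },
    // The REST API expects the audio content as a base64 string.
    audio: { content: Buffer.from(audioBytes).toString("base64") },
  };
}

// Three dummy bytes stand in for real audio data.
console.log(JSON.stringify(buildRecognizeRequest([1, 2, 3])));
```

In Node, audioBytes would typically come from reading the audio file into a Buffer.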
Answer 4:
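Yes, it is possible to get the text transcript of the playback of an audio file using webkitSpeechRecognition. The recognizer still listens on the default audio input, so the playback has to reach it (through the speakers or a loopback device), and the quality of the transcript depends on the quality of the audio playback.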
const recognition = new webkitSpeechRecognition();
const audio = new Audio();
recognition.continuous = true;
recognition.interimResults = true;
recognition.onresult = function(event) {
  if (event.results[0].isFinal) {
    // do stuff with `event.results[0][0].transcript`
    console.log(event.results[0][0].transcript);
    recognition.stop();
  }
};
recognition.onaudiostart = e => {
  console.log("audio capture started");
};
recognition.onaudioend = e => {
  console.log("audio capture ended");
};
audio.oncanplay = () => {
  recognition.start();
  audio.play();
};
audio.src = "/path/to/audio";
jsfiddle https://jsfiddle.net/guest271314/guvn1yq6/
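The snippet above stops at the first final result, which may cut off a longer file. One possible variation, sketched below, keeps listening until playback ends and then returns all final transcripts joined together. The function transcribePlayback is hypothetical, not part of any API; it takes the recognition and audio objects as parameters so it can be exercised with stand-ins. If the recognizer ends on its own before playback finishes (for example after a long silence), the promise resolves early with whatever was collected.

```javascript
// Hypothetical helper: resolves with all final transcripts once recognition has ended.
// `recognition` should behave like a webkitSpeechRecognition instance,
// `audio` like an HTMLAudioElement.
function transcribePlayback(recognition, audio) {
  return new Promise((resolve, reject) => {
    const finals = [];
    let started = false;

    recognition.continuous = true;      // keep listening across pauses
    recognition.interimResults = false; // only final results are collected

    recognition.onresult = (event) => {
      // Only inspect results added or changed by this event.
      for (let i = event.resultIndex; i < event.results.length; i++) {
        if (event.results[i].isFinal) {
          finals.push(event.results[i][0].transcript.trim());
        }
      }
    };
    recognition.onerror = (event) => reject(new Error(event.error));
    recognition.onend = () => resolve(finals.join(" "));

    // Stop listening when the file has finished playing.
    audio.onended = () => recognition.stop();
    audio.oncanplay = () => {
      if (started) return; // canplay can fire more than once
      started = true;
      recognition.start();
      // play() returns a promise in browsers; surface a blocked playback as a rejection.
      Promise.resolve(audio.play()).catch(reject);
    };
  });
}
```

In a browser, call it before setting the source, e.g. create the Audio element, call transcribePlayback(new webkitSpeechRecognition(), audio).then(console.log), and only then assign audio.src.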
Source: https://stackoverflow.com/questions/46267816/is-there-a-way-to-use-the-javascript-speechrecognition-api-with-an-audio-file