Is there a way to use the JavaScript SpeechRecognition API with an audio file?

Submitted by 给你一囗甜甜゛ on 2020-08-17 19:45:26

Question


I want to use the SpeechRecognition API with an audio file (MP3, WAV, etc.). Is that possible?


Answer 1:


The short answer is no.

The Web Speech API specification does not prohibit this (a browser could let the end user choose a file to use as input). However, in the current draft the audio input stream is never exposed to the calling JavaScript code, so you have no way to read or replace the audio that is sent to the speech recognition service.

The specification was designed so that JavaScript code only has access to the result text coming back from the speech recognition service.
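
To make that concrete, here is a minimal sketch of the only data a page gets to see: the result list delivered with the `result` event. The helper name `finalTranscripts` is made up for this illustration and is not part of the API; it only assumes the `isFinal` flag and the `transcript` field on each alternative.

```javascript
// Collect the transcripts of all final results from a
// SpeechRecognitionResultList-like object (anything indexable with a length).
// `finalTranscripts` is a hypothetical helper name, not part of the Web Speech API.
function finalTranscripts(results) {
  const transcripts = [];
  for (let i = 0; i < results.length; i++) {
    const result = results[i];
    // Each result holds one or more alternatives; index 0 is the top-ranked one.
    if (result.isFinal) {
      transcripts.push(result[0].transcript);
    }
  }
  return transcripts;
}

// Browser usage (text is all you receive; there is no handle on the audio):
// recognition.onresult = (event) => console.log(finalTranscripts(event.results));
```

Nothing in the event gives access to the captured audio, which is why a file cannot simply be substituted from script.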




Answer 2:


Basically, you can only use it with the default audio input device, which is chosen at the OS level.

Therefore you just need to play your file into your default audio input.

Two options are possible:

1

  • Install the virtual audio cable from https://www.vb-audio.com/Cable/
  • Update the system settings to use the virtual cable device as both the default audio output and the default audio input
  • Play your file with any audio player you have
  • Recognize it, e.g. with the standard demo UI at https://www.google.com/intl/fr/chrome/demos/speech.html

Tested this today, and it works perfectly :-)
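
Before running the demo, it can help to check that the browser actually sees the virtual device. The sketch below filters a device list such as the one returned by `navigator.mediaDevices.enumerateDevices()`. The helper name `findInputsByLabel` and the `/cable/i` pattern are assumptions made for this example; check the label your driver really reports.

```javascript
// Return the audio input devices whose label matches a pattern.
// `findInputsByLabel` is a hypothetical helper; `devices` is expected to look
// like the MediaDeviceInfo list from navigator.mediaDevices.enumerateDevices().
function findInputsByLabel(devices, pattern) {
  return devices.filter(
    (device) => device.kind === "audioinput" && pattern.test(device.label)
  );
}

// Browser usage. Labels are usually empty until the page has been granted
// microphone permission, so an empty match does not prove the device is missing.
// The /cable/i pattern is a guess at how the virtual cable names itself.
// navigator.mediaDevices.enumerateDevices()
//   .then((devices) => console.log(findInputsByLabel(devices, /cable/i)));
```

This only confirms visibility. SpeechRecognition gives no way to pick a device from script, so the virtual cable still has to be the OS default input.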

2

I have not tested this, so I cannot confirm that it works, but you may be able to feed an audio file into Chrome using Selenium, like this:

import org.openqa.selenium.chrome.ChromeDriver;
import org.openqa.selenium.chrome.ChromeOptions;

ChromeOptions options = new ChromeOptions();
// Replace the real capture device with a fake one that plays a WAV file,
// and skip the microphone permission prompt.
options.addArguments("--allow-file-access-from-files",
                     "--use-fake-ui-for-media-stream",
                     "--allow-file-access",
                     "--use-file-for-fake-audio-capture=D:\\PATH\\TO\\WAV\\xxx.wav",
                     "--use-fake-device-for-media-stream");
// ChromeOptions can be passed to ChromeDriver directly; no DesiredCapabilities needed.
ChromeDriver driver = new ChromeDriver(options);

But I'm not sure whether this fake stream replaces the default audio input that SpeechRecognition listens to.




Answer 3:


According to MDN, you can't do that: you can't feed an arbitrary stream into the recognition service.

That's a big problem. You can't even select the microphone used by SpeechRecognition.

In my view that is done on purpose: Google wants to sell its Cloud Speech API.

You need to use a service such as the Cloud Speech API instead.
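
As a rough illustration of that route, the sketch below sends a short audio clip to the Cloud Speech-to-Text v1 `speech:recognize` REST endpoint from Node.js. It rests on assumptions that are not in the answer above: Node 18+ (for the global `fetch`), an API key in `apiKey`, and 16 kHz 16-bit PCM audio. The function names `buildRecognizeBody` and `recognizeFile` are made up for this example.

```javascript
// Build the JSON body for a `speech:recognize` request.
// Assumption: the audio is 16-bit linear PCM; adjust `encoding` and
// `sampleRateHertz` to match your file.
function buildRecognizeBody(audioBytes, { languageCode = "en-US", sampleRateHertz = 16000 } = {}) {
  return {
    config: { encoding: "LINEAR16", sampleRateHertz, languageCode },
    // The REST API expects the audio inline as a base64 string.
    audio: { content: Buffer.from(audioBytes).toString("base64") },
  };
}

// Send the clip and join the top alternative of each result into one string.
// `apiKey` is a placeholder for your own credential.
async function recognizeFile(audioBytes, apiKey, options) {
  const url =
    "https://speech.googleapis.com/v1/speech:recognize?key=" + encodeURIComponent(apiKey);
  const response = await fetch(url, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(buildRecognizeBody(audioBytes, options)),
  });
  if (!response.ok) {
    throw new Error("Speech API request failed with status " + response.status);
  }
  const data = await response.json();
  return (data.results || [])
    .map((result) => result.alternatives[0].transcript)
    .join(" ");
}

// Usage sketch (the file name is hypothetical):
// const fs = require("fs");
// recognizeFile(fs.readFileSync("clip.wav"), process.env.SPEECH_API_KEY)
//   .then(console.log);
```

Unlike the browser API, this takes the file bytes directly, so no microphone or playback is involved.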




Answer 4:


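Yes, it is possible to get a text transcript of the playback of an audio file using webkitSpeechRecognition. The recognizer still listens to the default audio input, so it transcribes whatever that input picks up while the file plays; the quality of the transcript therefore depends on the quality of the audio playback.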

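// Use the unprefixed constructor where it exists, otherwise the prefixed one
const SpeechRecognitionCtor = window.SpeechRecognition || window.webkitSpeechRecognition;
const recognition = new SpeechRecognitionCtor();

const audio = new Audio();

recognition.continuous = true;
recognition.interimResults = true;
recognition.onresult = function(event) {
  // Stop as soon as the first result has been finalized
  if (event.results[0].isFinal) {
    // do stuff with `event.results[0][0].transcript`
    console.log(event.results[0][0].transcript);
    recognition.stop();
  }
};

recognition.onaudiostart = e => {
  console.log("audio capture started");
};

recognition.onaudioend = e => {
  console.log("audio capture ended");
};

// Start listening and playing together once enough of the file has loaded
audio.oncanplay = () => {
  recognition.start();
  audio.play();
};

// Setting the source last starts the load that triggers `oncanplay`
audio.src = "/path/to/audio";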

jsfiddle https://jsfiddle.net/guest271314/guvn1yq6/



Source: https://stackoverflow.com/questions/46267816/is-there-a-way-to-use-the-javascript-speechrecognition-api-with-an-audio-file
