speech-to-text | 易学教程

Speech to text from wav file

Question: Merged with "Speech to text from wav file Java". Is it possible to input a WAV file to the Java Speech API? Source: https://stackoverflow.com/questions/5382074/speech-to-text-from-wav-file
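Whether the Java Speech API itself accepts file input is not answered in the excerpt above. As a hedged starting point, the sketch below only shows how a WAV file can be decoded into raw PCM bytes with the standard `javax.sound.sampled` package, which is the form most recognizers expect. The class names `WavReader` and `Clip` are hypothetical and chosen for illustration.

```java
import javax.sound.sampled.AudioFormat;
import javax.sound.sampled.AudioInputStream;
import javax.sound.sampled.AudioSystem;
import javax.sound.sampled.UnsupportedAudioFileException;
import java.io.ByteArrayOutputStream;
import java.io.File;
import java.io.IOException;

// Hypothetical helper: decodes a WAV file into its format plus raw sample bytes.
final class WavReader {

    /** Decoded audio: the format description and the PCM sample bytes. */
    static final class Clip {
        final AudioFormat format;
        final byte[] samples;

        Clip(AudioFormat format, byte[] samples) {
            this.format = format;
            this.samples = samples;
        }
    }

    static Clip read(File wav) throws IOException, UnsupportedAudioFileException {
        try (AudioInputStream in = AudioSystem.getAudioInputStream(wav)) {
            AudioFormat format = in.getFormat();
            // AudioInputStream reads whole frames only, so size the buffer
            // as a multiple of the frame size (which may be unspecified, -1).
            int frameSize = Math.max(1, format.getFrameSize());
            byte[] buffer = new byte[frameSize * 1024];
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            int n;
            while ((n = in.read(buffer, 0, buffer.length)) > 0) {
                out.write(buffer, 0, n);
            }
            return new Clip(format, out.toByteArray());
        }
    }
}
```

The returned `Clip.format` carries the sample rate, bit depth, and channel count, which a recognizer typically needs alongside the bytes.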

Google Speech API returns NULL

Question: I am trying to develop a speech-to-text application using Google's API with the code below:

    import java.io.BufferedReader;
    import java.io.DataOutputStream;
    import java.io.InputStreamReader;
    import java.net.HttpURLConnection;
    import java.net.URL;
    import java.nio.file.Files;
    import java.nio.file.Path;
    import java.nio.file.Paths;
    import org.testng.annotations.Test;

    public class Speech2Text_Test {
        @Test
        public void f() {
            try {
                Path path = Paths.get("out.flac");
                byte[] data = Files.readAllBytes(path);
                String

Google Speech API streaming audio exceeding 1 minute

Question: I would like to be able to extract the utterances of a person from a stream of telephone audio. The phone audio is routed to my server, which then creates a streaming recognition request. How can I tell whether a word is part of a complete utterance or part of an utterance currently being transcribed? Should I compare timestamps between words? Will the API continue to return interim results even if there is no speech for a certain amount of time in the streaming phone audio? How can I
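Google's streaming responses mark each result with an `is_final` flag, which is one way to separate completed utterances from text still being transcribed. The sketch below is a minimal illustration of that bookkeeping only: `UtteranceTracker` and its methods are hypothetical names, the transcript and flag are assumed to be pulled from each streaming result by the caller, and it does not address how the API behaves during silence.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: separates finished utterances from in-progress text.
final class UtteranceTracker {
    private final List<String> completed = new ArrayList<>();
    private String pending = "";

    /**
     * Feed one streaming result. A final result is committed as a complete
     * utterance; an interim result only replaces the pending text, because
     * later interim results may revise it.
     */
    void onResult(String transcript, boolean isFinal) {
        if (isFinal) {
            completed.add(transcript.trim());
            pending = "";
        } else {
            pending = transcript;
        }
    }

    /** Utterances the recognizer has finalized so far (defensive copy). */
    List<String> completedUtterances() {
        return new ArrayList<>(completed);
    }

    /** Text of the utterance still being transcribed, or "" if none. */
    String pendingText() {
        return pending;
    }
}
```

With this split, words in `pendingText()` should be treated as provisional, while anything in `completedUtterances()` will not change again.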

How to turn on always the microphone in bot framework direct line

Question: I'm creating a bot that accepts text and voice input and can answer in both modes. The bot works really well, but I always have to click the microphone button to speak with it. Is it possible to keep the microphone always on and recognize voice without clicking the microphone button?

    <!DOCTYPE html>
    <html>
    <head>
      <link href="https://cdn.botframework.com/botframework-webchat/latest/botchat.css" rel="stylesheet" />
    </head>
    <body>
      <div id="bot" />
      <script src=

Chrome speech recognition WebKitSpeechRecognition() not accepting input of fake audio device --use-file-for-fake-audio-capture or audio file

Question: I would like to use Chrome speech recognition, WebKitSpeechRecognition(), with an audio file as input for testing purposes. I could use a virtual microphone, but that is hacky and hard to automate; when I tested it, though, everything worked and the speech recognition converted my audio file to text. Now I want to use the following Chrome arguments: --use-file-for-fake-audio-capture="C:/url/to/audio.wav" --use-fake-device-for-media-stream --use-fake-ui-for-media

Measuring rate of speech in real time

Question: I'm looking for a quick and simple way to measure the rate at which I am speaking in real time. Coarse-grained approaches or approximations are sufficient. The idea is to write a simple app or widget that at least tells you to speed up or slow down while speaking. Measuring things like pitch and volume might also be nice. I assume this can be done simply with a variety of speech recognition libraries, but I am familiar with none of them, and quick glances at the documentation do not give a simple
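One coarse approximation, assuming some recognizer already reports each recognized word with a timestamp, is to count words in a sliding time window and scale the count to words per minute. The sketch below shows only that arithmetic; `SpeechRateMeter` and its methods are hypothetical names, and the word timestamps are assumed to come from whichever speech library is used.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical helper: estimates words per minute over a sliding window.
final class SpeechRateMeter {
    private final long windowMillis;
    private final Deque<Long> wordTimes = new ArrayDeque<>();

    SpeechRateMeter(long windowMillis) {
        if (windowMillis <= 0) {
            throw new IllegalArgumentException("windowMillis must be positive");
        }
        this.windowMillis = windowMillis;
    }

    /** Record that one word was recognized at the given time (milliseconds). */
    void onWord(long timestampMillis) {
        wordTimes.addLast(timestampMillis);
        evict(timestampMillis);
    }

    /** Words per minute over the window that ends at nowMillis. */
    double wordsPerMinute(long nowMillis) {
        evict(nowMillis);
        return wordTimes.size() * 60_000.0 / windowMillis;
    }

    // Drop words that fall outside the window (nowMillis - windowMillis, nowMillis].
    private void evict(long nowMillis) {
        while (!wordTimes.isEmpty() && wordTimes.peekFirst() <= nowMillis - windowMillis) {
            wordTimes.removeFirst();
        }
    }
}
```

A widget could compare the returned value against a target range and prompt the speaker to speed up or slow down. Until a full window of audio has elapsed, the estimate reads low, so the first few seconds are best ignored.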

Uncaught DOMException: Failed to construct 'AudioContext': The number of hardware contexts provided (6)

Question: I am trying to implement the Microsoft Bing Speech API, and it works fine the first 5 times. After that, when I record my voice, I get an exception in the console. Exception: Uncaught DOMException: Failed to construct 'AudioContext': The number of hardware contexts provided (6) is greater than or equal to the maximum bound (6). When I try to close the context with AudioContext.close(), it shows another error like "Uncaught (in promise) DOMException: Cannot close a context that is being closed or has already

IBM Watson Speech to Text Service is not giving response in Unity3d

Question: I have an ExampleStreaming class that I got from the IBM Watson SDK on GitHub (the Speech to Text service demo). Here it is:

    public class ExampleStreaming : MonoBehaviour
    {
        private int m_RecordingRoutine = 0;
        private string m_MicrophoneID = null;
        private AudioClip m_Recording = null;
        private int m_RecordingBufferSize = 5;
        private int m_RecordingHZ = 22050;
        private SpeechToText m_SpeechToText = new SpeechToText();

        void Start()
        {
            LogSystem.InstallDefaultReactors();
            Log.Debug("ExampleStreaming

How to specify phonetic keywords for IBM Watson speech2text service?

Question: While we have had good success with the Bluemix Java SDK in the general case, we have run into problems when trying to recognize occasional non-English words (e.g., foreign last names). Our hope was that one could specify the keyword list using SPR phonetic notation (which works great for text2speech), but that does not seem to be supported for speech2text. Any suggestions or workarounds?

    SpeechToText service = new SpeechToText();
    service.setUsernameAndPassword("USERNAME", "PASSWORD");
    File audio

Mp3 / Wav to Text

Question: I currently have a mobile application that can record speech as either WAV or MP3, and I would like to convert it to text. I have looked around (Microsoft Speech, UCMA, etc.) but haven't seen any good examples of how to do it. Can someone help out here? FYI, we have access to MS Lync. Look forward to any responses, James

Answer 1: There's a sample of using the UCMA 3.0 SDK to perform speech recognition, available here. However, from experience (and I'd love to be proved wrong here) you can only