speech-to-text

25s Latency in Google Speech to Text

我只是一个虾纸丫 submitted on 2019-12-22 13:53:10
Question: This is a problem I ran into using the Google Speech-to-Text engine. I am currently streaming 16-bit / 16 kHz audio in real time in 32 kB chunks, but there is an average 25-second latency between sending audio and receiving transcripts, defeating the purpose of real-time transcription. Why is there such high latency?

Answer 1: The Google Speech-to-Text documentation recommends using a 100 ms frame size to minimize latency. A 32 kB chunk is far larger than that:

    32 kB × (8 bits / 1 byte) × (1 sample / 16 bits) × (1 sec / 16,000 samples) = 1 second of audio per chunk

so each request carries a full second of audio, ten times the recommended frame size.
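A minimal sketch of what the answer recommends, streaming 100 ms chunks instead of 32 kB buffers. It assumes the google-cloud-speech and pyaudio packages, ambient Google credentials, and an en-US model; names follow the v1 Python client, so treat it as illustrative rather than drop-in:

    import pyaudio
    from google.cloud import speech

    RATE = 16000          # 16 kHz, 16-bit mono, as in the question
    FRAMES = RATE // 10   # 100 ms = 1600 samples = 3200 bytes, not 32 kB

    client = speech.SpeechClient()
    config = speech.RecognitionConfig(
        encoding=speech.RecognitionConfig.AudioEncoding.LINEAR16,
        sample_rate_hertz=RATE,
        language_code="en-US",
    )
    streaming_config = speech.StreamingRecognitionConfig(
        config=config, interim_results=True,
    )

    pa = pyaudio.PyAudio()
    stream = pa.open(format=pyaudio.paInt16, channels=1, rate=RATE,
                     input=True, frames_per_buffer=FRAMES)

    def requests():
        while True:
            # one 100 ms chunk per request keeps per-chunk latency low
            yield speech.StreamingRecognizeRequest(audio_content=stream.read(FRAMES))

    for response in client.streaming_recognize(streaming_config, requests()):
        for result in response.results:
            print(result.alternatives[0].transcript)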

Using c++ to call and use Windows Speech Recognition [closed]

那年仲夏 submitted on 2019-12-22 10:48:25
Question: Closed. This question is off-topic and is not currently accepting answers. Closed 6 years ago. I am making an application that involves the use of Windows Speech Recognition. I am thinking of using C++ to do this, since I have some experience with this language. I want the speech recognition to work internally: if I upload an audio file into my program, I want speech recognition…

Trouble passing string variable to return data from python function to be used globally anywhere in a python script or program - EDITED for clarity

僤鯓⒐⒋嵵緔 submitted on 2019-12-22 01:07:34
Question: I am editing my question to reflect the issue I am having in my application. I am trying to take streamed audio and convert it to text using Google Speech-to-Text, then pass that text as input to a conversation on Watson. Watson then returns its answer. The latter half works great. The issue I am having is that I can't get the script to pass the text from the recorded speech to the Watson service I created. I don't get an error, I just get nothing. The mic is working (I tested it…
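The usual fix for this pattern is to return the transcript from the recognition function and pass it along explicitly, rather than expecting a global variable to be updated. A minimal sketch, assuming the speech_recognition package for the Google STT half; send_to_watson is a hypothetical stand-in for the poster's Watson call, not a real SDK function:

    import speech_recognition as sr

    def transcribe_from_mic():
        # Record one utterance and return the transcript as a string
        r = sr.Recognizer()
        with sr.Microphone() as source:
            audio = r.listen(source)
        return r.recognize_google(audio)  # return the text instead of setting a global

    def send_to_watson(text):
        # hypothetical placeholder for the poster's Watson conversation call
        print("sending to Watson:", text)

    transcript = transcribe_from_mic()  # capture the return value...
    send_to_watson(transcript)          # ...and pass it on explicitly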

PyAudio prints ALSA warnings and does not work

六眼飞鱼酱① submitted on 2019-12-21 23:16:19
Question: Hey guys, I'm trying to run a basic Python speech-to-text script. This is the code:

    import speech_recognition as sr

    r = sr.Recognizer()
    with sr.Microphone() as source:
        audio = r.listen(source)
    try:
        # note: in current versions of speech_recognition this method
        # is called recognize_google (r.recognize is the pre-3.x name)
        print("You said " + r.recognize(audio))
    except LookupError:
        print("Could not understand audio")

The code works fine until it reaches the print stage and then throws this error. Is there anything that I have done wrong?

    ALSA lib pcm.c:2266:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.rear
    ALSA lib pcm…
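Those ALSA lines are warnings printed by the underlying C library while PyAudio probes sound devices, not Python exceptions; recognition can still succeed. A common workaround is to install a no-op ALSA error handler through ctypes before PyAudio is initialized. A sketch of that known technique (Linux only; assumes libasound.so.2 is loadable):

    from ctypes import CFUNCTYPE, c_char_p, c_int, cdll

    # Matches ALSA's error-handler signature:
    # void handler(const char *file, int line, const char *func, int err, const char *fmt)
    ERROR_HANDLER_FUNC = CFUNCTYPE(None, c_char_p, c_int, c_char_p, c_int, c_char_p)

    def _silent_handler(filename, line, function, err, fmt):
        pass  # swallow the warning instead of printing it

    c_handler = ERROR_HANDLER_FUNC(_silent_handler)  # keep a reference alive
    asound = cdll.LoadLibrary("libasound.so.2")
    asound.snd_lib_error_set_handler(c_handler)

    # Open the microphone only after the handler is installed
    import speech_recognition as sr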

How to auto-stop speech recognition if the user stops speaking

给你一囗甜甜゛ submitted on 2019-12-21 22:45:55
Question: I am working on a bot app with two features: speech-to-text and text-to-speech. Both work as expected, but I want to detect when the user stops speaking, stop recognition at that point, and send the data to the server. Is there any way to tell that the user is no longer speaking? I am using the code below for speech detection:

    // Starts an AVAudioSession
    NSError *error;
    AVAudioSession *audioSession = [AVAudioSession sharedInstance];
    [audioSession setCategory:AVAudioSessionCategoryPlayAndRecord…
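No answer is preserved for this entry, but the general technique, independent of iOS, is energy-based silence detection: treat a run of consecutive low-energy audio frames as the end of the utterance, then stop recognition and send the result. A minimal Python sketch of the idea only; the threshold and frame count are illustrative assumptions, not tuned values:

    import audioop  # stdlib PCM helpers (removed in Python 3.13)

    SILENCE_RMS = 500            # assumed energy floor for 16-bit samples
    SILENT_FRAMES_TO_STOP = 10   # e.g. ten 100 ms frames = 1 s of silence

    def is_end_of_speech(frames):
        # frames: the most recent raw 16-bit mono PCM chunks
        recent = frames[-SILENT_FRAMES_TO_STOP:]
        if len(recent) < SILENT_FRAMES_TO_STOP:
            return False
        # audioop.rms returns the root-mean-square energy of a chunk
        return all(audioop.rms(chunk, 2) < SILENCE_RMS for chunk in recent)

On iOS the same decision would gate a call to endAudio on the recognition request.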

How to implement speech-to-text via Speech framework [closed]

时光总嘲笑我的痴心妄想 submitted on 2019-12-20 10:37:42
Question: Closed. This question needs to be more focused and is not currently accepting answers. Closed 2 years ago. I want to do speech recognition in my Objective-C app using the iOS Speech framework. I found some Swift examples but haven't been able to find anything in Objective-C. Is it possible to access this framework from Objective-C? If so, how? Answer 1: After spending enough time looking…

C# system.speech.recognition alternate words

前提是你 submitted on 2019-12-20 04:18:50
Question: I am currently using the Microsoft.Speech API to dictate utterances into text, but what I really need is the alternate dictations the engine could have produced. I am using this for my honours thesis, and for it I wish to know the top 10 interpretations of any utterance. A very similar, if not exact, question was asked in 2011 (C# system.speech.recognition alternates) but was never answered. My question thus is: how does one get the alternatives to an interpretation of a dictation using the Microsoft…
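No answer is preserved here. In the System.Speech / Microsoft.Speech object model the N-best list is exposed as RecognitionResult.Alternates; most other recognition APIs expose something similar. As a cross-language illustration only (not Microsoft.Speech), Python's speech_recognition returns all Google alternatives when show_all=True is passed; the input file name is hypothetical:

    import speech_recognition as sr

    r = sr.Recognizer()
    with sr.AudioFile("utterance.wav") as source:
        audio = r.record(source)

    # show_all=True yields the raw response with the full N-best list
    # (assumes recognition succeeded; silence returns an empty list)
    result = r.recognize_google(audio, show_all=True)
    for alt in result.get("alternative", [])[:10]:  # top-10 interpretations
        print(alt.get("transcript"), alt.get("confidence"))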

How to get the authentication token for IBM watson STT service?

丶灬走出姿态 submitted on 2019-12-19 04:35:40
Question: I am trying to use the Watson Speech to Text service, which needs the following command for the websocket interface as per the documentation:

    var token = {authentication-token};
    var wsURI = 'wss://stream.watsonplatform.net/speech-to-text/api/v1/recognize'
        + '?watson-token=' + token + '&model=es-ES_BroadbandModel';

I have tried this to get the {authentication-token}, using a curl command in the terminal:

    curl -X GET --user "apikey:{apikey}" "https://stream.watsonplatform.net/authorization/api/v1/token…
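No answer is preserved here. One likely issue worth noting: watson-token-style tokens were the legacy mechanism, and IBM Cloud services moved to IAM tokens exchanged for an API key at the IAM endpoint. A minimal Python sketch of that exchange (the apikey value is a placeholder; whether a given service instance still accepts legacy tokens depends on when it was created):

    import requests

    resp = requests.post(
        "https://iam.cloud.ibm.com/identity/token",
        data={
            "grant_type": "urn:ibm:params:oauth:grant-type:apikey",
            "apikey": "YOUR_APIKEY",  # placeholder, not a real key
        },
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )
    resp.raise_for_status()
    access_token = resp.json()["access_token"]
    # For the websocket interface the IAM token is passed as an
    # access_token query parameter instead of watson-token.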

Speech recognition response is poor in sphinx4

怎甘沉沦 submitted on 2019-12-18 12:42:02
Question: Currently we are investigating using sphinx4 for speech recognition. We are trying to achieve a good response for a dictation-type application. The input is a WAV file and we wish to transcribe it. I have looked into the LatticeDemo and Transcriber demos provided by Sphinx4. When I use the same configuration, the response is pretty poor. I have tried tweaking the configuration files, but it simply does not recognize the words. The Transcriber demo provided is for digits; I have…
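No answer is preserved here, but the usual diagnosis is that the Transcriber demo is wired to a digits-only grammar, so general dictation needs a large-vocabulary acoustic model and language model configured instead. As a quick sanity check on CMU Sphinx models outside Java, the speech_recognition wrapper can run pocketsphinx's bundled large-vocabulary US English model over a WAV file (assumes the pocketsphinx package is installed; the file name is hypothetical):

    import speech_recognition as sr

    r = sr.Recognizer()
    with sr.AudioFile("dictation.wav") as source:  # 16 kHz mono WAV works best
        audio = r.record(source)  # read the whole file

    # Uses pocketsphinx's default large-vocabulary US English model,
    # not a digits grammar
    print(r.recognize_sphinx(audio))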

Android Speech Recognition not working

允我心安 submitted on 2019-12-18 09:35:30
Question: I'm using this example from newboston, and it prompts me to record, but after it recognizes what I said it won't update the list view. Here is the code:

    public class MainActivity extends Activity {
        private static final int RECOGNIZER_RESULT = 1234;
        ListView list;

        @Override
        public void onCreate(Bundle savedInstanceState) {
            super.onCreate(savedInstanceState);
            setContentView(R.layout.activity_main);
            list = (ListView) findViewById(R.id.list);
            Button btn_speach = (Button) findViewById(R.id.btn…