speech-to-text

Google Speech Recognition API: timestamp for each word?

为君一笑 submitted on 2019-12-09 04:44:37
Question: It's possible to use Google's Speech Recognition API to get a transcription of an audio file (WAV, MP3, etc.) by sending a request to http://www.google.com/speech-api/v2/recognize?... Example: I said "one two three four five" in a WAV file. The Google API gives me this: { u'alternative': [ {u'transcript': u'12345'}, {u'transcript': u'1 2 3 4 5'}, {u'transcript': u'one two three four five'} ], u'final': True } Question: is it possible to get the time (in seconds) at which each word has been
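
For reference, the response quoted in the question above can be read as in the following minimal Python 3 sketch. The helper name transcripts is hypothetical, and the dictionary is copied from the question rather than fetched from the API. Note that the quoted response contains only transcript strings and a final flag, with no timing fields.

```python
# Response copied from the question above; in Python 3 the u'' prefixes are not needed.
response = {
    "alternative": [
        {"transcript": "12345"},
        {"transcript": "1 2 3 4 5"},
        {"transcript": "one two three four five"},
    ],
    "final": True,
}


def transcripts(result):
    """Return every candidate transcript in the order it appears in the result.

    `transcripts` is a hypothetical helper written for this sketch,
    not part of any Google library.
    """
    return [alt["transcript"] for alt in result.get("alternative", [])]


if __name__ == "__main__":
    for text in transcripts(response):
        print(text)
```

A result without an "alternative" key yields an empty list, so the caller does not have to special-case empty responses.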

Android SpeechRecognizer Network Error

大憨熊 submitted on 2019-12-09 03:37:59
Question: I'm trying to create continuous speech recognition in Android 4.4 that simply displays the spoken words in a TextView, like dictation. I followed multiple tutorials, like https://github.com/fcrisciani/android-speech-recognition/blob/master/VoiceRecognition/src/com/speech/fcrisciani/voicerecognition/ContinuousDictationFragment.java, or "Is there a way to use the SpeechRecognizer API directly for speech input?", and implemented the following version: import java.util.ArrayList; import android.app

TargetInvocationException when using SemanticResultKey

99封情书 submitted on 2019-12-08 16:55:25
Question: I want to build my grammar to accept multiple numbers. There is a bug when I repeat a number, for example when saying 'twenty-one'. So I kept reducing my code to find the problem and arrived at the following code for the grammar builder: string[] numberString = { "one" }; Choices numberChoices = new Choices(); for (int i = 0; i < numberString.Length; i++) { numberChoices.Add(new SemanticResultValue(numberString[i], numberString[i])); } gb[1].Append(new SemanticResultKey("op1", (GrammarBuilder

Microsoft Cognitive Services - Speech customization test processing seems frozen

て烟熏妆下的殇ゞ submitted on 2019-12-08 12:53:10
Question: I successfully uploaded data to Speech customization (WAV audio + TXT transcription), with just one audio file in a zip file, following the Microsoft docs: https://docs.microsoft.com/en-us/azure/cognitive-services/speech-service/how-to-custom-speech-test-data. When I click to add a test and choose the data, it takes forever to process the results and never finishes. My audio uses the pt-BR model. Any ideas? I cannot interrupt or delete tests while they are processing. Answer 1: There is currently an issue in

Android Speech to Text Api Google - notification

爷,独闯天下 submitted on 2019-12-08 08:48:54
Question: I followed this tutorial: https://jbinformatique.com/2018/02/16/android-speech-to-text-api-google-tutoriel/ It works nicely! It uses the android.speech.RecognizerIntent package; it's free and it works without Internet access, as mentioned here: "Difference between Android Speech to Text API (Recognizer Intent) and Google Cloud Speech API?" However, when I start speech recognition, I get the following notification: If I translate it (as best I can), it says: "Your audio records will be sent to Google and used for

How to add a custom lexicon in my C# project

ぐ巨炮叔叔 submitted on 2019-12-08 05:46:09
Question: I'm developing a C# project based on voice recognition. I want to recognize words spoken with an Indian English accent, so I thought of using a lexicon and adding pronunciations to that file, but I don't understand how to create a lexicon or how to add one to my project. Answer 1: Lexicons aren't exposed via System.Speech.Recognition, unfortunately. You can access lexicons using the SpeechLib automation interface to SAPI, though; the object you want to create is SpLexicon. Note that System.Speech

Microsoft speech API 5.1, 5.3?

你离开我真会死。 submitted on 2019-12-08 05:04:37
Question: I'm a little confused by the different SAPI versions available. First of all, I can only find the SDK for developing with version 5.1. Is there an SDK for version 5.3 and, if not, why? Which version can I use if I'm developing with version 3.5 of the .NET Framework? Is there a good tutorial? The only ones I found are pretty old (they use the 2003 version of Visual Studio): http://msdn.microsoft.com/en-us/library/ms986944.aspx Is there any way I can use the speech API

using multiple variables to open different links

时光总嘲笑我的痴心妄想 submitted on 2019-12-08 05:02:36
Question: Edit, updated: my main goal is first to let the user request a specific book (book_name) by voice (speech to text), then open the book and read it aloud (text to speech), and finally print the book in braille. Where I'm stuck: I haven't found a way to open the book as a PDF, so I just left it as a text area, and I don't know how to convert the text to braille letters with this code: https://gist.github.com/meh/141520 My goal is to open different pages when calling different
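
As an illustration of the text-to-braille step mentioned in the question above, here is a minimal Python sketch. It is not the code from the linked gist: it assumes uncontracted English braille for the letters a to z only, maps each letter to a Unicode braille pattern character, and passes digits, capitals markers, and punctuation through untouched. The names DOTS and to_braille are hypothetical.

```python
# Dot numbers (1-6) for each letter in uncontracted English braille.
DOTS = {
    "a": (1,), "b": (1, 2), "c": (1, 4), "d": (1, 4, 5), "e": (1, 5),
    "f": (1, 2, 4), "g": (1, 2, 4, 5), "h": (1, 2, 5), "i": (2, 4),
    "j": (2, 4, 5), "k": (1, 3), "l": (1, 2, 3), "m": (1, 3, 4),
    "n": (1, 3, 4, 5), "o": (1, 3, 5), "p": (1, 2, 3, 4),
    "q": (1, 2, 3, 4, 5), "r": (1, 2, 3, 5), "s": (2, 3, 4),
    "t": (2, 3, 4, 5), "u": (1, 3, 6), "v": (1, 2, 3, 6),
    "w": (2, 4, 5, 6), "x": (1, 3, 4, 6), "y": (1, 3, 4, 5, 6),
    "z": (1, 3, 5, 6),
}

BRAILLE_BLANK = 0x2800  # Unicode "BRAILLE PATTERN BLANK"


def to_braille(text):
    """Map a-z to Unicode braille cells; other characters pass through.

    In the Unicode braille block, dot n corresponds to bit n-1 of the
    offset from U+2800. Spaces become the blank braille cell.
    """
    cells = []
    for ch in text.lower():
        if ch == " ":
            cells.append(chr(BRAILLE_BLANK))
        elif ch in DOTS:
            offset = sum(1 << (dot - 1) for dot in DOTS[ch])
            cells.append(chr(BRAILLE_BLANK + offset))
        else:
            cells.append(ch)  # digits and punctuation are left as-is in this sketch
    return "".join(cells)


if __name__ == "__main__":
    print(to_braille("hello world"))
```

Because input is lowercased and no capital or number indicators are emitted, the output is only suitable for a rough preview, not for embossing.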

How to get alternate single words during dictation in SAPI 5.4 using C#?

浪子不回头ぞ submitted on 2019-12-08 04:59:01
Question: I am running a user study with speech recognition and new technologies. During the laboratory tests, I need to display all the dictated text using an interface that I programmed. Currently, I can get alternates for whole sentences in C#, but I need alternates for single words. For example, if someone says "Hello, my name is Andrew", I want to get an alternate word for each of "Hello", "my", "name", "is" and "Andrew", instead of an alternate for the complete sentence. Here is a code snippet of the handler I

How to add Continuous Speech Recognition in my Android Application?

一曲冷凌霜 submitted on 2019-12-07 20:54:18
Question: I am trying to implement continuous speech recognition in my Android application. I followed the code at this link. Continuous speech recognition was working until two days ago, but now it does not work well: it takes much longer to listen for speech. How can I resolve this problem? Please guide me. Thanks. Recognition code: // starts the service protected void startListening() { try { initSpeech(); Intent intent = new Intent(RecognizerIntent.ACTION_RECOGNIZE_SPEECH); //intent