speech-to-text | 易学教程

Google Speech Recognition API: timestamp for each word?

Question: It's possible to use Google's Speech Recognition API to get a transcription of an audio file (WAV, MP3, etc.) by sending a request to http://www.google.com/speech-api/v2/recognize?... Example: I said "one two three four five" in a WAV file. The Google API gives me this: { u'alternative': [ {u'transcript': u'12345'}, {u'transcript': u'1 2 3 4 5'}, {u'transcript': u'one two three four five'} ], u'final': True } Question: is it possible to get the time (in seconds) at which each word has been
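
The response quoted in the excerpt can be inspected as follows. This is a minimal sketch in Python, assuming the response has already been parsed into a dictionary shaped like the one quoted; the helper names `pick_transcripts` and `split_words` are hypothetical and not part of any Google API. The quoted response contains only `transcript` entries, so the sketch can recover the words but not their timing.

```python
# Response copied from the excerpt above (the u'' prefixes there are
# Python 2 unicode markers and are not needed in Python 3).
response = {
    "alternative": [
        {"transcript": "12345"},
        {"transcript": "1 2 3 4 5"},
        {"transcript": "one two three four five"},
    ],
    "final": True,
}


def pick_transcripts(result):
    """Return every candidate transcript in the order the response lists them.

    Hypothetical helper: assumes `result` has the shape shown in the excerpt.
    """
    return [alt["transcript"] for alt in result.get("alternative", [])]


def split_words(transcript):
    """Split one transcript into words on whitespace."""
    return transcript.split()


transcripts = pick_transcripts(response)
# The last alternative in the quoted response is the spelled-out one.
words = split_words(transcripts[-1])
print(words)
```

Splitting on whitespace is only a convenience for the spelled-out alternative; the first alternative (`12345`) has no word boundaries to split on.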

Android SpeechRecognizer Network Error

Question: I'm trying to create continuous speech recognition in Android 4.4, simply displaying the spoken words in a TextView, like dictation. I followed multiple tutorials, such as https://github.com/fcrisciani/android-speech-recognition/blob/master/VoiceRecognition/src/com/speech/fcrisciani/voicerecognition/ContinuousDictationFragment.java and "Is there a way to use the SpeechRecognizer API directly for speech input?", and implemented the following version: import java.util.ArrayList; import android.app

TargetInvocationException when using SemanticResultKey

Question: I want to build my grammar to accept multiple numbers. It has a bug when I repeat a number, for example when saying 'twenty-one'. So I kept reducing my code to find the problem. I arrived at the following piece of code for the grammar builder: string[] numberString = { "one" }; Choices numberChoices = new Choices(); for (int i = 0; i < numberString.Length; i++) { numberChoices.Add(new SemanticResultValue(numberString[i], numberString[i])); } gb[1].Append(new SemanticResultKey("op1", (GrammarBuilder

Microsoft Cognitive Services - Speech customization test processing seems frozen

Question: I successfully uploaded data to speech customization (WAV audio + TXT transcription), with just one audio file in a zip file, following the Microsoft docs: https://docs.microsoft.com/en-us/azure/cognitive-services/speech-service/how-to-custom-speech-test-data. When I click to add a test and choose the data, it takes forever to process the results and never finishes. My audio uses the pt-BR model. Any ideas? I cannot interrupt or delete tests while they are processing. Answer 1: There is currently an issue in

Android Speech to Text API Google - notification

Question: I followed this tutorial: https://jbinformatique.com/2018/02/16/android-speech-to-text-api-google-tutoriel/ It works nicely! It uses the android.speech.RecognizerIntent package, it's free, and it works without Internet, as mentioned here: "Difference between Android Speech to Text API (Recognizer Intent) and Google Cloud Speech API?" However, when I start the speech recognition, I get the following notification: If I translate it (as best I can), it says: "Your audio records will be sent to Google and used for

How to add a custom lexicon in my C# project

Question: I'm developing a C# project based on voice recognition. I want to recognize words spoken with an Indian English accent, so I thought of using a lexicon and adding pronunciations to that file, but I don't understand how to create a lexicon or how to add one to my project. Answer 1: Lexicons aren't exposed via System.Speech.Recognition, unfortunately. You can access lexicons using the SpeechLib automation interface to SAPI, though; the object you want to create is SpLexicon. Note that System.Speech

Microsoft Speech API 5.1, 5.3?

Question: I'm a little confused about the different SAPI versions available. First of all, I can only find the SDK for developing with version 5.1. Is there an SDK available for version 5.3, and if not, why? Which version can I use if I'm developing with version 3.5 of the .NET Framework? Is there a good tutorial? The only ones I found are pretty old (they use the 2003 version of Visual Studio): http://msdn.microsoft.com/en-us/library/ms986944.aspx Is there any way I can use the speech API

Using multiple variables to open different links

Question: Edit, updated: my main goal is first to let the user request a specific book (book_name) by voice (speech to text), then to open the book and read it aloud (text to speech), and finally to print the book in braille. Where I'm stuck: I haven't found a way to open the book as a PDF, so I just left it as a text area, and I don't know how to convert the text to braille letters with this code: https://gist.github.com/meh/141520 My goal is to open different pages when calling different

How to get alternate single words during dictation in SAPI 5.4 using C#?

Question: I am running a user study with speech recognition and new technologies. During the laboratory tests, I need to display all the dictated text using an interface that I programmed. Currently, I can get alternates for whole sentences in C#, but I need alternates for single words. For example, if someone says "Hello, my name is Andrew", I want to get an alternate word for each of "Hello", "my", "name", "is" and "Andrew", instead of an alternate for the complete sentence. Here is a code snippet of the handler I

How to add Continuous Speech Recognition in my Android Application?

Question: I am trying to implement continuous speech recognition in my Android application. I followed the code at this link. It worked two days ago, but now speech recognition is not working well: it takes much longer to listen for speech. How can I resolve this problem? Please guide me. Thanks. Recognition code: // starts the service protected void startListening() { try { initSpeech(); Intent intent = new Intent(RecognizerIntent.ACTION_RECOGNIZE_SPEECH); //intent