speech-to-text

Using System.Speech to convert mp3 file to text

◇◆丶佛笑我妖孽 submitted on 2019-12-03 03:58:20
Question: I'm trying to use the speech recognition in .NET to recognize the speech of a podcast in an MP3 file and get the result as a string. All the examples I've seen use the microphone, but I don't want to use the microphone; I want to provide a sample MP3 file as my audio source. Can anyone point me to a resource or post an example? EDIT - I converted the audio file to a WAV file and tried this code on it, but it only extracts the first 68 words. public class MyRecognizer { public string

C#: transcribe WAV file to text (speech-to-text) with System.Speech namespaces

隐身守侯 submitted on 2019-12-03 03:15:55
How do you use the .NET speech namespace classes to convert audio in a WAV file to text that I can display on the screen or save to a file? I am looking for tutorial samples. UPDATE: Found a code sample here. But when I tried it, it gave incorrect results. Below is the VB code sample I've adapted. (Actually, I don't mind the language as long as it's either VB or C#.) It is not giving me proper results. I assume that if we supply the right grammar, i.e. the words we expect in the recording, we should get the textual output of that. First I've tried with sample words that are in the call. It

Google Speech Recognition API: timestamp for each word?

ぃ、小莉子 submitted on 2019-12-03 01:48:56
It's possible to use Google's Speech recognition API to get a transcription for an audio file (WAV, MP3, etc.) by sending a request to http://www.google.com/speech-api/v2/recognize?... Example: I have said "one two three for five" in a WAV file. Google API gives me this: { u'alternative': [ {u'transcript': u'12345'}, {u'transcript': u'1 2 3 4 5'}, {u'transcript': u'one two three four five'} ], u'final': True } Question: is it possible to get the time (in seconds) at which each word was said? With my example: ['one', 0.23, 0.80], ['two', 1.03, 1.45], ['three', 1.79, 2.35], etc. i.e. the
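A minimal Python sketch of reading the result shape quoted in the question. It assumes the dict looks exactly like the excerpt (an `alternative` list of `transcript` entries plus a `final` flag); the helper names `transcripts` and `spelled_out` are hypothetical. The quoted response carries no timing fields, so this only extracts text and does not answer the per-word timestamp question.

```python
from typing import Any, Dict, List, Optional


def transcripts(result: Dict[str, Any]) -> List[str]:
    """Return every transcript in one result dict, in the order given."""
    return [
        alt["transcript"]
        for alt in result.get("alternative", [])
        if "transcript" in alt
    ]


def spelled_out(result: Dict[str, Any]) -> Optional[str]:
    """Return the first transcript that contains no digits, or None."""
    for text in transcripts(result):
        if not any(ch.isdigit() for ch in text):
            return text
    return None


# The result dict copied from the question (Python 3 string literals).
response = {
    "alternative": [
        {"transcript": "12345"},
        {"transcript": "1 2 3 4 5"},
        {"transcript": "one two three four five"},
    ],
    "final": True,
}

print(transcripts(response))
print(spelled_out(response))
```

Picking the digit-free alternative is just one possible rule, useful when the words have to be matched against the audio later.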

How to implement speech-to-text via Speech framework [closed]

风格不统一 submitted on 2019-12-03 01:02:57
I want to do speech recognition in my Objective-C app using the iOS Speech framework. I found some Swift examples but haven't been able to find anything in Objective-C. Is it possible to access this framework from Objective-C? If so, how? After spending enough time looking for Objective-C samples -even in the Apple documentation- I couldn't find anything decent, so I figured it out myself

Open Source Software For Transcribing Speech in Audio Files

只愿长相守 submitted on 2019-12-02 20:36:26
Can anyone recommend reliable open source software for transcribing English speech in WAV files? The two main programs I've researched are Sphinx and Julius, but I've never been able to get either to work, and the documentation for each on transcribing files is sketchy at best. I'm developing on 64-bit Ubuntu 10.04, whose repos include sphinx2 and julius, as well as voxforge's julius acoustic model for English. I'm focussing on transcribing files, instead of directly processing sound from a mic, because I've given up on expecting projects like these to work with Ubuntu's sound system. This

How to track rate of speech

前提是你 submitted on 2019-12-02 18:13:47
Question: I am developing an iPhone app that tracks rate of speech, and hoping to use Nuance Speechkit (https://developer.nuance.com/public/Help/DragonMobileSDKReference_iOS/SpeechKit_Guide/Basics.html). Is there a way to track rate of speech (e.g., updating WPM every few seconds) with the framework? Right now it seems to just do speech-to-text at the end of a long utterance, as opposed to every word or so (i.e., return partial results). Answer 1: There are easier ways; for example, you can use CMUSphinx with

How to use CMU Sphinx 4 for speech to text with english voxforge models

二次信任 submitted on 2019-12-02 17:45:30
I'm trying to figure out how to use sphinx4 or pocketsphinx with the English voxforge model, but I can't get it working. I have tried to read doc pages (like this one http://cmusphinx.sourceforge.net/sphinx4/doc/UsingSphinxTrainModels.html ) but they do not help me. What I want is an executable where I can specify which model to use and which audio file to use as source, and have the executable print out its best guess about what the voice on the recording says. I had some luck with: pocketsphinx_continuous -infile recording.wav 2> /dev/null But it aborts before the complete audio file is
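A minimal Python sketch of the wrapper the question asks for: pass a model and an audio file, get the recognizer's printed guess back. It assumes `pocketsphinx_continuous` is on the PATH and accepts `-hmm`, `-lm` and `-dict` alongside the `-infile` flag used above, as older pocketsphinx releases did; check the tool's own help output for your version. The function names and all file paths are hypothetical.

```python
import subprocess
from typing import List


def build_command(audio: str, hmm: str, lm: str, dictionary: str) -> List[str]:
    """Build the argument list for one pocketsphinx_continuous run."""
    return [
        "pocketsphinx_continuous",
        "-infile", audio,      # WAV file to decode
        "-hmm", hmm,           # acoustic model directory (e.g. a voxforge model)
        "-lm", lm,             # language model file
        "-dict", dictionary,   # pronunciation dictionary
    ]


def transcribe(audio: str, hmm: str, lm: str, dictionary: str) -> str:
    """Run the recognizer and return what it prints on stdout."""
    completed = subprocess.run(
        build_command(audio, hmm, lm, dictionary),
        stdout=subprocess.PIPE,
        stderr=subprocess.DEVNULL,  # same effect as `2> /dev/null`
        universal_newlines=True,
        check=True,
    )
    return completed.stdout.strip()


# Hypothetical paths, shown only to illustrate the resulting command line.
print(" ".join(build_command("recording.wav", "voxforge/hmm", "voxforge/lm.bin", "voxforge/words.dic")))
```

Passing the arguments as a list avoids shell quoting problems when paths contain spaces. This does not address the early abort described in the question.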

Using System.Speech to convert mp3 file to text

时光总嘲笑我的痴心妄想 submitted on 2019-12-02 17:20:53
I'm trying to use the speech recognition in .NET to recognize the speech of a podcast in an MP3 file and get the result as a string. All the examples I've seen use the microphone, but I don't want to use the microphone; I want to provide a sample MP3 file as my audio source. Can anyone point me to a resource or post an example? EDIT - I converted the audio file to a WAV file and tried this code on it, but it only extracts the first 68 words. public class MyRecognizer { public string ReadAudio() { SpeechRecognitionEngine sre = new SpeechRecognitionEngine(); Grammar gr = new DictationGrammar(

Speech to text conversion?

佐手、 submitted on 2019-12-02 13:08:03
For my iPhone application I need a speech-to-text library. Can anyone suggest a solution? After two days of digging, what I found is the Google speech-to-text API and the open source OpenEars library. Can anyone recommend one of these? Which one is better? Michael Levy: I don't think the Google APIs are intended for public use. They are services hosted by Google for Android and Chrome. People have reverse engineered the API and built some libraries to let people use it, but I wouldn't build a commercial application that relied on it (unless of course it was an Android or Chrome application). For

How to track rate of speech

梦想与她 submitted on 2019-12-02 10:39:19
I am developing an iPhone app that tracks rate of speech, and hoping to use Nuance Speechkit ( https://developer.nuance.com/public/Help/DragonMobileSDKReference_iOS/SpeechKit_Guide/Basics.html ). Is there a way to track rate of speech (e.g., updating WPM every few seconds) with the framework? Right now it seems to just do speech-to-text at the end of a long utterance, as opposed to every word or so (i.e., return partial results). There are easier ways; for example, you can use CMUSphinx with a phonetic recognizer to recognize just phonemes instead of words. It would work locally on the device and
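A minimal Python sketch of the "updating WPM every few seconds" idea, independent of any recognizer. It assumes something (partial results, or a phoneme or word recognizer) supplies one timestamp in seconds per recognized unit, in ascending order; the function name `rolling_wpm` and the window length are hypothetical choices.

```python
from collections import deque
from typing import Deque, Iterable, List, Tuple


def rolling_wpm(word_times: Iterable[float], window: float = 10.0) -> List[Tuple[float, float]]:
    """For each timestamp, return (time, rate per minute over the trailing window)."""
    recent: Deque[float] = deque()
    rates: List[Tuple[float, float]] = []
    for t in word_times:
        recent.append(t)
        # Drop units that fell out of the trailing window.
        while recent and recent[0] <= t - window:
            recent.popleft()
        # Early on, less than a full window has elapsed.
        elapsed = min(window, t)
        rate = len(recent) * 60.0 / elapsed if elapsed > 0 else 0.0
        rates.append((t, rate))
    return rates


# Four words at half-second spacing, then a pause before the fifth.
for when, rate in rolling_wpm([0.5, 1.0, 1.5, 2.0, 4.0], window=2.0):
    print(when, rate)
```

If the timestamps are per phoneme or syllable rather than per word, the same function yields a phoneme or syllable rate instead.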