I need an API or library (preferably free) that will convert speech captured through a microphone into text (a string).
Additionally, I will need an API or library that can do text-to-speech.
I'd like to use C# and .NET, but other languages will suffice.
Thanks.
You can use CMU Sphinx, as it is an open and scalable solution, and I think it can be used on both the client and the server side:
http://cmusphinx.sourceforge.net/
If you are looking for a Microsoft desktop solution then you can use SAPI:
http://msdn.microsoft.com/en-us/magazine/cc163663.aspx
On the server side, you can use Microsoft Unified Communications, but do consider licensing as well:
http://www.microsoft.com/uc/en/gb/default.aspx
Update:
This thread also has some good references:
Here is a complete example using C# and System.Speech for converting from speech to text.
The code can be divided into 2 main parts:
1. Configuring the SpeechRecognitionEngine object (and its required elements).
2. Handling the SpeechRecognized and SpeechHypothesized events.
Step 1: Configuring the SpeechRecognitionEngine
_speechRecognitionEngine = new SpeechRecognitionEngine();
_speechRecognitionEngine.SetInputToDefaultAudioDevice();
_dictationGrammar = new DictationGrammar();
_speechRecognitionEngine.LoadGrammar(_dictationGrammar);
_speechRecognitionEngine.RecognizeAsync(RecognizeMode.Multiple);
At this point your object is ready to start transcribing audio from the microphone. You need to handle some events though, in order to actually get access to the results.
Step 2: Handling the SpeechRecognitionEngine Events
// Unsubscribe first so the handlers are never attached twice.
_speechRecognitionEngine.SpeechRecognized -= new EventHandler<SpeechRecognizedEventArgs>(SpeechRecognized);
_speechRecognitionEngine.SpeechHypothesized -= new EventHandler<SpeechHypothesizedEventArgs>(SpeechHypothesizing);
_speechRecognitionEngine.SpeechRecognized += new EventHandler<SpeechRecognizedEventArgs>(SpeechRecognized);
_speechRecognitionEngine.SpeechHypothesized += new EventHandler<SpeechHypothesizedEventArgs>(SpeechHypothesizing);

private void SpeechHypothesizing(object sender, SpeechHypothesizedEventArgs e)
{
    // real-time (partial) results from the engine
    string realTimeResults = e.Result.Text;
}

private void SpeechRecognized(object sender, SpeechRecognizedEventArgs e)
{
    // final answer from the engine
    string finalAnswer = e.Result.Text;
}
That’s it. If you want to use a pre-recorded .wav file instead of a microphone, you would use
_speechRecognitionEngine.SetInputToWaveFile(pathToTargetWavFile);
instead of
_speechRecognitionEngine.SetInputToDefaultAudioDevice();
There are a bunch of different options in these classes and they are worth exploring in more detail.
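As a rough sketch, the two steps above can be combined into a small console program. The class name DictationDemo, the CreateEngine helper, the console messages, and the extra SpeechRecognitionRejected handler are illustrative additions, not part of the original example. It assumes a Windows machine with a reference to System.Speech, a working microphone, and an installed recognizer for the current culture.

```csharp
using System;
using System.Speech.Recognition;

static class DictationDemo
{
    // Hypothetical helper: builds an engine with a dictation grammar and
    // console-printing handlers, but does not attach any audio input yet.
    public static SpeechRecognitionEngine CreateEngine()
    {
        var engine = new SpeechRecognitionEngine();
        engine.LoadGrammar(new DictationGrammar());

        // Partial guesses, raised while the user is still speaking.
        engine.SpeechHypothesized += (sender, e) =>
            Console.WriteLine("partial:  " + e.Result.Text);

        // Final result for a phrase, with the engine's confidence (0 to 1).
        engine.SpeechRecognized += (sender, e) =>
            Console.WriteLine("final:    " + e.Result.Text +
                              " (confidence " + e.Result.Confidence.ToString("F2") + ")");

        // Raised when the engine heard something but could not match it.
        engine.SpeechRecognitionRejected += (sender, e) =>
            Console.WriteLine("rejected: input was not recognized");

        return engine;
    }

    static void Main()
    {
        using (var engine = CreateEngine())
        {
            engine.SetInputToDefaultAudioDevice();
            engine.RecognizeAsync(RecognizeMode.Multiple); // keep listening after each phrase

            Console.WriteLine("Speak into the microphone; press Enter to stop.");
            Console.ReadLine();
            engine.RecognizeAsyncStop();
        }
    }
}
```

The handlers are attached before RecognizeAsync is called so that no early result is missed; a real application would probably route the text somewhere more useful than the console.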
See "Using C++ to call and use Windows Speech Recognition", which says:
Microsoft provides speech recognition engines for both client and server versions of Windows. Both can be programmed with C++ or with .NET languages. The traditional API for programming in C++ is known as SAPI. The .NET framework namespaces for client and server speech are System.Speech and Microsoft.Speech.
SAPI documentation - http://msdn.microsoft.com/en-us/library/ms723627(VS.85).aspx
The .NET namespace for client recognition is System.Speech - http://msdn.microsoft.com/en-us/library/system.speech.recognition.aspx. Windows Vista and 7 include the speech engine.
The .NET namespace for server recognition is Microsoft.Speech and the complete SDK for the 10.2 version is available at http://www.microsoft.com/downloads/en/details.aspx?FamilyID=1b1604d3-4f66-4241-9a21-90a294a5c9a4. The speech engine is a free download.
Lots of earlier questions have addressed this. See "Prototype based on speech recognition", "getting started with speech recognition and speech synthesis", and "SAPI and Windows 7 Problem" for examples.
For text to speech conversion you have to follow 3 steps:
1. Add a reference to System.Speech.
2. Add the using directives:
using System.Speech;
using System.Speech.Synthesis;
3. Add the following code, where textBox1 is the default name of a TextBox control.
SpeechSynthesizer speaker = new SpeechSynthesizer();
speaker.Rate = 1;
speaker.Volume = 100;
speaker.Speak(textBox1.Text);
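For a version that does not depend on a form, here is a minimal console sketch of the same three steps. It also lists the installed voices and renders the speech to a WAV file. The class name TtsDemo, the SpeakToWaveFile helper, and the file name hello.wav are made up for illustration; the sketch assumes Windows with at least one installed voice.

```csharp
using System;
using System.IO;
using System.Speech.Synthesis;

static class TtsDemo
{
    // Hypothetical helper: renders text to a WAV file and returns its size in bytes.
    public static long SpeakToWaveFile(string text, string path)
    {
        using (var speaker = new SpeechSynthesizer())
        {
            speaker.Rate = 1;      // valid range is -10 to 10
            speaker.Volume = 100;  // valid range is 0 to 100
            speaker.SetOutputToWaveFile(path);
            speaker.Speak(text);          // blocks until rendering has finished
            speaker.SetOutputToNull();    // releases the synthesizer's hold on the file
        }
        return new FileInfo(path).Length;
    }

    static void Main()
    {
        using (var speaker = new SpeechSynthesizer())
        {
            // Show which voices are available on this machine.
            foreach (InstalledVoice voice in speaker.GetInstalledVoices())
                Console.WriteLine(voice.VoiceInfo.Name);

            speaker.SetOutputToDefaultAudioDevice();
            speaker.Speak("Hello world");
        }

        long bytes = SpeakToWaveFile("Hello world", "hello.wav");
        Console.WriteLine(bytes + " bytes written to hello.wav");
    }
}
```

Speak is synchronous; in a UI application SpeakAsync is usually the better choice, so that the window stays responsive while the text is being read.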
"I'd like to use C# and .NET, but other languages will suffice."
If you are open to C++, check out Festival.
Every Windows version ships with a built-in DLL for text-to-speech: sapi.dll (SAPI, the Speech API). It is usually found under C:\Program Files\Common Files\Microsoft Shared\Speech\sapi.dll. I am not quite sure about the path, but in any case you can search for sapi.dll.
After adding a COM reference to it (it is listed as "Microsoft Speech Object Library" and exposes the SpeechLib namespace), you can use the following code snippet:
SpVoice oVoice = new SpVoice();
oVoice.Voice = oVoice.GetVoices("","").Item(0); // index 0 selects the first installed voice
oVoice.Volume = 50;
oVoice.Speak("hello world", SpeechVoiceSpeakFlags.SVSFDefault);
oVoice = null;
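If adding the COM reference is inconvenient, the same object can probably be created late-bound through its ProgID, SAPI.SpVoice. The sketch below is an alternative to the snippet above, not a replacement for it. The class name SapiLateBound is hypothetical, and the code assumes Windows and a project that supports the dynamic keyword.

```csharp
using System;

static class SapiLateBound
{
    [STAThread]
    static void Main()
    {
        // Look up the COM class by ProgID; this returns null if SAPI is not registered.
        Type voiceType = Type.GetTypeFromProgID("SAPI.SpVoice");
        if (voiceType == null)
        {
            Console.Error.WriteLine("SAPI is not registered on this machine.");
            return;
        }

        // Late binding: members are resolved at run time, so no interop assembly is needed.
        dynamic voice = Activator.CreateInstance(voiceType);
        voice.Volume = 50;
        voice.Speak("hello world", 0); // 0 corresponds to SpeechVoiceSpeakFlags.SVSFDefault
    }
}
```

The trade-off is that late binding gives up compile-time checking and IntelliSense, so the early-bound SpeechLib reference is usually more comfortable for anything beyond a quick test.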
Source: https://stackoverflow.com/questions/4677471/voice-speech-to-text