I want to use Google text-to-speech in my Windows Forms application to read a label aloud. I added a reference to System.Speech. How can I make it read the label from a button click event?
UPDATE: Google's TTS API is no longer publicly available. The notes at the bottom about Microsoft's TTS are still relevant and provide equivalent functionality.
You can use Google's TTS API from your WinForm application by playing the response using a variation of this question's answer (it took me a while but I have a real solution):
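using System;
using System.IO;
using System.Linq;
using System.Net;
using System.Threading;
using System.Web;
using System.Windows.Forms;
using NAudio.Wave;

public partial class Form1 : Form
{
    // True while a playback thread is blocked in stop.WaitOne().
    private volatile bool waiting = false;
    // Signalled to release a waiting playback thread early.
    private readonly AutoResetEvent stop = new AutoResetEvent(false);

    public Form1()
    {
        InitializeComponent();
        // Release the playback thread if the form closes mid-playback.
        this.FormClosing += (sender, e) =>
        {
            if (waiting)
                stop.Set();
        };
    }

    private void ButtonClick(object sender, EventArgs e)
    {
        // The button's Tag holds the name of the label it should read.
        var clicked = sender as Button;
        if (clicked == null || clicked.Tag == null)
            return;
        var relatedLabel = this.Controls.Find(clicked.Tag.ToString(), true).FirstOrDefault() as Label;
        if (relatedLabel == null)
            return;
        // Build the URL on the UI thread, then download and play on a background thread.
        var url = "http://translate.google.com/translate_tts?q=" + HttpUtility.UrlEncode(relatedLabel.Text);
        var playThread = new Thread(() => PlayMp3FromUrl(url));
        playThread.IsBackground = true;
        playThread.Start();
    }

    public void PlayMp3FromUrl(string url)
    {
        using (Stream ms = new MemoryStream())
        {
            // Download the whole MP3 response into memory first.
            using (Stream stream = WebRequest.Create(url)
                .GetResponse().GetResponseStream())
            {
                byte[] buffer = new byte[32768];
                int read;
                while ((read = stream.Read(buffer, 0, buffer.Length)) > 0)
                {
                    ms.Write(buffer, 0, read);
                }
            }
            ms.Position = 0;
            // Decode the MP3 to PCM and hand it to the output device.
            using (WaveStream blockAlignedStream =
                new BlockAlignReductionStream(
                    WaveFormatConversionStream.CreatePcmStream(
                        new Mp3FileReader(ms))))
            {
                using (WaveOut waveOut = new WaveOut(WaveCallbackInfo.FunctionCallback()))
                {
                    waveOut.Init(blockAlignedStream);
                    waveOut.PlaybackStopped += (sender, e) =>
                    {
                        waveOut.Stop();
                    };
                    waveOut.Play();
                    // Wait up to 10 seconds so waveOut is not disposed while still playing.
                    waiting = true;
                    stop.WaitOne(10000);
                    waiting = false;
                }
            }
        }
    }
}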
NOTE: The above code requires NAudio to work (free/open source) and using statements for System.Web, System.Threading, and NAudio.Wave.
My Form1 has 2 controls on it:
label1
button1, with a Tag of label1 (used to bind the button to its label)
The above code can be simplified slightly if you have a separate event handler for each button/label combination, using something like (untested):
private void ButtonClick(object sender, EventArgs e)
{
    // This handler is wired to one specific button, so it reads label1 directly.
    var url = "http://translate.google.com/translate_tts?q=" + HttpUtility.UrlEncode(label1.Text);
    var playThread = new Thread(() => PlayMp3FromUrl(url));
    playThread.IsBackground = true;
    playThread.Start();
}
There are problems with this solution though (this list is probably not complete; I'm sure comments and real-world usage will find others):
The stop.WaitOne(10000); call in the first code snippet. The 10000 is a timeout in milliseconds, so at most 10 seconds of audio will play; it will need to be increased if your label takes longer than that to read. This is necessary because the current version of NAudio (v1.5.4.0) seems to have a problem determining when the stream is done playing. It may be fixed in a later version, or perhaps there is a workaround that I didn't take the time to find. One temporary workaround is to use a ParameterizedThreadStart that takes the timeout as a parameter to the thread. This would allow variable timeouts but would not technically fix the problem.
To answer the other side of your question:
The System.Speech.Synthesis.SpeechSynthesizer class is much easier to use, and you can count on it being reliably available (whereas the Google API could be gone tomorrow).
It is really as easy as adding a reference to the System.Speech assembly and:
public void SaySomething(string somethingToSay)
{
    var synth = new System.Speech.Synthesis.SpeechSynthesizer();
    synth.SpeakAsync(somethingToSay);
}
This just works.
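To tie this back to the original question (reading a label from a button click), here is a minimal sketch. It assumes the form has the label1 and button1 controls described above, created by the designer; the handler name Button1Click is just an illustrative choice. The synthesizer is kept as a field here, rather than a local, so it stays alive while SpeakAsync is still talking and can be disposed with the form.

```csharp
using System;
using System.Speech.Synthesis;
using System.Windows.Forms;

public partial class Form1 : Form
{
    // One synthesizer for the lifetime of the form.
    private readonly SpeechSynthesizer synth = new SpeechSynthesizer();

    public Form1()
    {
        InitializeComponent();
        // Assumption: button1 and label1 exist in the designer file.
        // Skip this line if the designer already wires the Click event.
        button1.Click += Button1Click;
        // Release the speech engine when the form is closed.
        this.FormClosed += (sender, e) => synth.Dispose();
    }

    private void Button1Click(object sender, EventArgs e)
    {
        // Cancel anything still queued, then read the label's current text.
        synth.SpeakAsyncCancelAll();
        synth.SpeakAsync(label1.Text);
    }
}
```

Because SpeakAsync returns immediately, the UI stays responsive without a separate thread; calling SpeakAsyncCancelAll first is optional and only prevents repeated clicks from queuing up.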
Trying to use the Google TTS API was a fun experiment but I'd be hard pressed to suggest it for production use, and if you don't want to pay for a commercial alternative, Microsoft's solution is about as good as it gets.
I know this question is a bit out of date, but Google recently published the Google Cloud Text-to-Speech API.
A .NET client for Google.Cloud.TextToSpeech can be found here: https://github.com/jhabjan/Google.Cloud.TextToSpeech.V1
Here is a short example of how to use the client:
GoogleCredential credentials =
    GoogleCredential.FromFile(Path.Combine(Program.AppPath, "jhabjan-test-47a56894d458.json"));
TextToSpeechClient client = TextToSpeechClient.Create(credentials);
SynthesizeSpeechResponse response = client.SynthesizeSpeech(
    new SynthesisInput()
    {
        Text = "Google Cloud Text-to-Speech enables developers to synthesize natural-sounding speech with 32 voices"
    },
    new VoiceSelectionParams()
    {
        LanguageCode = "en-US",
        Name = "en-US-Wavenet-C"
    },
    new AudioConfig()
    {
        AudioEncoding = AudioEncoding.Mp3
    }
);
string speechFile = Path.Combine(Directory.GetCurrentDirectory(), "sample.mp3");
File.WriteAllBytes(speechFile, response.AudioContent);