The right way to use SSML with Web Speech API

-上瘾入骨i 2021-02-13 16:53

The Web Speech API specification says:

text attribute
This attribute specifies the text to be synthesized and spoken for thi

3 answers
  • 2021-02-13 17:10

    Bugs have been filed with Chromium for this issue:

    • 88072: Extension TTS API platform implementations need to support SSML
    • 428902: speechSynthesis.speak() doesn't strip unrecognized tags (fixed in Chrome as of September 2016)
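
Where an engine neither interprets SSML nor strips it, one fallback is to remove the markup yourself and speak plain text. The following is only a minimal sketch, assuming simple, well-formed SSML; `ssmlToPlainText` and `speakPlain` are hypothetical helper names, not part of the Web Speech API, and a regex is no substitute for a real XML parser.

```javascript
// Hypothetical helper (not part of the Web Speech API): reduce an SSML
// string to plain text. Assumes simple, well-formed markup; entities such
// as &amp; are left as-is and not decoded.
function ssmlToPlainText(ssml) {
  return ssml
    .replace(/<\?xml[^>]*\?>/g, '') // drop the XML declaration
    .replace(/<[^>]+>/g, '')        // drop every remaining tag
    .replace(/\s+/g, ' ')           // collapse runs of whitespace
    .trim();
}

// Browser-only part: speak the stripped text if speechSynthesis exists.
// Returns false when no speech synthesis is available (for example in Node).
function speakPlain(ssml) {
  if (typeof speechSynthesis === 'undefined') return false;
  speechSynthesis.speak(new SpeechSynthesisUtterance(ssmlToPlainText(ssml)));
  return true;
}
```

Calling `speakPlain(ssml)` loses every effect the tags were meant to have, so it only guarantees that the markup itself is never read aloud.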
  • 2021-02-13 17:11

    I have tested this: XML parsing seems to work properly on Windows, but it does not work properly on macOS.

  • 2021-02-13 17:20

    In Chrome 46 on Windows, the text is parsed as an XML document when the language is set to en; however, I see no evidence that the tags actually do anything. I heard no difference between the <emphasis> and non-<emphasis> sentences in this SSML:

    var msg = new SpeechSynthesisUtterance();
    msg.text = '<?xml version="1.0"?>\r\n<speak version="1.0" xmlns="http://www.w3.org/2001/10/synthesis" xml:lang="en-US"><emphasis>Welcome</emphasis> to the Bird Seed Emporium.  Welcome to the Bird Seed Emporium.</speak>';
    msg.lang = 'en';
    speechSynthesis.speak(msg);
    

    The <phoneme> tag was also completely ignored, which made my attempt to speak IPA fail.

    var msg = new SpeechSynthesisUtterance();
    msg.text='<?xml version="1.0" encoding="ISO-8859-1"?> <speak version="1.0" xmlns="http://www.w3.org/2001/10/synthesis" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.w3.org/2001/10/synthesis http://www.w3.org/TR/speech-synthesis/synthesis.xsd" xml:lang="en-US"> Pavlova is a meringue-based dessert named after the Russian ballerina Anna Pavlova. It is a meringue cake with a crisp crust and soft, light inside, usually topped with fruit and, optionally, whipped cream.  The name is pronounced <phoneme alphabet="ipa" ph="p&aelig;v&#712;lo&#650;v&#601;">...</phoneme> or <phoneme alphabet="ipa" ph="p&#593;&#720;v&#712;lo&#650;v&#601;">...</phoneme>, unlike the name of the dancer, which was <phoneme alphabet="ipa" ph="&#712;p&#593;&#720;vl&#601;v&#601;">...</phoneme> </speak>';
    msg.lang = 'en';
    speechSynthesis.speak(msg);
    

    This is despite the fact that the Microsoft speech API does handle SSML correctly. Here is a C# snippet, suitable for use in LINQPad:

    // Requires a reference to System.Speech and the namespaces
    // System.Speech.Synthesis and System.Text.RegularExpressions.
    var str = "Pavlova is a meringue-based dessert named after the Russian ballerina Anna Pavlova. It is a meringue cake with a crisp crust and soft, light inside, usually topped with fruit and, optionally, whipped cream.  The name is pronounced /pævˈloʊvə/ or /pɑːvˈloʊvə/, unlike the name of the dancer, which was /ˈpɑːvləvə/.";
    // Wrap every /ipa/ span in an SSML <phoneme> element.
    var regex = new Regex("/([^/]+)/");
    if (regex.IsMatch(str))
    {
        str = regex.Replace(str, "<phoneme alphabet=\"ipa\" ph=\"$1\">word</phoneme>");
        str.Dump(); // LINQPad: show the generated SSML
    }
    SpeechSynthesizer synth = new SpeechSynthesizer();
    PromptBuilder pb = new PromptBuilder();
    pb.AppendSsmlMarkup(str); // treat the string as SSML markup, not plain text
    synth.Speak(pb);
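
For comparison with the browser tests, the same /ipa/ to <phoneme> substitution can be sketched in JavaScript. This is an illustrative port of the regex above, not a tested recipe: `ipaToSsml` is a hypothetical name, the `word` placeholder is copied from the C# snippet, and it assumes that slashes appear in the text only as IPA delimiters.

```javascript
// Hypothetical helper mirroring the C# regex above: wrap each /ipa/ span
// in a <phoneme> element and return a complete SSML document string.
// Assumption: "/" occurs in the text only as an IPA delimiter.
function ipaToSsml(text, lang = 'en-US') {
  const escapeXml = (s) =>
    s.replace(/&/g, '&amp;')
     .replace(/</g, '&lt;')
     .replace(/>/g, '&gt;')
     .replace(/"/g, '&quot;');
  // Escape the ordinary text first so the tags added here are the only markup.
  const body = escapeXml(text).replace(
    /\/([^\/]+)\//g,
    (_, ipa) => `<phoneme alphabet="ipa" ph="${ipa}">word</phoneme>`
  );
  return '<?xml version="1.0"?>' +
    '<speak version="1.0" xmlns="http://www.w3.org/2001/10/synthesis" ' +
    `xml:lang="${lang}">${body}</speak>`;
}
```

The result can be assigned to `msg.text` as in the earlier snippets; whether the engine then honours the <phoneme> elements still depends on the browser and platform, as described above.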
    