AWS API to get Alexa voice

Posted by 拥有回忆 on 2019-12-12 15:11:43

Question


Does anyone know if there is an AWS API or similar that would allow me to send in text (or SSML) and get back the audio of Alexa 'speaking' it?
Crucially, I want the output in Alexa's 'voice'.

The options I have explored so far are:

  • AWS Polly
    This was my first port of call. It sounds promising and is simple to interact with, but the available voices do not include Alexa's voice (I think the GB voice for Alexa is 'Abbey').
    If I didn't need the voice to be Alexa's, I'd probably be using this idea
  • A simple Alexa skill lambda and API gateway
    My thinking is along the lines of a simple Alexa lambda with an intent that has one slot, where the SpeechletResponse has an OutputSpeech containing the value of the slot. An AWS API Gateway would be configured to invoke the lambda and return the result.
    I've not tried this, but I'm guessing the result through the API Gateway back to my client will be a JSON representation of the SpeechletResponse rather than an audio stream.
  • Something using AVS
    I'm currently hacking around with this idea, specifically the javaclient part of the alexa-avs-sample-app, but I don't know if I'm barking up the wrong tree.
    I've created an AVS product, and I've configured the javaclient to talk to it.
    At the moment its interaction is based on capturing audio from my mic as an audio stream and sending that to AVS (i.e. as if I had spoken to my Echo).
    So I could say "tell mySkill to say 'hello world'" and it will speak "hello world"
    But that's not quite what I want - I don't want to speak to anything, I want to programmatically call an API with some text to get the spoken audio stream.
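To make the second option concrete, here is a minimal sketch in Python of a skill lambda that echoes a slot value back as speech. The slot name "Phrase" and the intent name "SayIntent" are hypothetical, and the handler is not taken from any existing skill. It illustrates the point guessed at above: what the lambda hands back is a JSON document describing the speech, not an audio stream.

```python
import json


def lambda_handler(event, context):
    """Echo the value of a single slot back as plain-text output speech."""
    request = event.get("request", {})
    slots = request.get("intent", {}).get("slots", {})
    # "Phrase" is a hypothetical slot name chosen for this sketch.
    text = slots.get("Phrase", {}).get("value", "")
    return {
        "version": "1.0",
        "response": {
            "outputSpeech": {"type": "PlainText", "text": text},
            "shouldEndSession": True,
        },
    }


# A hand-written sample request; "SayIntent" is also a hypothetical name.
SAMPLE_EVENT = {
    "request": {
        "type": "IntentRequest",
        "intent": {
            "name": "SayIntent",
            "slots": {"Phrase": {"name": "Phrase", "value": "hello world"}},
        },
    }
}

if __name__ == "__main__":
    # Whatever sits in front of the lambda receives this JSON, not audio.
    print(json.dumps(lambda_handler(SAMPLE_EVENT, None), indent=2))
```

The text only becomes audio later, when the Alexa service renders the response, which is why putting an API Gateway in front of the lambda would not by itself produce a sound file.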

A similar question has already been asked, but at the moment there are no answers, and I think I've added more detail/analysis to my particular problem.


In response to one of the comments, I'll try to describe my specific use case for wanting the Alexa voice:

When developing an Alexa skill, you construct and populate the OutputSpeech in code within your lambda function. It is not possible to hear what the spoken output will sound like until you have deployed your lambda, and either test on a real device, or use the Voice Simulator section of the Test tab of the developer portal.

The problem I am trying to solve is that of creating good-sounding spoken responses for Alexa skills without the trial-and-error approaches described above. Deploying and using a real device is obviously long-winded. Using the Voice Simulator is better, but it is limited: the input field is very small (not good if you have a long sentence or paragraph that you would like spoken), and adding SSML there to enhance the spoken output makes for a poor UX and workflow.

I was looking at creating something that improved this UX and workflow, but the core requirement behind it is to hear Alexa's voice. Yes, of course I could use Polly, but if the use case is making Alexa skills easier to write, then hearing another voice is not much use. It is arguably misleading, because different voices pronounce words and punctuation differently, so you might need to add SSML phonetics for some words for one voice but not for another.
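For reference, the Polly fallback and the kind of SSML phonetics mentioned above might look like the following Python sketch. It is an illustration under assumptions, not a solution to the question: the helper names are made up for this example, "Amy" is just one of Polly's British English voices and is not Alexa's voice, and the synthesize function needs boto3 plus AWS credentials to run.

```python
from xml.sax.saxutils import escape, quoteattr


def phoneme(word, ipa):
    """Wrap a word in an SSML phoneme tag with an IPA pronunciation."""
    return f"<phoneme alphabet=\"ipa\" ph={quoteattr(ipa)}>{escape(word)}</phoneme>"


def speak(*parts):
    """Join fragments into one SSML document (fragments must already be SSML-safe)."""
    return "<speak>" + "".join(parts) + "</speak>"


def polly_request(ssml, voice_id="Amy"):
    """Build keyword arguments for Polly's synthesize_speech call."""
    return {
        "Text": ssml,
        "TextType": "ssml",
        "OutputFormat": "mp3",
        "VoiceId": voice_id,  # a Polly voice, NOT Alexa's voice
    }


def synthesize(ssml, out_path, voice_id="Amy"):
    """Write the synthesized audio to out_path (needs boto3 and AWS credentials)."""
    import boto3  # imported here so the helpers above work without it

    polly = boto3.client("polly")
    result = polly.synthesize_speech(**polly_request(ssml, voice_id))
    with open(out_path, "wb") as audio_file:
        audio_file.write(result["AudioStream"].read())


if __name__ == "__main__":
    print(speak("I like ", phoneme("pecan", "pɪˈkɑːn"), " pie."))
```

Because the phoneme override is tuned by ear against one voice, a pronunciation fixed this way for a Polly voice may be unnecessary, or wrong, once Alexa speaks the same text.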


Answer 1:


I think you are looking for this:

In the Alexa developer console, under the "Test" tab, you'll see three tabs: "Alexa Simulator", "Manual JSON" and "Voice & Tone". I believe the third one is what you are asking about. There you'll see the <speak> tag, and you can use SSML as well.



Source: https://stackoverflow.com/questions/47876250/aws-api-to-get-alexa-voice
