Question
Does anyone know if there is an AWS API or similar that would let me send in text (or SSML) and get back the audio of Alexa 'speaking' it?
Crucially, I want the output in Alexa's 'voice'.
The options I have explored so far are:
- AWS Polly
This was my first port of call. It sounds promising and is simple to interact with, but the available voices do not include Alexa's voice (I think the GB voice for Alexa is 'Abbey').
If I didn't need the voice to be Alexa's, I'd probably be using this option.
- A simple Alexa skill lambda and API gateway
My thinking is along the lines of a simple Alexa lambda with an intent that has one slot, where the SpeechletResponse has an OutputSpeech containing the value of the slot, plus an AWS API Gateway configured to invoke the lambda and return the result.
I've not tried this, but I'm guessing the result that comes back through the API Gateway to my client will be a JSON representation of the SpeechletResponse rather than an audio stream.
- Something using AVS
I'm currently hacking around with this idea, specifically the javaclient part of the alexa-avs-sample-app, but I don't know if I'm barking up the wrong tree.
I've created an AVS product, and I've configured the javaclient to talk to it.
At the moment its interaction is based on capturing audio from my mic as an audio stream and sending that to AVS (i.e. as if I had spoken to my Echo).
So I could say "tell mySkill to say 'hello world'" and it will speak "hello world".
But that's not quite what I want. I don't want to speak to anything; I want to programmatically call an API with some text and get back the spoken audio stream.
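To make the second option concrete, here is a minimal sketch of such a lambda, written in Python without the Alexa SDK. The slot name `phrase` and the handler name are assumptions chosen for illustration. The sketch only shows the shape of the response: what comes back is JSON containing the text to be spoken, not audio, which matches the guess above.

```python
import json
from xml.sax.saxutils import escape

SLOT_NAME = "phrase"  # hypothetical slot name, not taken from a real skill


def lambda_handler(event, context):
    """Echo the value of a single slot back as the skill's output speech."""
    request = event.get("request", {})
    slots = request.get("intent", {}).get("slots") or {}
    text = slots.get(SLOT_NAME, {}).get("value") or ""
    return {
        "version": "1.0",
        "response": {
            "outputSpeech": {
                "type": "SSML",
                # Escape the slot text so characters such as & or < stay valid SSML.
                "ssml": "<speak>{}</speak>".format(escape(text)),
            },
            "shouldEndSession": True,
        },
    }


if __name__ == "__main__":
    # Hand-written sample request carrying only the fields the handler reads.
    sample_event = {
        "request": {
            "type": "IntentRequest",
            "intent": {
                "name": "SayIntent",
                "slots": {SLOT_NAME: {"name": SLOT_NAME, "value": "hello world"}},
            },
        }
    }
    print(json.dumps(lambda_handler(sample_event, None), indent=2))
```

Invoking this through an API Gateway would hand the client this JSON document, so something else would still be needed to turn it into Alexa's voice.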
A similar question has already been asked, but at the moment there are no answers, and I think I've added more detail/analysis to my particular problem.
In response to one of the comments, I'll try to describe my specific use case for wanting the Alexa voice:
When developing an Alexa skill, you construct and populate the OutputSpeech in code within your lambda function. It is not possible to hear what the spoken output will sound like until you have deployed your lambda and either tested on a real device or used the Voice Simulator section of the Test tab of the developer portal.
The problem I am trying to solve is that of creating good-sounding spoken responses for Alexa skills without the trial-and-error approaches described above. Deploying and using a real device is obviously long-winded. Using the Voice Simulator is better, but it is limited: the input field is very small (not good if you have a long sentence or paragraph that you would like spoken), and adding SSML there to enhance the spoken output is not a great UX or workflow.
I was looking at creating something that improved this UX and workflow, but the core requirement behind it is to hear Alexa's voice. Yes, of course I could use Polly, but if this use case is about making Alexa skills easier to write, then hearing another voice is not much use. It is arguably misleading, because different voices pronounce words and handle punctuation differently, so you might need to add SSML phonetics for some words with one voice but not with another.
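For comparison, the Polly route mentioned above could be sketched roughly as follows. It uses boto3's `synthesize_speech` call and assumes AWS credentials and a region are already configured. The helper names and the choice of the British English voice `Amy` are assumptions for illustration, and the audio produced is a Polly voice, not Alexa's.

```python
def build_polly_request(text, voice_id="Amy"):
    """Build keyword arguments for Polly's synthesize_speech call.

    Input that starts with a <speak> tag is sent as SSML, anything else as plain text.
    """
    is_ssml = text.lstrip().startswith("<speak")
    return {
        "Text": text,
        "TextType": "ssml" if is_ssml else "text",
        "OutputFormat": "mp3",
        "VoiceId": voice_id,  # assumed voice; any available Polly voice ID works
    }


def synthesize_to_file(text, path, voice_id="Amy"):
    """Call Polly and write the returned MP3 audio stream to `path`."""
    import boto3  # imported here so the request builder works without boto3 installed

    polly = boto3.client("polly")
    response = polly.synthesize_speech(**build_polly_request(text, voice_id))
    with open(path, "wb") as audio_file:
        audio_file.write(response["AudioStream"].read())
    return path
```

A call such as `synthesize_to_file("<speak>hello world</speak>", "hello.mp3")` would write an MP3 file, which is handy for quick listening but does not remove the pronunciation mismatch described above.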
Answer 1:
I think you are looking for this:
In the Alexa developer console, under the "Test" tab, you'll see three tabs: "Alexa Simulator", "Manual JSON" and "Voice & Tone". I believe the third one is what you are asking for; there you'll see a <speak> tag, and you can also use SSML.
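If the SSML gets long, it can be easier to generate it than to type it by hand. The following is a small sketch of that idea. The helper names `speak` and `phoneme` are made up for illustration, and the IPA transcription is only an example value. The printed string can then be pasted into the "Voice & Tone" tab.

```python
from xml.sax.saxutils import escape, quoteattr


def phoneme(word, ipa):
    """Wrap a word in an SSML phoneme tag with an IPA pronunciation."""
    return "<phoneme alphabet=\"ipa\" ph={}>{}</phoneme>".format(
        quoteattr(ipa), escape(word)
    )


def speak(*parts):
    """Join already-formed SSML fragments inside a single <speak> element."""
    return "<speak>{}</speak>".format("".join(parts))


if __name__ == "__main__":
    print(speak("I like ", phoneme("pecan", "pɪˈkɑːn"), " pie."))
```

Plain text passed to `speak` is not escaped here, so any literal `&` or `<` characters in it would need to go through `escape` first.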
Source: https://stackoverflow.com/questions/47876250/aws-api-to-get-alexa-voice