Question
I am trying to design this simple virtual assistant. I am very new to all of this.
import json
from ibm_watson import TextToSpeechV1
from ibm_watson import ApiException
from ibm_cloud_sdk_core.authenticators import IAMAuthenticator
import speech_recognition as sr

authenticator = IAMAuthenticator("(API KEY)")
text_to_speech = TextToSpeechV1(
    authenticator=authenticator
)
text_to_speech.set_service_url(
    'https://api.us-south.text-to-speech.watson.cloud.ibm.com/instances/113cd664-f07b-44fe-a11d-a46cc50caf84')

try:
    # Invoke a Text to Speech method and save the audio it returns
    with open('hello_world.wav', 'wb') as audio_file:
        audio_file.write(
            text_to_speech.synthesize(
                'Hello world',
                voice='en-US_AllisonVoice',
                accept='audio/wav'
            ).get_result().content)
except ApiException as ex:
    print("Method failed with status code " + str(ex.code) + ": " + ex.message)
I have just started using the IBM APIs. Starting from the code above, how can I use IBM's Speech to Text API from Python to ask a question with my voice and get a different spoken response from the IBM voice depending on what I said? Right now, whatever text is passed in where 'Hello world' is gets sent to IBM, and a WAV file is saved with that text read aloud in the selected Allison voice. What I want is more of a virtual assistant than anything. I would be interested in Python examples. I am not very good with indentation in Python.
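One possible shape for this, as a minimal sketch rather than a definitive implementation: record the question to a WAV file, send it to Speech to Text with `recognize`, pull the transcript out of the JSON result, build a reply from it, and pass that reply to `synthesize` as in the code above. The names `extract_transcript`, `answer_from_wav`, `make_reply`, `question.wav` and `reply.wav` are hypothetical, and the keys and service URLs are placeholders to be replaced with your own credentials. The IBM imports sit inside the function only so the transcript helper can be tried without the SDK installed.

```python
def extract_transcript(stt_result):
    """Join the top alternative of every result in a Speech to Text response."""
    pieces = []
    for result in stt_result.get("results", []):
        alternatives = result.get("alternatives", [])
        if alternatives:
            pieces.append(alternatives[0].get("transcript", "").strip())
    return " ".join(piece for piece in pieces if piece)


def answer_from_wav(stt_key, stt_url, tts_key, tts_url, make_reply,
                    question_path="question.wav", reply_path="reply.wav"):
    """Transcribe question_path, build a reply, and save it as spoken audio."""
    # Imported lazily (an assumption of this sketch) so extract_transcript
    # can be used and tested without the IBM SDK installed.
    from ibm_watson import SpeechToTextV1, TextToSpeechV1
    from ibm_cloud_sdk_core.authenticators import IAMAuthenticator

    speech_to_text = SpeechToTextV1(authenticator=IAMAuthenticator(stt_key))
    speech_to_text.set_service_url(stt_url)
    text_to_speech = TextToSpeechV1(authenticator=IAMAuthenticator(tts_key))
    text_to_speech.set_service_url(tts_url)

    # Send the recorded question to Speech to Text
    with open(question_path, "rb") as audio_file:
        stt_result = speech_to_text.recognize(
            audio=audio_file,
            content_type="audio/wav",
        ).get_result()

    # make_reply is any function that maps the recognized text to an answer
    reply = make_reply(extract_transcript(stt_result))

    # Have the Allison voice read the reply into a WAV file
    with open(reply_path, "wb") as audio_file:
        audio_file.write(
            text_to_speech.synthesize(
                reply,
                voice="en-US_AllisonVoice",
                accept="audio/wav",
            ).get_result().content)
    return reply
```

A call might look like `answer_from_wav(stt_key, stt_url, tts_key, tts_url, make_reply=lambda text: "You said " + text)`. Recording the microphone into `question.wav` and playing `reply.wav` are left to whichever audio library you already use.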
This was my attempt at combining the code from the IBM Text to Speech and Speech to Text APIs. It is based on my older Python text-to-speech assistant.
import json
import os
import wikipedia
import pyjokes
import speech_recognition as sr
from ibm_watson import TextToSpeechV1
from ibm_watson import SpeechToTextV1
from ibm_watson import ApiException
from ibm_cloud_sdk_core.authenticators import IAMAuthenticator

authenticator = IAMAuthenticator("API Key")
text_to_speech = TextToSpeechV1(
    authenticator=authenticator
)
authenticator = IAMAuthenticator('API KEY')
speech_to_text = SpeechToTextV1(
    authenticator=authenticator
)
voice = 'en-US_AllisonVoice'
speech_to_text.set_service_url('https://api.us-south.speech-to-text.watson.cloud.ibm.com/instances/7393db4a-82d8-40f8-a86d-09cb948589e2')
text_to_speech.set_service_url('https://api.us-south.text-to-speech.watson.cloud.ibm.com/instances/113cd664-f07b-44fe-a11d-a46cc50caf84')

r = sr.Recognizer()


def takeCommand():
    # Record one phrase from the default microphone
    with sr.Microphone() as source:
        print("Listening...")
        r.pause_threshold = .5
        audio = r.listen(source)
    try:
        print("Recognizing...")
        # Note: this still uses Google's recognizer, not IBM Speech to Text
        query = r.recognize_google(audio, language='en-us')
        print(f"User said: {query}\n")
    except Exception as e:
        print(e)
        text_to_speech.synthesize("I can't hear you sir.")
        print("I can't hear you sir.")
        return "None"
    return query


if __name__ == '__main__':
    clear = lambda: os.system('cls')
    # This function clears the console
    # before the assistant starts
    clear()
    while True:
        query = takeCommand().lower()
        # Everything the user says is stored in 'query'
        # and converted to lower case so commands
        # are easier to recognize
        try:
            # synthesize() only returns the audio in a response object;
            # nothing is saved or played by these calls
            if 'wikipedia' in query:
                text_to_speech.synthesize('Searching Wikipedia...')
                query = query.replace("wikipedia", "")
                results = wikipedia.summary(query, sentences=3)
                text_to_speech.synthesize("According to Wikipedia")
                print(results)
                text_to_speech.synthesize(results)
            elif "who made you" in query or "who created you" in query:
                text_to_speech.synthesize("I have been created by you sir.")
            elif 'tell me a joke' in query or "make me laugh" in query:
                text_to_speech.synthesize(pyjokes.get_joke())
        except ApiException as ex:
            print("Method failed with status code " + str(ex.code) + ": " + ex.message)
I was trying to make a simple ask-and-answer system here. Some commands just look something up on a website, like Wikipedia.
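The ask-and-answer part can be kept separate from the speech services, which makes it easier to try out without a microphone or API key. A minimal sketch, assuming simple substring matching on the recognized text: `choose_response`, `lookup_summary`, `tell_joke` and the fallback wording are hypothetical, and the two lookups are passed in as plain functions so they can be swapped for `wikipedia.summary` and `pyjokes.get_joke`.

```python
def choose_response(query, lookup_summary=None, tell_joke=None):
    """Return the text the assistant should speak for a recognized query."""
    query = query.lower().strip()
    if "wikipedia" in query:
        # Whatever remains after removing the keyword is treated as the topic
        topic = query.replace("wikipedia", "").strip()
        if lookup_summary is None:
            return "Searching Wikipedia for " + topic
        return "According to Wikipedia, " + lookup_summary(topic)
    if "who made you" in query or "who created you" in query:
        return "I have been created by you sir."
    if "tell me a joke" in query or "make me laugh" in query:
        if tell_joke is None:
            return "I do not know any jokes yet."
        return tell_joke()
    # Fallback when no keyword matches
    return "Sorry, I did not understand that."
```

The returned string is what would then be handed to `text_to_speech.synthesize(...)`, so each spoken input leads to a different spoken reply.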
Source: https://stackoverflow.com/questions/61489950/i-have-just-started-using-ibm-cloud-i-am-trying-to-get-a-different-output-from