speech-to-text

Watson Speech-to-Text register_callback returns only 400s

Posted by 杀马特。学长 韩版系。学妹 on 2019-12-11 07:34:32
Question: The Watson Speech-to-Text asynchronous HTTP interface allows one to register a callback URL through a call to register_callback. This call is clearly not working; for illustration, please see the snippet below.

    # Illustration of how I can't get the Watson Speech-to-Text
    # register_callback call to work.
    import urllib   # Python 2 style, to match urllib.urlencode below
    import requests

    r = requests.post(
        "https://stream.watsonplatform.net/speech-to-text/api/v1/register_callback?{0}".format(
            urllib.urlencode({"callback_url": callback_url})),
        auth=(watson_username, watson_password))  # the excerpt was cut here; completed from context
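One common cause of a 400 from register_callback is that the service cannot white-list the callback URL: Watson sends a GET request with a challenge_string query parameter to the URL being registered, and the server behind that URL must echo the string back as plain text. Below is a minimal sketch of such a challenge responder, using only the Python 3 standard library; the port and the choice of http.server over a web framework are assumptions.

    from http.server import BaseHTTPRequestHandler, HTTPServer
    from urllib.parse import urlparse, parse_qs

    class ChallengeHandler(BaseHTTPRequestHandler):
        def do_GET(self):
            # Watson's white-listing probe arrives as ?challenge_string=...
            query = parse_qs(urlparse(self.path).query)
            challenge = query.get("challenge_string", [""])[0]
            self.send_response(200)
            self.send_header("Content-Type", "text/plain")
            self.end_headers()
            # Echo the challenge back verbatim so the URL gets white-listed.
            self.wfile.write(challenge.encode("utf-8"))

    HTTPServer(("", 8080), ChallengeHandler).serve_forever()

If the challenge exchange never succeeds, register_callback keeps failing no matter how well-formed the registration request itself is.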

Java speech API null response

Posted by 痴心易碎 on 2019-12-11 06:53:07
Question: I am using the Java speech recognition API Jarvis, located at https://github.com/lkuza2/java-speech-api. However, when I run my application I get an error:

    Server returned HTTP response code: 400 for URL:
    https://www.google.com/speech-api/v1/recognize?xjerr=1&client=chromium&lang=en-US&maxresults=1

(This is the URL the API uses to get a response from Google.) I also created an API key as mentioned in the earlier posts and tried to use the v2 URL: www.google.com/speech…
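The v1 endpoint that Jarvis targets was shut down by Google, which is why every request now returns a 400 regardless of payload. The unofficial v2 endpoint still expects FLAC audio plus a developer API key. A hedged sketch of that v2 call follows; the key, file name, and sample rate are placeholders, and the endpoint is unofficial and heavily rate-limited.

    import requests

    API_KEY = "YOUR_API_KEY"  # placeholder: a key with the Speech API enabled
    URL = ("https://www.google.com/speech-api/v2/recognize"
           "?output=json&lang=en-US&key=" + API_KEY)

    with open("hello.flac", "rb") as f:
        r = requests.post(
            URL,
            headers={"Content-Type": "audio/x-flac; rate=16000"},  # rate must match the file
            data=f)
    print(r.status_code)
    print(r.text)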

Difference in word confidence in IBM Watson Speech to text

Posted by 南笙酒味 on 2019-12-11 06:26:23
Question: I am using the Node SDK to call the IBM Watson speech-to-text module. After sending an audio sample and receiving a response, the confidence values look weird:

    {
      "results": [
        {
          "word_alternatives": [
            {
              "start_time": 3.31,
              "alternatives": [
                { "confidence": 0.7563, "word": "you" },
                { "confidence": 0.0254, "word": "look" },
                { "confidence": 0.0142, "word": "Lou" },
                { "confidence": 0.0118, "word": "we" }
              ],
              "end_time": 3.43
            },
            ...
          ],
          "alternatives": [
            {
              "word_confidence": [
                [ "you", 0…
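The two numbers measure different things, which is why they diverge: the entries under word_alternatives are confusion-network hypotheses whose confidences behave like posterior probabilities for that time slot, while word_confidence under alternatives scores each word of the single best transcript. A small self-contained sketch that pulls the top hypothesis out of each slot (the embedded JSON is a trimmed copy of the response above, so the slot sum here falls short of 1.0 only because the remaining alternatives were cut from the excerpt):

    import json

    raw_json = """{ "results": [ { "word_alternatives": [ {
        "start_time": 3.31, "end_time": 3.43, "alternatives": [
          { "confidence": 0.7563, "word": "you" },
          { "confidence": 0.0254, "word": "look" },
          { "confidence": 0.0142, "word": "Lou" },
          { "confidence": 0.0118, "word": "we" } ] } ] } ] }"""

    response = json.loads(raw_json)
    for result in response["results"]:
        for slot in result.get("word_alternatives", []):
            total = sum(a["confidence"] for a in slot["alternatives"])
            best = max(slot["alternatives"], key=lambda a: a["confidence"])
            print("%.2f-%.2fs best=%r (%.4f); slot sums to ~%.2f"
                  % (slot["start_time"], slot["end_time"],
                     best["word"], best["confidence"], total))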

Swift 3.0 Speech to Text: Changing Color of Words

Posted by 瘦欲@ on 2019-12-11 04:48:56
Question: I'm trying to change the color of particular words (e.g. happy, sad, angry) in a text field populated from spoken input. It doesn't work if a word is spoken more than once. For example, if I say "I'm feeling happy because my cat is being nice to me. My brother is making me sad. I'm happy again.", only the first 'happy' changes color, and I'm not exactly sure why.

    func setTextColor(text: String) -> NSMutableAttributedString {
        let string: NSMutableAttributedString = NSMutableAttributedString(string…
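The usual cause of this symptom is searching with an API that returns only the first match (Swift's range(of:) behaves this way), so only the first 'happy' ever receives the color attribute. The fix, in any language, is to iterate over every occurrence; here is a sketch of that loop in Python for illustration (the Swift equivalent would repeatedly call range(of:options:range:) while advancing the search range past each match):

    import re

    def all_ranges(text, word):
        # Every occurrence, not just the first match.
        return [m.span() for m in re.finditer(re.escape(word), text)]

    sentence = ("I'm feeling happy because my cat is being nice to me. "
                "My brother is making me sad. I'm happy again.")
    for word in ("happy", "sad"):
        print(word, all_ranges(sentence, word))  # one (start, end) span per occurrence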

What audio formats are supported by Azure Cognitive Services' Speech Service (STT)?

Posted by 风格不统一 on 2019-12-11 03:08:13
Question: Bearing in mind that the Microsoft/Azure Cognitive Services Speech Service is currently going through a rationalisation exercise, as far as I can tell from

    https://docs.microsoft.com/en-us/azure/cognitive-services/speech-service/rest-apis#speech-to-text
    https://docs.microsoft.com/en-us/azure/cognitive-services/speech/home

only .wav binaries are accepted, with anything else giving the response:

    {"Message":"Unsupported audio format"}

Is there any other way to discover the supported formats?
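Short of the documentation, the practical check is empirical: send a short clip in the format under test and look for the "Unsupported audio format" message. A sketch against the short-audio REST endpoint follows; the region, key, and file name are placeholders, and at the time of the question WAV/PCM was the safe, documented choice.

    import requests

    REGION = "westus"                 # placeholder: your service region
    KEY = "YOUR_SUBSCRIPTION_KEY"     # placeholder

    url = ("https://{0}.stt.speech.microsoft.com/speech/recognition/"
           "conversation/cognitiveservices/v1?language=en-US".format(REGION))
    headers = {
        "Ocp-Apim-Subscription-Key": KEY,
        "Content-Type": "audio/wav; codecs=audio/pcm; samplerate=16000",
    }
    with open("sample.wav", "rb") as f:
        r = requests.post(url, headers=headers, data=f)
    print(r.status_code)
    print(r.text)  # an unsupported container/codec comes back as an error message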

Azure Speech To Text: Conversation Transcription UserId always returns $ref$

Posted by 喜欢而已 on 2019-12-11 01:07:42
Question: I am using the sample code to transcribe a conversation, but on the recognized event I always get $ref$ when calling e.Result.UserId. I use 16-bit samples, a 16 kHz sample rate, and a single channel (mono) for voice signatures, and 32-bit samples, a 32 kHz sample rate, and a single channel (mono) for transcribing conversations. All code is from: https://docs.microsoft.com/en-us/azure/cognitive-services/speech-service/how-to-use-conversation-transcription-service. Any ideas? Or a .wav sample…
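$ref$ is effectively the "speaker not yet identified" placeholder: until the service matches incoming audio against an enrolled voice signature, no user id is resolved. The question's sample is C#; below is a rough Python-SDK sketch of the same transcription loop for comparison. The class and property names follow recent versions of the azure-cognitiveservices-speech package and should be treated as an assumption, as should the key, region, and file name.

    import time
    import azure.cognitiveservices.speech as speechsdk

    speech_config = speechsdk.SpeechConfig(subscription="YOUR_KEY", region="YOUR_REGION")
    audio_config = speechsdk.audio.AudioConfig(filename="meeting.wav")
    transcriber = speechsdk.transcription.ConversationTranscriber(
        speech_config=speech_config, audio_config=audio_config)

    def on_transcribed(evt):
        # The speaker id stays a placeholder until a voice-signature match succeeds.
        print(evt.result.text, evt.result.speaker_id)

    transcriber.transcribed.connect(on_transcribed)
    transcriber.start_transcribing_async().get()
    time.sleep(30)  # crude wait; real code should key off the session_stopped event
    transcriber.stop_transcribing_async().get()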

How to extract the values that return from the createRecognizeStream() method?

Posted by 走远了吗. on 2019-12-11 00:42:58
Question: Using the Watson Speech to Text service, how do I extract the values returned by the createRecognizeStream() method? Here is a chunk of the sample code. I am trying to see the interim results in the terminal, but all I get is the following. How do I set the options so that the full results appear?

    { results: [ { alternatives: [Object], final: false } ], result_index: 0 }
    { results: [ { alternatives: [Object], final: false } ], result_index: 0 }
    { results: [ { alternatives: [Object], final: false } ]…

they…
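The alternatives: [Object] lines are not the service truncating anything: Node's console.log only prints nested objects to a shallow depth, so the values are present but hidden. Printing JSON.stringify(data, null, 2), or util.inspect(data, { depth: null }), reveals the full structure. For comparison, here is a sketch of the same full-JSON dump through the Watson Python SDK (the API key and file name are placeholders):

    import json
    from ibm_watson import SpeechToTextV1
    from ibm_cloud_sdk_core.authenticators import IAMAuthenticator

    stt = SpeechToTextV1(authenticator=IAMAuthenticator("YOUR_APIKEY"))  # placeholder key
    with open("audio.flac", "rb") as audio:
        result = stt.recognize(audio=audio, content_type="audio/flac",
                               timestamps=True, word_confidence=True).get_result()
    print(json.dumps(result, indent=2))  # every nested field, nothing collapsed to [Object]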

AndroidPocketSphinx: How does the system know which recognizer is invoked?

Posted by 爷，独闯天下 on 2019-12-10 17:34:37
Question: I am studying the source code of TestPocketSphinxAndAndroidASR.java, and the first thing that is not clear to me is how the system knows which recognizer (i.e. Google or CMUSphinx) to invoke. I can see that the recognition activity is started by:

    Intent intent = new Intent(RecognizerIntent.ACTION_RECOGNIZE_SPEECH);
    intent.putExtra(RecognizerIntent.EXTRA_LANGUAGE_MODEL,
            RecognizerIntent.LANGUAGE_MODEL_FREE_FORM);
    startActivityForResult(intent, VOICE_RECOGNITION_REQUEST_CODE);

but as far as I…

Google speech API - PHP does not return anything

Posted by Deadly on 2019-12-10 17:23:05
Question: My code is based on this PHP version of the full-duplex Google Speech API for speech-to-text: http://mikepultz.com/2013/07/google-speech-api-full-duplex-php-version/. I have a few FLAC files that work and give the array output described in Mike's post, but for a few other FLAC files nothing is returned at all. For example, http://gavyadhar.com/video/upload/Pantry_Survey.flac returns no output, yet the same code works for this FLAC file: http://gavyadhar.com/video/upload/pantry…
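When one FLAC works and another silently returns nothing, the first things to check are the file's sample rate and channel count, since scripts like Mike's hard-code a rate parameter and the API expects mono audio. A small pure-Python sketch that reads both values straight out of the FLAC STREAMINFO header (field offsets per the FLAC spec; treat this as a diagnostic, not a guarantee that the rate is the only problem):

    def flac_rate_and_channels(path):
        with open(path, "rb") as f:
            assert f.read(4) == b"fLaC", "not a FLAC file"
            f.read(4)              # metadata block header; STREAMINFO comes first
            info = f.read(34)      # the 34-byte STREAMINFO block
        # Sample rate: 20 bits starting at byte 10; channels: next 3 bits, stored minus one.
        rate = (info[10] << 12) | (info[11] << 4) | (info[12] >> 4)
        channels = ((info[12] >> 1) & 0x07) + 1
        return rate, channels

    print(flac_rate_and_channels("Pantry_Survey.flac"))  # e.g. (44100, 2) would explain the silence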

Improve Android speech recognition with additional context

Posted by 喜夏-厌秋 on 2019-12-10 17:00:51
Question: As I understand it, the Android API uses Google's speech recognition service for speech-to-text. I've studied the API and it is pretty simple: it just converts voice into an array of words. Is there any way to improve the recognition? I mean, if I know the context, can I send some parameters to the service in order to improve the recognition? Alternatively, is there any other speech recognition service that can be used for this purpose? Thanks in advance.

Answer 1: "Is any way to improve the recognition, I mean, if I know…
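The stock Android RecognizerIntent exposes no real context hook, but one way to supply such context (independent of whatever the truncated answer above goes on to recommend) is Google's Cloud Speech-to-Text API, a separate server-side, billable service that accepts phrase hints biasing recognition toward expected vocabulary. A sketch using the google-cloud-speech Python client; the credentials setup, file name, and phrases are placeholders.

    from google.cloud import speech

    client = speech.SpeechClient()  # assumes GOOGLE_APPLICATION_CREDENTIALS is configured
    with open("command.wav", "rb") as f:
        audio = speech.RecognitionAudio(content=f.read())
    config = speech.RecognitionConfig(
        encoding=speech.RecognitionConfig.AudioEncoding.LINEAR16,
        sample_rate_hertz=16000,
        language_code="en-US",
        # Phrase hints: the "context" the question asks about.
        speech_contexts=[speech.SpeechContext(phrases=["turn on the light", "set an alarm"])],
    )
    response = client.recognize(config=config, audio=audio)
    for result in response.results:
        print(result.alternatives[0].transcript)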