speech-to-text

How do I set credentials for Google Speech-to-Text without setting an environment variable?

Submitted by 你。 on 2019-12-11 17:09:58
Question: There is a C# example, client-libraries-usage-csharp, of using the library, and there is an example of how to set an environment variable: export GOOGLE_APPLICATION_CREDENTIALS="/home/user/Downloads/[FILE_NAME].json". How do I set credentials for Google Speech-to-Text without setting an environment variable? Somehow like this: var credentials = ...create(file.json); var speech = SpeechClient.Create(credentials); Answer 1: using Grpc.Auth; then string keyPath = "key.json"; GoogleCredential googleCredential; using
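
For comparison, a minimal Python sketch of the same idea (handing the service-account key to the client explicitly instead of relying on GOOGLE_APPLICATION_CREDENTIALS), assuming the google-cloud-speech and google-auth packages; the file path is a placeholder:

```python
from google.cloud import speech
from google.oauth2 import service_account

# Load the key file directly; "key.json" is only an example path.
credentials = service_account.Credentials.from_service_account_file("key.json")

# Pass the credentials to the client instead of using the environment variable.
client = speech.SpeechClient(credentials=credentials)

# speech.SpeechClient.from_service_account_json("key.json") is an equivalent shortcut.
```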

Assistant Entities and Different Speakers

Submitted by 喜欢而已 on 2019-12-11 16:08:13
Question: It is possible to differentiate among speakers/users with the Watson-Unity-SDK, as it seems to be able to return an array that identifies which words were spoken by which speakers in a multi-person exchange, but I cannot figure out how to execute it, particularly in the case where I am sending different utterances (spoken by different people) to the Assistant service to get a response accordingly. The code snippets for parsing Assistant's JSON output/response as well as OnRecognize and
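
For reference, speaker separation itself comes from the Speech to Text service's speaker_labels option rather than from Assistant; below is a minimal sketch of requesting it with the ibm-watson Python SDK (the Unity SDK exposes an analogous setting on its Speech to Text calls), with placeholder API key, service URL, and file name:

```python
from ibm_watson import SpeechToTextV1
from ibm_cloud_sdk_core.authenticators import IAMAuthenticator

# Placeholder credentials and URL, for illustration only.
stt = SpeechToTextV1(authenticator=IAMAuthenticator("YOUR_API_KEY"))
stt.set_service_url("https://api.us-south.speech-to-text.watson.cloud.ibm.com")

with open("conversation.wav", "rb") as audio:
    result = stt.recognize(
        audio=audio,
        content_type="audio/wav",
        speaker_labels=True,  # ask the service to tag words with a speaker id
    ).get_result()

# Each entry maps a time range to a speaker number; join these with the word
# timestamps in the transcript to see who said what before sending it on to Assistant.
for label in result.get("speaker_labels", []):
    print(label["speaker"], label["from"], label["to"])
```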

Fetch “transcript” values from the Google Speech API

Submitted by ◇◆丶佛笑我妖孽 on 2019-12-11 12:26:07
Question: I am trying to fetch the "transcript" value from the following result: { transcript: "1 2 3 4" confidence: 0.902119 words { start_time { nanos: 200000000 } end_time { nanos: 700000000 } word: "1" } words { start_time { nanos: 700000000 } end_time { nanos: 900000000 } word: "2" } words { start_time { nanos: 900000000 } end_time { seconds: 1 } word: "3" } words { start_time { seconds: 1 } end_time { seconds: 1 nanos: 300000000 } word: "4" } } The code I am writing to get it is: for result in
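
A minimal sketch, assuming response is the object returned by client.recognize(config, audio) from the google-cloud-speech library, of reading the transcript, confidence, and per-word timings shown in the dump above:

```python
for result in response.results:
    alternative = result.alternatives[0]          # best hypothesis
    print("transcript:", alternative.transcript)
    print("confidence:", alternative.confidence)
    # The words list is populated when word time offsets are enabled in the config.
    for word_info in alternative.words:
        print(word_info.word, word_info.start_time, word_info.end_time)
```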

Setting Up PocketSphinx in Mac OS X

Submitted by 三世轮回 on 2019-12-11 11:24:42
Question: I am running Enthought Python 2.7 as well as the default Python 2.7, with Xcode 4.5.1 on Mac OS 10.8.2. I am trying to develop a speech-to-text converter in Python. I use Enthought Python as it allows me to record at 16000 Hz, 1 channel using pyaudio, which is needed for pocketsphinx to work. I am trying to set up pocketsphinx using brew install pocketsphinx. I get the following errors. Even manual installation using make and using the default Python results in the same errors. Using brew doctor, I get How
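
On the recording requirement mentioned above (16000 Hz, 1 channel for pocketsphinx), here is a minimal pyaudio sketch of opening such a stream, assuming pyaudio is installed; the capture length is arbitrary:

```python
import pyaudio

# Open a 16 kHz, mono, 16-bit input stream, which is the format pocketsphinx expects.
pa = pyaudio.PyAudio()
stream = pa.open(format=pyaudio.paInt16,
                 channels=1,
                 rate=16000,
                 input=True,
                 frames_per_buffer=1024)

# Read roughly 5 seconds of audio as raw PCM chunks.
frames = [stream.read(1024) for _ in range(int(16000 / 1024 * 5))]

stream.stop_stream()
stream.close()
pa.terminate()
```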

Pocketsphinx recognizes random phrases in silence

Submitted by 百般思念 on 2019-12-11 10:48:05
Question: I have pocketsphinx installed on a Raspberry Pi with a microphone connected to it. When I run pocketsphinx_continuous using the command pocketsphinx_continuous -inmic yes -dict dict.dict -hmm /home/pi/zero_ru.cd_cont_4000 -jsgf mygrammar.gram it starts to recognize random phrases (but in most cases the same phrase) when I am not speaking, and when I do speak, the result is the same. I use an acoustic model for the Russian language. Please, I need your help. Answer 1: You need to use keyword spotting mode. Pocketsphinx
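
A minimal sketch of keyword-spotting mode with the pocketsphinx Python bindings (pocketsphinx_continuous has matching -keyphrase and -kws_threshold options on the command line); the keyphrase and threshold here are placeholders to tune, the model paths are the ones from the question, and the keyphrase words must exist in the dictionary:

```python
from pocketsphinx import LiveSpeech

# Keyword spotting: listen only for a fixed phrase instead of free-form decoding,
# so silence no longer produces random hypotheses.
speech = LiveSpeech(
    lm=False,                              # disable the language model
    keyphrase="oh mighty computer",        # placeholder keyphrase
    kws_threshold=1e-20,                   # detection threshold, tune per microphone
    hmm="/home/pi/zero_ru.cd_cont_4000",   # acoustic model path from the question
    dic="dict.dict",                       # dictionary path from the question
)
for phrase in speech:
    print(phrase.segments(detailed=True))
```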

Upload a file but set Content-Type

Submitted by 眉间皱痕 on 2019-12-11 10:39:11
Question: I got Watson Speech-to-Text working on the web. I am now trying to do it in react-native but am getting errors on the file upload part. I am using the HTTPS Watson API. I need to set the Content-Type, otherwise Watson returns an error response. However, in react-native, for the file upload to work, we seem to need to set 'Content-Type' to 'multipart/form-data'. Is there any way to upload a file in react-native while setting Content-Type to 'audio/aac'? The error the Watson API gives me if I set
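
The Watson Speech to Text HTTP endpoint accepts the raw audio bytes as the request body with an explicit Content-Type header rather than a multipart form. Below is a rough Python sketch of that request shape (the question itself is about react-native; this only illustrates what the service expects over HTTPS), with placeholder URL, API key, and file name:

```python
import requests

API_KEY = "YOUR_API_KEY"  # placeholder
URL = "https://api.us-south.speech-to-text.watson.cloud.ibm.com/v1/recognize"  # placeholder instance URL

with open("clip.aac", "rb") as audio:
    resp = requests.post(
        URL,
        headers={"Content-Type": "audio/aac"},  # tell Watson the real audio format
        data=audio,                             # raw body, not multipart/form-data
        auth=("apikey", API_KEY),               # IAM API key via basic auth
    )
print(resp.status_code, resp.json())
```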

Can I use the Web Speech API in a Chrome app?

Submitted by ↘锁芯ラ on 2019-12-11 08:44:32
Question: Can I use the Web Speech API in a Chrome app? If anyone has any knowledge, please let me know. Thank you. Answer 1: Chrome Apps have a special TTS API available to them. According to this bug report, the Web Speech API is not available to extensions, but it doesn't say anything about packaged apps. Your best bet is probably to just try it and see if it works. Answer 2: According to the Web API section of the Chrome Apps documentation: In addition to the chrome.* APIs, extensions can use all the APIs that the

Pocketsphinx Android demo error: “Failed to init recognizer java.io.IOException: Failed to initialize recorder. Microphone might already be in use.”

Submitted by 我怕爱的太早我们不能终老 on 2019-12-11 08:15:31
Question: I have been using the Pocketsphinx Android demo and get the error: "Failed to init recognizer java.io.IOException: Failed to initialize recorder. Microphone might already be in use." What does the error mean and what can I do to fix it? Answer 1: If you upgrade your Android OS to 6, or it is already 6, you have to request the permission at runtime. The recorder permission in the Android manifest is not enough from Android 6 onward. It will give this error if you do not request the permission. Source: https://stackoverflow.com

flac: “ERROR: input file has an ID3v2 tag” (it doesn't)

Submitted by 心已入冬 on 2019-12-11 08:09:45
Question: I'm trying to build a rather long-winded chain of programs and libraries that culminates in using a speech-to-text API to turn an mp3 file into human-readable text. I was surprised to find very few APIs that do this online - the only working thing I found was the speech2text project: https://github.com/taf2/speech2text which hooks into Google's unofficial Speech-To-Text API. This actually worked at first. I did a few manual conversions and was pleased with the results. However, since attempting

How to set a speech callback when using SpeakCFString?

Submitted by ε祈祈猫儿з on 2019-12-11 07:36:13
Question: I’m trying to use the C CoreFoundation interface to the Speech Synthesis Manager. How do you register a speech callback (such as kSpeechSpeechDoneCallBack or kSpeechTextDoneCallBack)? I know how to use the old deprecated SetSpeechInfo function; how do you do it with the new SetSpeechProperty? When I try to use it, it causes “Segmentation fault: 11” instead of calling the function I registered. According to Speech-Channel Properties, I think you’re supposed to pass in a long CFNumberRef whose