Watson speech to text live stream C# code example

Submitted by 烂漫一生 on 2019-12-13 17:15:50

Question


I'm trying to build an app in C# that will take an audio stream (from a file for now, but later it will be a web stream) and return transcriptions from Watson in real time as they become available, similar to the demo at https://speech-to-text-demo.mybluemix.net/

Does anyone know where I can find some sample code, preferably in C#, that could help me get started?

I tried this, based on the limited documentation at https://github.com/watson-developer-cloud/dotnet-standard-sdk/tree/development/src/IBM.WatsonDeveloperCloud.SpeechToText.v1, but I get a BadRequest error when I call RecognizeWithSession. I'm not sure if I'm on the right path here.

    static void StreamingRecognize(string filePath)
    {
        SpeechToTextService _speechToText = new SpeechToTextService();
        _speechToText.SetCredential(<user>, <pw>);
        var session = _speechToText.CreateSession("en-US_BroadbandModel");

        //returns initialized
        var recognizeStatus = _speechToText.GetSessionStatus(session.SessionId);

        //  set up observe
        var taskObserveResult = Task.Factory.StartNew(() =>
        {
            var result = _speechToText.ObserveResult(session.SessionId);
            return result;
        });

        //  get results
        taskObserveResult.ContinueWith((antecedent) =>
        {
            var results = antecedent.Result;
        });

        var metadata = new Metadata();
        metadata.PartContentType = "audio/wav";
        metadata.DataPartsCount = 1;
        metadata.Continuous = true;
        metadata.InactivityTimeout = -1;
        var taskRecognizeWithSession = Task.Factory.StartNew(() =>
        {
            using (FileStream fs = File.OpenRead(filePath))
            {
                _speechToText.RecognizeWithSession(session.SessionId, "audio/wav", metadata, fs, "chunked");
            }
        });
    }

Answer 1:


Each Watson Developer Cloud SDK, including the one for your programming language, has a folder called Examples, which contains an example of using Speech to Text.

The SDK supports WebSockets, which fits your requirement of transcribing in near real time rather than uploading a complete audio file.

        static void Main(string[] args)
        {
            Transcribe();
            Console.WriteLine("Press any key to exit");
            Console.ReadLine();
        }

        // http://www.ibm.com/smarterplanet/us/en/ibmwatson/developercloud/doc/getting_started/gs-credentials.shtml
        static String username = "<username>";
        static String password = "<password>";

        static String file = @"c:\audio.wav";

        static Uri url = new Uri("wss://stream.watsonplatform.net/speech-to-text/api/v1/recognize");

        // these should probably be private classes that use DataContractJsonSerializer 
        // see https://msdn.microsoft.com/en-us/library/bb412179%28v=vs.110%29.aspx
        // or the ServiceState class at the end
        static ArraySegment<byte> openingMessage = new ArraySegment<byte>( Encoding.UTF8.GetBytes(
            "{\"action\": \"start\", \"content-type\": \"audio/wav\", \"continuous\" : true, \"interim_results\": true}"
        ));
        static ArraySegment<byte> closingMessage = new ArraySegment<byte>(Encoding.UTF8.GetBytes(
            "{\"action\": \"stop\"}"
        ));
        // ... more in the link below

  • Access the C# SDK here.
  • See the API reference for more information here.
  • A full example of using Speech to Text, by an IBM Watson developer, is here.
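
The snippet above stops before the part that actually talks to the service. As a rough sketch of how the remaining flow might look (this is not the code from the linked example), the following uses `System.Net.WebSockets.ClientWebSocket` to send the start message, stream the file in binary frames, send the stop message, and print each JSON message as it arrives. The names `StreamingSketch`, `TextFrame`, `SendAudioAsync`, `ReceiveUntilListeningAsync` and `TranscribeAsync`, the 8 KB chunk size, passing the credentials through `Options.Credentials`, and treating a `"state": "listening"` message as the end-of-turn marker are all assumptions made for illustration; check them against the API reference before relying on them.

```csharp
using System;
using System.IO;
using System.Net;
using System.Net.WebSockets;
using System.Text;
using System.Threading;
using System.Threading.Tasks;

// Illustrative sketch only; every name in this class is hypothetical.
public static class StreamingSketch
{
    // Wraps a JSON control message as a UTF-8 payload for a text frame.
    public static ArraySegment<byte> TextFrame(string json)
    {
        return new ArraySegment<byte>(Encoding.UTF8.GetBytes(json));
    }

    // Sends the audio in fixed-size binary frames, then the "stop" message.
    public static async Task SendAudioAsync(WebSocket ws, Stream audio, int chunkSize = 8192)
    {
        var buffer = new byte[chunkSize];
        int read;
        while ((read = await audio.ReadAsync(buffer, 0, buffer.Length)) > 0)
        {
            await ws.SendAsync(new ArraySegment<byte>(buffer, 0, read),
                WebSocketMessageType.Binary, true, CancellationToken.None);
        }
        await ws.SendAsync(TextFrame("{\"action\": \"stop\"}"),
            WebSocketMessageType.Text, true, CancellationToken.None);
    }

    // Prints each complete text message. Returns when the server closes the
    // socket or (assumption) reports {"state": "listening"}.
    public static async Task ReceiveUntilListeningAsync(WebSocket ws)
    {
        var buffer = new byte[8192];
        while (ws.State == WebSocketState.Open)
        {
            using (var message = new MemoryStream())
            {
                WebSocketReceiveResult result;
                do
                {
                    result = await ws.ReceiveAsync(new ArraySegment<byte>(buffer), CancellationToken.None);
                    if (result.MessageType == WebSocketMessageType.Close) return;
                    message.Write(buffer, 0, result.Count);
                } while (!result.EndOfMessage);

                string json = Encoding.UTF8.GetString(message.ToArray());
                Console.WriteLine(json);

                // Crude substring check; a real client would deserialize the JSON.
                if (json.Contains("\"state\"") && json.Contains("\"listening\"")) return;
            }
        }
    }

    public static async Task TranscribeAsync(Uri url, string username, string password, string file)
    {
        using (var ws = new ClientWebSocket())
        {
            // Assumption: the service accepts these credentials via HTTP authentication.
            ws.Options.Credentials = new NetworkCredential(username, password);
            await ws.ConnectAsync(url, CancellationToken.None);

            // Same opening message as in the answer's snippet.
            await ws.SendAsync(TextFrame(
                "{\"action\": \"start\", \"content-type\": \"audio/wav\", \"continuous\" : true, \"interim_results\": true}"),
                WebSocketMessageType.Text, true, CancellationToken.None);
            await ReceiveUntilListeningAsync(ws); // wait until the service is ready

            using (var audio = File.OpenRead(file))
            {
                // One sender and one receiver may run on the socket at the same time.
                await Task.WhenAll(SendAudioAsync(ws, audio), ReceiveUntilListeningAsync(ws));
            }

            await ws.CloseAsync(WebSocketCloseStatus.NormalClosure, "done", CancellationToken.None);
        }
    }
}
```

With the static fields from the answer, `Transcribe()` could then call `StreamingSketch.TranscribeAsync(url, username, password, file).Wait();`. Because `SendAudioAsync` takes any `Stream`, the same loop should also accept a web stream in place of the `FileStream`, as long as the content type in the start message matches the audio being sent.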


Source: https://stackoverflow.com/questions/46179447/watson-speech-to-text-live-stream-c-sharp-code-example
