Watson STT Java - Varying results between Websockets Java and HTTP POST

不想你离开。 Posted on 2019-11-29 15:19:21

Question


I'm trying to build an app that takes a streamed audio input (e.g., a line-in microphone) and does Speech-to-Text using IBM Bluemix (Watson).

I lightly modified the example Java code found here. The example sends a WAV file, whereas I'm sending a FLAC file; this should be irrelevant.

The results are bad, very bad. This is what I get when using the Java Websockets code:

{
  "result_index": 0,
  "results": [
    {
      "final": true,
      "alternatives": [
        {
          "transcript": "it was six weeks ago today the terror ",
          "confidence": 0.92
        }
      ]
    }
  ]
}

Now, compare the above results with the ones below. These are the results from sending the same audio with cURL (HTTP POST) instead:

{
  "results": [
    {
      "alternatives": [
        {
          "confidence": 0.945,
          "transcript": "it was six weeks ago today the terrorists attacked the U. S. consulate in Benghazi Libya now we've obtained email alerts that were put out by the state department as the attack unfolded as you know four Americans were killed including ambassador Christopher Stevens "
        }
      ],
      "final": true
    },
    {
      "alternatives": [
        {
          "confidence": 0.942,
          "transcript": "sharyl Attkisson has our story "
        }
      ],
      "final": true
    }
  ],
  "result_index": 0
}

That's an almost flawless result.

Why the difference when using Websockets?


Answer 1:


The issue was fixed in version 3.0.0-RC1 of the Java SDK.

You can get the new jar from:

  1. Maven

    <dependency>
        <groupId>com.ibm.watson.developer_cloud</groupId>
        <artifactId>java-sdk</artifactId>
        <version>3.0.0-RC1</version>
    </dependency>
    
  2. Gradle

    'com.ibm.watson.developer_cloud:java-sdk:3.0.0-RC1'
    
  3. JAR

    Download the jar-with-dependencies (~1.4 MB)


Here is an example of how to recognize a FLAC audio file using WebSockets:

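// Create the service client and authenticate with the service credentials.
SpeechToText service = new SpeechToText();
service.setUsernameAndPassword("<username>", "<password>");

// Open the FLAC file to stream to the service.
FileInputStream audio = new FileInputStream("path-to-audio-file.flac");

// Keep recognizing past the first pause, emit interim results,
// and declare the audio format.
RecognizeOptions options = new RecognizeOptions.Builder()
  .continuous(true)
  .interimResults(true)
  .contentType(HttpMediaType.AUDIO_FLAC)
  .build();

// The callback is invoked for each set of results received over the WebSocket.
service.recognizeUsingWebSocket(audio, options, new BaseRecognizeCallback() {
  @Override
  public void onTranscription(SpeechResults speechResults) {
    System.out.println(speechResults);
  }
});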
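
With interimResults(true), the callback above fires repeatedly, and many of those results are interim hypotheses rather than final ones. If you only want the finished text, one option is to keep just the segments marked "final": true, as in the JSON responses in the question. The sketch below is a hypothetical helper, not part of the Watson SDK. It takes a plain transcript string and a boolean flag, on the assumption that you extract those two values from each result yourself (how SpeechResults exposes them is not shown here).

```java
import java.util.ArrayList;
import java.util.List;

/**
 * Hypothetical helper (not part of the Watson SDK): keeps only the
 * transcripts flagged as final and ignores interim hypotheses.
 */
public class FinalTranscriptCollector {
    private final List<String> finalSegments = new ArrayList<>();

    /** Record one result; interim results (isFinal == false) are dropped. */
    public void accept(String transcript, boolean isFinal) {
        if (isFinal && transcript != null) {
            // The service pads transcripts with a trailing space; trim it.
            finalSegments.add(transcript.trim());
        }
    }

    /** Number of final segments collected so far. */
    public int finalCount() {
        return finalSegments.size();
    }

    /** Join all final segments, in arrival order, with single spaces. */
    public String fullTranscript() {
        return String.join(" ", finalSegments);
    }

    public static void main(String[] args) {
        FinalTranscriptCollector collector = new FinalTranscriptCollector();
        collector.accept("it was six weeks ", false);           // interim, ignored
        collector.accept("it was six weeks ago today ", true);  // final, kept
        collector.accept("sharyl Attkisson has our story ", true);
        System.out.println(collector.fullTranscript());
    }
}
```

Inside onTranscription you would call accept(...) once per result and read fullTranscript() when the stream ends. Keeping the helper independent of SDK types means it can be tested without a network connection.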

FLAC file to test: https://s3.amazonaws.com/mozart-company/tmp/4.flac


NOTE: 3.0.0-RC1 is a release candidate. We will do a production release next week (3.0.1).



Source: https://stackoverflow.com/questions/36504879/watson-stt-java-varying-results-between-websockets-java-and-http-post
