Speaker Diarization from Audio file Android

此生再无相见时 submitted on 2020-01-06 05:25:16

Question


How can I separate the different speakers in an audio file on Android? Can the Google Cloud Speech API do this? (https://cloud.google.com/speech-to-text/docs/multiple-voices#speech-diarization-java)

Possible duplicate of: Speaker Diarization support in Google Speech API

I have tried the Google Cloud Speech-to-Text API demo but could not get it to work. Please see the error log from logcat below.

Code:

// Audio bytes to transcribe; fail early with a clear message if no file is available.
val content = requireNotNull(latestAudioFile?.readBytes()) { "No audio file to transcribe" }

// Load the service-account key bundled in the app's assets and close the stream afterwards.
val credentials = assets.open("XXXXX-6e000f81XXXX.json").use { stream ->
    GoogleCredentials.fromStream(stream)
        .createScoped(listOf("https://www.googleapis.com/auth/cloud-platform"))
}

val speechSettings = SpeechSettings.newBuilder()
    .setCredentialsProvider(FixedCredentialsProvider.create(credentials))
    .build()

SpeechClient.create(speechSettings).use { speechClient ->
    val recognitionAudio = RecognitionAudio.newBuilder()
        .setContent(ByteString.copyFrom(content))
        .build()

    // Ask the service to label each word with one of 1 to 4 speakers.
    val speakerDiarizationConfig = SpeakerDiarizationConfig.newBuilder()
        .setEnableSpeakerDiarization(true)
        .setMinSpeakerCount(1)
        .setMaxSpeakerCount(4)
        .build()

    val config = RecognitionConfig.newBuilder()
        .setEncoding(RecognitionConfig.AudioEncoding.LINEAR16)
        .setLanguageCode("en-US")
        .setSampleRateHertz(8000)
        .setDiarizationConfig(speakerDiarizationConfig)
        .build()

    val recognizeResponse = speechClient.recognize(config, recognitionAudio)

    // The linked documentation sample reads speaker tags from the last result,
    // so use that one rather than index 0.
    val alternative = recognizeResponse
        .getResults(recognizeResponse.resultsCount - 1)
        .getAlternatives(0)

    // Start the transcript with the first word, then group the rest by speaker tag.
    var wordInfo = alternative.getWords(0)
    var currentSpeakerTag = wordInfo.speakerTag
    val speakerWords = StringBuilder("Speaker ${wordInfo.speakerTag}: ${wordInfo.word}")

    for (i in 1 until alternative.wordsCount) {
        wordInfo = alternative.getWords(i)
        if (currentSpeakerTag == wordInfo.speakerTag) {
            // Same speaker: append the word to the current line.
            speakerWords.append(" ").append(wordInfo.word)
        } else {
            // Speaker changed: start a new line for the new speaker.
            speakerWords.append("\nSpeaker ${wordInfo.speakerTag}: ${wordInfo.word}")
            currentSpeakerTag = wordInfo.speakerTag
        }
    }
}
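
The grouping loop at the end of the snippet can be checked on its own, without calling the API. Below is a minimal sketch that assumes a hypothetical `TaggedWord` class standing in for the (word, speaker tag) pairs the API returns; `TaggedWord` and `formatBySpeaker` are illustrative names, not part of the Google Cloud library. Unlike the loop above, it also handles an empty word list instead of failing on index 0.

```kotlin
// Hypothetical stand-in for the word and speaker tag carried by the API's WordInfo.
data class TaggedWord(val word: String, val speakerTag: Int)

// Builds a transcript with one line per consecutive run of words from the same speaker.
fun formatBySpeaker(words: List<TaggedWord>): String {
    val transcript = StringBuilder()
    var currentTag: Int? = null
    for (w in words) {
        if (w.speakerTag != currentTag) {
            // Speaker changed (or first word): start a new "Speaker N:" line.
            if (transcript.isNotEmpty()) transcript.append('\n')
            transcript.append("Speaker ").append(w.speakerTag).append(": ").append(w.word)
            currentTag = w.speakerTag
        } else {
            // Same speaker: keep appending to the current line.
            transcript.append(' ').append(w.word)
        }
    }
    return transcript.toString()
}

fun main() {
    val words = listOf(
        TaggedWord("hello", 1),
        TaggedWord("there", 1),
        TaggedWord("hi", 2),
        TaggedWord("bye", 1)
    )
    println(formatBySpeaker(words))
}
```

With real responses, the list could be built by mapping each word of the chosen alternative to a `TaggedWord`, which keeps the formatting logic testable away from the network call.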

Error log: https://justpaste.it/1xgz4

Help, or suggestions for other ways to achieve speaker diarization, would be appreciated.

Source: https://stackoverflow.com/questions/59437181/speaker-diarization-from-audio-file-android
