iOS text to speech: What decides the default voice returned by [AVSpeechSynthesisVoice voiceWithLanguage]?

AVSpeechSynthesisVoice.voiceWithLanguage was introduced in iOS SDK 7.0. At that time, there was only one voice per language/locale.

Since iOS SDK 9.0, more voices have been added for each language/locale, so Apple introduced a new API, voiceWithIdentifier, that lets you get the specific voice you want.

My question is: what if we still use voiceWithLanguage on iOS 9 or above? What exactly does this API return? And more importantly, does the returned voice change between iOS versions, or even between different devices?

I've noticed that what voiceWithLanguage returns depends to some extent on the iOS speech setting "Settings -> General -> Accessibility -> Speech -> Voices -> English", but it is not an exact match. For example, for English (US), if you set the voice to "Fred", voiceWithLanguage returns "Fred", which is cool. But if you set the voice to "Nicky", voiceWithLanguage returns something other than "Nicky".

I'm asking because my application uses voiceWithLanguage. After users upgraded to iOS 12, they reported hearing a different voice. I believe voiceWithLanguage is returning a different voice after the upgrade to iOS 12, but I can't reproduce it on the same type of device.

Of course I can start using voiceWithIdentifier instead, but I'm curious about voiceWithLanguage.

[No solution yet] I ran into the same issue, too.

First, set a different voice via

Accessibility -> Speech -> Voices
or 
Accessibility -> VoiceOver -> Speech -> Voices

Then create the voice with

AVSpeechSynthesisVoice(language: language)

The setting has no effect on AVSpeechSynthesizer in iOS 12.0.1 (it works on iOS 11.x and iOS 12.0.0).

I also found two other things on iOS 12.0.1:

First, the Siri voice is no longer available when using

AVSpeechSynthesisVoice(identifier: "com.apple.ttsbundle.siri_male_ja-JP_compact")

Second, if I don't set a voice identifier, the female Siri voice for that locale speaks the text. This is also not affected by the speech settings. BTW, I still can't find any way to make the male Siri voice speak anything yet... haha

... does the returned voice change between iOS versions, or even between different devices?

I only discovered speech synthesis with iOS 12, so I can't give you any information about previous versions, but my understanding is that the default voice is the built-in voice for the device's supported language.

Because you use only the BCP 47 code specifying the language and locale when instantiating the AVSpeechSynthesisVoice class, your code takes the device's default voice, which many users may have customized.

... for example English US, if you set voice "Fred" voiceWithLanguage will return "Fred", which is cool. But if you set voice to "Nicky", voiceWithLanguage returns something else other than "Nicky".

I ran many tests (iOS 12.3.1, Swift 5.0, iPhone X, iPhone 7 Plus), including the one you mentioned, and it always returns the built-in voice for my device's supported language when I change it.

I couldn't reproduce your problem.
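A check like the one described above could be scripted roughly as follows. This is only a sketch: the helper name describeVoices(for:) is made up for illustration, and the output depends entirely on which voices are installed and selected on the device it runs on.

```swift
import AVFoundation

/// Hypothetical helper (not part of AVFoundation): lists the voice that
/// AVSpeechSynthesisVoice(language:) resolves to, followed by every
/// installed voice for the same BCP 47 language code.
func describeVoices(for language: String) -> [String] {
    var lines: [String] = []

    // The voice the language-only initializer picks on this device.
    if let defaultVoice = AVSpeechSynthesisVoice(language: language) {
        lines.append("default \(defaultVoice.language): \(defaultVoice.name) [\(defaultVoice.identifier)]")
    }

    // Every installed voice for the same language, for comparison.
    for voice in AVSpeechSynthesisVoice.speechVoices() where voice.language == language {
        lines.append("installed \(voice.language): \(voice.name) [\(voice.identifier)]")
    }
    return lines
}

describeVoices(for: "en-US").forEach { print($0) }
```

Running this before and after changing the voice in the speech settings, or on two devices with different iOS versions, shows whether the "default" line changes.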

... of course I can start to use voiceWithIdentifier instead.

That's exactly what I recommend. If the voice with the specified identifier isn't installed, fall back to the default one: this reduces the number of different voices your users may hear.
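A minimal sketch of that fallback, assuming the identifier-based initializer returns nil when the voice isn't installed. The helper name preferredVoice(identifier:fallbackLanguage:) is hypothetical, and AVSpeechSynthesisVoiceIdentifierAlex is used only as an example identifier; substitute whichever voice your app wants.

```swift
import AVFoundation

/// Hypothetical helper: prefer a specific voice by identifier and fall back
/// to the device's default voice for the language if it isn't available.
func preferredVoice(identifier: String, fallbackLanguage: String) -> AVSpeechSynthesisVoice? {
    // init?(identifier:) is failable, so a missing voice yields nil here.
    if let voice = AVSpeechSynthesisVoice(identifier: identifier) {
        return voice
    }
    // Language-only lookup: this is the voice that may vary per device/settings.
    return AVSpeechSynthesisVoice(language: fallbackLanguage)
}

/// Builds an utterance using the preferred voice (example identifier only).
func makeUtterance(_ text: String) -> AVSpeechUtterance {
    let utterance = AVSpeechUtterance(string: text)
    utterance.voice = preferredVoice(identifier: AVSpeechSynthesisVoiceIdentifierAlex,
                                     fallbackLanguage: "en-US")
    return utterance
}

// Usage: keep a strong reference to the synthesizer while it is speaking.
// let synthesizer = AVSpeechSynthesizer()
// synthesizer.speak(makeUtterance("Hello"))
```

With this shape, users who have the requested voice installed all hear the same one, and only the others fall back to whatever the language-only lookup returns.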

To conclude, using only the BCP 47 code ("en-US", "fr-FR"...) selects the device's built-in voice for that language, which can lead to different voices being heard depending on each user's customized settings: that is what decides the default voice returned by [AVSpeechSynthesisVoice voiceWithLanguage] (ObjC).

An introduction to choosing the right voice is available in this detailed WWDC summary if need be.

Source: https://stackoverflow.com/questions/52727301/ios-text-to-speech-what-decides-the-default-voice-returned-by-avspeechsynthesi

Tags: ios, text-to-speech, avspeechsynthesizer