I've been looking through the bazillion StackOverflow questions about capitalizing a word in Java, and none of them seem to care in the least about internationalizat
The only two-character digraph in which both characters are capitalized at once, and that you will probably encounter in a real-life program, is the Dutch IJ. Just handle it when the locale is Dutch. In the improbable worst case, there will be one or two more cases that you need to add later. You will not run into a new capitalization digraph every day, so it is not worth focusing on generalization here.
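A minimal sketch of that special case, assuming the word arrives as a plain String and the locale is passed in explicitly. The class and method names (DutchAwareCapitalizer, capitalize) are made up for illustration:

```java
import java.util.Locale;

// Hypothetical helper: capitalizes one word and treats a leading Dutch "ij" as a unit.
final class DutchAwareCapitalizer {
    static String capitalize(String word, Locale locale) {
        if (word == null || word.isEmpty()) {
            return word;
        }
        // Dutch only: a leading "ij" (in any case) becomes "IJ".
        if ("nl".equals(locale.getLanguage()) && word.regionMatches(true, 0, "ij", 0, 2)) {
            return "IJ" + word.substring(2);
        }
        // Everything else: upper-case the first code point as a String, with the locale.
        int firstLen = Character.charCount(word.codePointAt(0));
        return word.substring(0, firstLen).toUpperCase(locale) + word.substring(firstLen);
    }
}
```

With this sketch, capitalize("ijsselmeer", Locale.forLanguageTag("nl")) gives "IJsselmeer", while the same word under Locale.ENGLISH gives "Ijsselmeer".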
Note that, in general, it is not possible to get either the title case or the upper case form for an arbitrary language by converting character by character. Some lower case characters map to more than one upper case character, so in the generic case you have to work with String.
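The German sharp s is a common illustration of this: as a String it upper-cases to two characters, while the char-based method has no single character to return. A small sketch (the class and method names are made up for illustration):

```java
import java.util.Locale;

// Hypothetical demo class contrasting String-based and char-based upper-casing.
final class MultiCharUpperCase {
    // String-based conversion can grow the text: "\u00DF" (sharp s) becomes "SS".
    static String upperViaString(String s) {
        return s.toUpperCase(Locale.ROOT);
    }

    // char-based conversion can only return one char, so sharp s comes back unchanged.
    static char upperViaChar(char c) {
        return Character.toUpperCase(c);
    }
}
```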
But there is no problem with title case and locales. There is probably a small misunderstanding about how the toTitleCase() method works: it converts any character to title case, including one that is already in upper case.
For example, consider the dž character (U+01C6). Its upper case form is DŽ and its title case form is Dž:
System.out.println(Character.toUpperCase('\u01C6'));
DŽ
and
System.out.println(Character.toTitleCase('\u01C6'));
Dž
However, title-casing the upper case form gives the title case form as well:
System.out.println(Character.toTitleCase(Character.toUpperCase('\u01C6')));
Dž
So, if you convert to upper case with the locale first and then to title case, you get the correct code point, and applying title case to the result causes no problems, including for Turkish, etc.:
System.out.println(Character.toTitleCase("dž".toUpperCase().charAt(0)));
System.out.println(Character.toTitleCase("i".toUpperCase(Locale.forLanguageTag("tr")).charAt(0)));
Dž
İ
Note that simply taking the title case of a single character whenever it differs from its upper case form is not correct in the generic case.
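The upper-case-with-locale-then-title-case idea can be wrapped in a helper roughly like the one below. This is only a sketch: the names (LocaleCapitalizer, capitalize) are hypothetical, and lower-casing the tail of a multi-character expansion (so that sharp s becomes "Ss" rather than "SS") is an assumption of this sketch, not something stated above.

```java
import java.util.Locale;

// Hypothetical helper built on String.toUpperCase(Locale) plus Character.toTitleCase(int).
final class LocaleCapitalizer {
    static String capitalize(String s, Locale locale) {
        if (s == null || s.isEmpty()) {
            return s;
        }
        // Upper-case only the first code point, as a String, so the locale
        // and one-to-many mappings are honoured.
        int firstLen = Character.charCount(s.codePointAt(0));
        String upper = s.substring(0, firstLen).toUpperCase(locale);

        // Title-case the first code point of the upper-cased result.
        int head = upper.codePointAt(0);
        StringBuilder out = new StringBuilder(s.length() + 2);
        out.appendCodePoint(Character.toTitleCase(head));

        // Assumption: the rest of a multi-character expansion is lower-cased.
        out.append(upper.substring(Character.charCount(head)).toLowerCase(locale));

        // The remainder of the input is left untouched.
        out.append(s, firstLen, s.length());
        return out.toString();
    }
}
```

For example, capitalize("istanbul", Locale.forLanguageTag("tr")) yields a word starting with İ (U+0130), and capitalize("\u01C6ungla", Locale.ROOT) yields a word starting with the title case Dž (U+01C5).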
To summarize:
Note that there are still some capitalization cases that are context-aware, such as Irish prefixes, English ff names, etc. These require more than plain character or string processing, but I doubt you need to handle them for title generation in a program.
The problem is that the distinction between upper and lower case letters is very language-specific. Many languages, maybe most, do not have such a distinction at all.
Anyway, there is a Unicode FAQ: http://www.unicode.org/faq/casemap_charprop.html
And I guess there is a Unicode mapping table somewhere (something like ftp://ftp.unicode.org/Public/UNIDATA/UnicodeData.txt). So it's probably best to write your own conversion method.
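If you go down that road, reading the simple case mappings from one line of that table could look roughly like this. The sketch assumes the 15-field, semicolon-separated layout of UnicodeData.txt, with the simple upper, lower and title case mappings in the last three fields. The class name is hypothetical, and the sample line in the comment is for illustration and should be checked against the actual file. Multi-character mappings are kept in a separate file (SpecialCasing.txt) and are not covered here.

```java
// Hypothetical parser for the simple case mappings on a single UnicodeData.txt line.
final class UnicodeDataLine {
    // Returns {upper, lower, title} as code points.
    // Illustrative input (verify against the real file):
    // 01C6;LATIN SMALL LETTER DZ WITH CARON;Ll;0;L;<compat> 0064 017E;;;;N;LATIN SMALL LETTER D Z HACEK;;01C4;;01C5
    static int[] caseMappings(String line) {
        // The -1 limit keeps trailing empty fields.
        String[] fields = line.split(";", -1);
        if (fields.length != 15) {
            throw new IllegalArgumentException("expected 15 fields, got " + fields.length);
        }
        int codePoint = Integer.parseInt(fields[0], 16);
        // An empty upper or lower field means the character maps to itself.
        int upper = fields[12].isEmpty() ? codePoint : Integer.parseInt(fields[12], 16);
        int lower = fields[13].isEmpty() ? codePoint : Integer.parseInt(fields[13], 16);
        // An empty title field falls back to the upper case mapping.
        int title = fields[14].isEmpty() ? upper : Integer.parseInt(fields[14], 16);
        return new int[] { upper, lower, title };
    }
}
```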
Like you, I was unable to find a suitable method in the core Java API.
However, there does seem to be a locale-sensitive string-title-case method (UCharacter#toTitleCase) in the ICU library.
Looking at the source for the relevant ICU methods (UCharacter#toTitleCase and UCaseProps#toUpperOrTitle), there don't seem to be many locale-specific special cases for title-casing, so you might be able to get away with the following: