It is quite easy to format and parse Java Date (or Calendar) objects using an instance of DateFormat; for example, the current date can be formatted as a short, localized date.
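A minimal sketch of that kind of call, assuming only the standard java.text API. The class name ShortDateDemo and the helper formatShort are made up for illustration, and the printed text depends on the locale data shipped with the JVM:

```java
import java.text.DateFormat;
import java.util.Date;
import java.util.Locale;

public class ShortDateDemo {
    // Formats the given date in the SHORT style of the given locale.
    static String formatShort(Date date, Locale locale) {
        DateFormat formatter = DateFormat.getDateInstance(DateFormat.SHORT, locale);
        return formatter.format(date);
    }

    public static void main(String[] args) {
        Date now = new Date();
        // Same instant, two locales; the exact output varies by JVM locale data.
        System.out.println(formatShort(now, Locale.getDefault()));
        System.out.println(formatShort(now, Locale.GERMANY));
    }
}
```

Formatting is the easy part; the pattern behind the formatter is not exposed by DateFormat itself.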
The code below accepts a Locale instance and returns the locale-specific date format pattern.
import java.text.DateFormat;
import java.text.SimpleDateFormat;
import java.util.Locale;

public static String getLocaleDatePattern(Locale locale) {
    // Fall back to a default pattern when no usable locale is given
    // (getLanguage() never returns null; it returns "" when no language is set)
    if (locale == null || locale.getLanguage().isEmpty()) {
        return "MM/dd/yyyy";
    }
    // Fetch the locale-specific short date pattern
    String localeDatePattern = ((SimpleDateFormat) DateFormat.getDateInstance(
            DateFormat.SHORT, locale)).toPattern();
    // Special case: Chinese (Hong Kong), whose date format is yy'年'M'月'd'日'
    if (locale.toString().equalsIgnoreCase("zh_hk")) {
        // Date format the application expects for the Chinese (Hong Kong) locale
        // (note: as written, the quotes make MM literal text in SimpleDateFormat)
        return "yyyy'MM'dd";
    }
    // Normalize d|M|y (or Gy) to dd|MM|yyyy in the locale date pattern
    localeDatePattern = localeDatePattern.replaceAll("d{1,2}", "dd").replaceAll(
            "M{1,2}", "MM").replaceAll("y{1,4}|Gy", "yyyy");
    // Remove all blank spaces from the locale date pattern
    localeDatePattern = localeDatePattern.replace(" ", "");
    // Trim any extra characters beyond the expected length
    if (localeDatePattern.length() > 10) {
        // Keep the standard length expected by the application
        localeDatePattern = localeDatePattern.substring(0, 10);
    }
    return localeDatePattern;
}
Since it's just the locale information you're after, I think what you'll have to do is locate the file which the JVM (OpenJDK or Harmony) actually uses as input to the whole Locale thing and figure out how to parse it. Or just use another source on the web (surely there's a list somewhere). That'll save those poor translators.
For those still using Java 7 and older:
You can use something like this:
DateFormat formatter = DateFormat.getDateInstance(DateFormat.SHORT, Locale.getDefault());
String pattern = ((SimpleDateFormat)formatter).toPattern();
String localPattern = ((SimpleDateFormat)formatter).toLocalizedPattern();
This works because the DateFormat returned from getDateInstance() is, in practice, an instance of SimpleDateFormat.
Those two methods should really be in the DateFormat too for this to be less hacky, but they currently are not.
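On Java 8 and later, the pattern can be obtained without the cast. The following is a minimal sketch, assuming the java.time API; the class name LocalizedPatternDemo and the helper shortDatePattern are illustrative only. It returns the pattern in the standard pattern letters, so it corresponds to toPattern() rather than toLocalizedPattern():

```java
import java.time.chrono.IsoChronology;
import java.time.format.DateTimeFormatterBuilder;
import java.time.format.FormatStyle;
import java.util.Locale;

public class LocalizedPatternDemo {
    // Returns the SHORT date pattern for the locale.
    // Passing null as the time style requests a date-only pattern.
    static String shortDatePattern(Locale locale) {
        return DateTimeFormatterBuilder.getLocalizedDateTimePattern(
                FormatStyle.SHORT, null, IsoChronology.INSTANCE, locale);
    }

    public static void main(String[] args) {
        // The exact patterns depend on the locale data of the running JVM.
        System.out.println(shortDatePattern(Locale.US));
        System.out.println(shortDatePattern(Locale.GERMANY));
    }
}
```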
It may be strange that I am answering my own question, but I believe I can add something to the picture.
Obviously, Java 8 gives you a lot, but there is also something else: ICU4J. This is actually the source of Java's original implementation of things like Calendar, DateFormat and SimpleDateFormat, to name a few.
Therefore, it should not be a surprise that ICU's SimpleDateFormat also contains methods like toPattern() or toLocalizedPattern(). You can see them in action here:
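// Note: these are the ICU4J classes, not the java.text ones
import com.ibm.icu.text.DateFormat;
import com.ibm.icu.text.SimpleDateFormat;
import java.util.Locale;

DateFormat fmt = DateFormat.getPatternInstance(
        DateFormat.YEAR_MONTH,
        Locale.forLanguageTag("pl-PL"));
// The cast is only safe when the returned format really is a SimpleDateFormat
if (fmt instanceof SimpleDateFormat) {
    SimpleDateFormat sfmt = (SimpleDateFormat) fmt;
    String pattern = sfmt.toPattern();
    String localizedPattern = sfmt.toLocalizedPattern();
    System.out.println(pattern);
    System.out.println(localizedPattern);
}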
This is nothing new, but what I really wanted to point out is this:
DateFormat.getPatternInstance(String pattern, Locale locale);
This method can return a whole bunch of locale-specific patterns, selected by constants such as DateFormat.YEAR_MONTH.
Sure, there are quite a few. What is good about them is that these patterns are actually strings (as in java.lang.String); that is, if you pass an English pattern such as "MM/d", you'll get a locale-specific pattern in return. It might be useful in some corner cases. Usually you would just use the DateFormat instance and not care about the pattern itself.
The intention of the question was to get the localized pattern, not the locale-specific one. What's the difference?
In theory, toPattern() will give you the locale-specific pattern (depending on the Locale you used to instantiate the (Simple)DateFormat). That is, no matter what target language/country you pass, you'll get a pattern composed of symbols like y, M, d, h, H, m, etc.
On the other hand, toLocalizedPattern() should return the localized pattern, that is, something that is suitable for end users to read and understand. For instance, the German medium (default) date pattern would be:
The intention of the question was: "how to find the localized pattern that could serve as a hint as to what the date/time format is". That is, say we have a date field that the user can fill out using the locale-specific pattern, but I want to display a format hint in the localized form.
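To make the difference between the two methods concrete, here is a minimal sketch using java.text. The letter table GERMAN_LETTERS is a hypothetical mapping set by hand for illustration ('j' for Jahr, 't' for Tag); it is not claimed to match what any JDK or CLDR release ships. The class and helper names are likewise made up:

```java
import java.text.DateFormatSymbols;
import java.text.SimpleDateFormat;
import java.util.Locale;

public class LocalizedLettersDemo {
    // Hypothetical German letter table: 'j' stands in for 'y', 't' for 'd'.
    // Positions follow the JDK's standard pattern-letter order, which begins G y M d.
    static final String GERMAN_LETTERS = "GjMtkHmsSEDFwWahKzZ";

    // Builds a formatter whose localized pattern letters come from the table above.
    static SimpleDateFormat germanFormat(String pattern) {
        DateFormatSymbols symbols = new DateFormatSymbols(Locale.GERMANY);
        symbols.setLocalPatternChars(GERMAN_LETTERS);
        return new SimpleDateFormat(pattern, symbols);
    }

    public static void main(String[] args) {
        SimpleDateFormat fmt = germanFormat("dd.MM.yyyy");
        System.out.println(fmt.toPattern());          // dd.MM.yyyy
        System.out.println(fmt.toLocalizedPattern()); // tt.MM.jjjj
    }
}
```

The second line is the kind of string that could be shown to a German user as a format hint, provided the letter table behind it is actually correct for the language.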
Sadly, so far there is no good solution. ICU, which I mentioned earlier in this post, works only partially. That's because the data that ICU uses come from CLDR, which is unfortunately only partially translated and only partially correct. In the case of my mother tongue, at the time of writing, neither the patterns nor their localized forms are correctly translated. And every time I correct them, I get outvoted by other people, who do not necessarily live in Poland or speak Polish...
The moral of this story: do not fully rely on CLDR. You still need to have local auditors/linguistic reviewers.