Building on @tobias_k's comment, here is code that finds any French short month abbreviation in a date string that should end with a period but doesn't, and replaces it with the correct abbreviation, period included.

    import java.text.DateFormatSymbols;
    import java.util.Locale;
    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    public String fixFrenchMonths(String date) {
        for (String mois : DateFormatSymbols
                .getInstance(Locale.FRENCH).getShortMonths()) {
            // Only abbreviations that end with a period need fixing.
            if (mois.endsWith(".")) {
                // Match the abbreviation without its period, unless a period follows.
                Pattern sansDot = Pattern.compile("(" +
                        Pattern.quote(mois.substring(0, mois.length() - 1)) +
                        "(?!\\.))");
                Matcher matcher = sansDot.matcher(date);
                if (matcher.find()) {
                    date = matcher.replaceFirst(mois);
                }
            }
        }
        return date;
    }

Note: "mois" is French for "month", and "sansDot" means "withoutDot" (perhaps a trifle too clever). The pattern uses a zero-width negative lookahead so that it doesn't replace an abbreviation that already has a dot. It also applies Pattern.quote to the data from DateFormatSymbols. This is probably overkill, since we don't expect that data to contain any regex metacharacters (except the dot itself, which we strip off), but it's better safe than sorry when passing data from a source we don't control into Pattern.compile.
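
To see the negative lookahead in isolation, here is a minimal sketch that does not depend on the JDK's locale data. The class name LookaheadDemo and the helper addMissingDot are hypothetical names made up for this illustration, and the abbreviation "janv." is passed in by hand rather than read from DateFormatSymbols. The sketch also wraps the replacement in Matcher.quoteReplacement, which is not in the code above; it is the replacement-side counterpart of Pattern.quote and is arguably just as much overkill here.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LookaheadDemo {
    // Hypothetical helper: fixes the first occurrence of one abbreviation
    // (which must end with '.') that appears in the date without its dot.
    static String addMissingDot(String date, String abbreviation) {
        // Strip the trailing dot, e.g. "janv." -> "janv".
        String stem = abbreviation.substring(0, abbreviation.length() - 1);
        // Match the stem only when the next character is not a dot.
        Pattern withoutDot = Pattern.compile(Pattern.quote(stem) + "(?!\\.)");
        // quoteReplacement keeps '$' and '\' in the replacement literal.
        return withoutDot.matcher(date)
                .replaceFirst(Matcher.quoteReplacement(abbreviation));
    }

    public static void main(String[] args) {
        // Dot is missing, so it gets added.
        System.out.println(addMissingDot("12 janv 2015", "janv."));
        // Dot is already present, so the lookahead blocks the match.
        System.out.println(addMissingDot("12 janv. 2015", "janv."));
    }
}
```

Both calls print "12 janv. 2015". One caveat: the lookahead only inspects the character right after the stem, so a stem embedded in a longer word (such as "juil" in "juillet") would match as well. If full month names can appear in the input, adding a word boundary (\\b) after the quoted stem is one possible guard.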