How do I convert an Arabic String date to a Java 8 date object?

随声附和 submitted on 2019-12-05 02:17:32
Ole V.V.

Edit: with thanks to slim and Meno Hochschild for inspiration:

String dateTimeString = "الاثنين 24 أبريل 2017 - 15:00";

DateTimeFormatter formatter
        = DateTimeFormatter.ofPattern("EEEE d MMMM uuuu - HH:mm", new Locale("ar"));
LocalDateTime dateTime = LocalDateTime.parse(dateTimeString, formatter);
System.out.println(dateTime);

This prints:

2017-04-24T15:00
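The day and month names that the formatter accepts come from the locale data shipped with the JDK, and that data has changed between Java versions. A minimal sketch of a round-trip check, assuming a hypothetical class name ArabicDateRoundTrip: it formats a known date with the same pattern and parses the result back, so you can see which spelling your JDK expects. Locale.forLanguageTag("ar") is used here as an equivalent of new Locale("ar").

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.util.Locale;

// Hypothetical helper class; the pattern is the one from the answer above.
class ArabicDateRoundTrip {
    static final DateTimeFormatter FORMATTER =
            DateTimeFormatter.ofPattern("EEEE d MMMM uuuu - HH:mm", Locale.forLanguageTag("ar"));

    // Formats a date-time using the Arabic day and month names of this JDK.
    static String format(LocalDateTime dateTime) {
        return dateTime.format(FORMATTER);
    }

    // Parses text that uses the same pattern and the same names.
    static LocalDateTime parse(String text) {
        return LocalDateTime.parse(text, FORMATTER);
    }

    public static void main(String[] args) {
        LocalDateTime original = LocalDateTime.of(2017, 4, 24, 15, 0);
        String text = format(original);
        System.out.println(text);                         // the spelling your JDK produces
        System.out.println(parse(text).equals(original)); // whether it parses back
    }
}
```

If your input fails to parse, comparing it with the printed text shows which name differs.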

The answers by @Ole and @slim work, but not for the reason they think.

First observation: the nu-extension is unnecessary for the given example.

Ole's suggestion would also work for the locale new Locale("ar", "SA") instead of Locale.forLanguageTag("ar-SA-u-nu-arab"). So what does the Unicode nu-extension do here? Nothing. Next question:

What is the nu-extension supposed to do here?

The nu code word "arab" is specified by the Unicode Consortium to yield Arabic-Indic digits. But the input to be parsed contains only western digits 0-9 (which were historically adopted from the Arabs and are specified under the code word "latn", a misnomer by the way). So if the nu-extension had really done its job here, parsing should have failed, because Arabic-Indic digits are not 0-9 but:

٠ ١ ٢ ٣ ٤ ٥ ٦ ٧ ٨ ٩

Evidently, the nu-extension is not generally supported by the new time API in Java 8.
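If the input really does contain Arabic-Indic digits, java.time lets you set the digits explicitly through DecimalStyle instead of relying on the locale extension. A minimal sketch, assuming a purely numeric pattern (so no locale-specific names are involved) and a hypothetical class name ArabicIndicDigits:

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.time.format.DecimalStyle;
import java.util.Locale;

// Hypothetical demo class; the numeric pattern is an assumption for illustration.
class ArabicIndicDigits {
    // U+0660 is the Arabic-Indic digit zero; the other nine digits follow it in Unicode.
    static final DateTimeFormatter FORMATTER =
            DateTimeFormatter.ofPattern("d/M/uuuu HH:mm", Locale.ROOT)
                    .withDecimalStyle(DecimalStyle.STANDARD.withZeroDigit('\u0660'));

    static LocalDateTime parse(String text) {
        return LocalDateTime.parse(text, FORMATTER);
    }

    static String format(LocalDateTime dateTime) {
        return dateTime.format(FORMATTER);
    }

    public static void main(String[] args) {
        // 24/4/2017 15:00 written with Arabic-Indic digits
        LocalDateTime dateTime = parse("٢٤/٤/٢٠١٧ ١٥:٠٠");
        System.out.println(dateTime); // 2017-04-24T15:00
    }
}
```

With this decimal style the formatter expects Arabic-Indic digits, so input written with 0-9 would be rejected by the same formatter.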

Does SimpleDateFormat support the nu-extension?

By debugging the following code, I discovered that the nu-extension is supported only for Thai numerals (see also the official javadoc of class java.util.Locale) but not for Arabic-Indic digits:

// A Unicode extension needs the "u" singleton in the language tag: ar-SA-u-nu-arab
SimpleDateFormat sdf = 
    new SimpleDateFormat("EEEE d MMMM yyyy - HH:mm", Locale.forLanguageTag("ar-SA-u-nu-arab"));
Date d = sdf.parse(dateTimeString);
System.out.println(d);
String formatted = sdf.format(d);
System.out.println(formatted);
System.out.println(sdf.format(d).equals(dateTimeString));

// SimpleDateFormat uses "y" for the year; "u" is the day number of the week there
sdf = new SimpleDateFormat("EEEE d MMMM yyyy - HH:mm", Locale.forLanguageTag("ar-SA-u-nu-thai"));
String thai = sdf.format(d);
System.out.println("u-nu-thai: " + thai);

I assume the class DateTimeFormatter of Java-8 also supports Thai numerals.

Conclusion:

Forget the nu-extension. Just construct the locale the old-fashioned way, without a Unicode extension, and adapt Ole's answer accordingly. It works because your input contains only western digits 0-9.
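To see what the extension adds to a locale and how to drop it again, here is a minimal sketch using only java.util.Locale; the class name LocaleExtensionDemo and the helper withoutExtensions are hypothetical names chosen for illustration:

```java
import java.util.Locale;

// Hypothetical demo class showing a locale with and without the nu-extension.
class LocaleExtensionDemo {
    // Returns the locale for the given tag with all extensions removed.
    static Locale withoutExtensions(String languageTag) {
        return Locale.forLanguageTag(languageTag).stripExtensions();
    }

    public static void main(String[] args) {
        Locale tagged = Locale.forLanguageTag("ar-SA-u-nu-arab");
        System.out.println(tagged.getUnicodeLocaleType("nu")); // arab

        Locale plain = withoutExtensions("ar-SA-u-nu-arab");
        System.out.println(plain.toLanguageTag());             // ar-SA
    }
}
```

The stripped locale has the same language and region as new Locale("ar", "SA"), so either can be passed to the formatter.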

For extensive i18n support, including the nu-extension for various numbering systems (if you have such input), you might consider external libraries (for example ICU4J or my library Time4J).

slim

I don't know enough Arabic to understand an Arabic formatted date. However, this code:

Locale arabicLocale = new Locale.Builder().setLanguageTag("ar-SA-u-nu-arab").build();

LocalDate date = LocalDate.now();
DateTimeFormatter formatter = DateTimeFormatter.ofLocalizedDate(FormatStyle.FULL).withLocale(arabicLocale);

String formatted = date.format(formatter);
System.out.println(formatted);
System.out.println(formatter.parse(formatted));

Yields this output:

26 أبريل, 2017
{},ISO resolved to 2017-04-26

The code to create the Locale is from an answer to "Setting Arabic numbering system locale doesn't show Arabic numbers".

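You can fine-tune this format by building your own DateTimeFormatter (for example with DateTimeFormatter.ofPattern) instead of using one of the predefined FormatStyle values.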

You have to specify the charset when parsing the string. Assuming that the date you want to parse will always be in the format you provided, this would work:

public static Date getDate(String strDate) throws Exception{
    strDate=new String(strDate.getBytes(),"UTF-8");

    Map<String, Integer> months = new HashMap<>();

    String JAN =  new String("يناير".getBytes(), "UTF-8");
    String FEB =  new String("فبراير".getBytes(), "UTF-8");
    String MAR =  new String("مارس".getBytes(), "UTF-8");
    String APR =  new String("أبريل".getBytes(), "UTF-8");
    String APR_bis =  new String("ابريل".getBytes(), "UTF-8");
    String MAY =  new String("ماي".getBytes(), "UTF-8");
    String JUN =  new String("يونيو".getBytes(), "UTF-8");
    String JUN_bis =  new String("يونيه".getBytes(), "UTF-8");
    String JUL =  new String("يوليوز".getBytes(), "UTF-8");
    String AUG =  new String("غشت".getBytes(), "UTF-8");
    String SEP =  new String("شتنبر".getBytes(), "UTF-8");
    String SEP_bis =  new String("سبتمبر".getBytes(), "UTF-8");
    String OCT =  new String("أكتوبر".getBytes(), "UTF-8");
    String OCT_bis =  new String("اكتوبر".getBytes(), "UTF-8");
    String NOV =  new String("نونبر".getBytes(), "UTF-8");
    String NOV_bis =  new String("نوفمبر".getBytes(), "UTF-8");
    String DEC =  new String("دجنبر".getBytes(), "UTF-8");
    String DEC_bis =  new String("ديسمبر".getBytes(), "UTF-8");



    months.put(JAN, 0);
    months.put(FEB, 1);
    months.put(MAR, 2);
    months.put(APR, 3);
    months.put(APR_bis, 3);
    months.put(MAY, 4);
    months.put(JUN, 5);
    months.put(JUN_bis, 5);
    months.put(JUL, 6);
    months.put(AUG, 7);
    months.put(SEP, 8);
    months.put(SEP_bis, 8);
    months.put(OCT, 9);
    months.put(OCT_bis, 9);
    months.put(NOV, 10);
    months.put(NOV_bis, 10);
    months.put(DEC, 11);
    months.put(DEC_bis, 11);


    StringTokenizer stringTokenizer = new StringTokenizer(strDate);

    Calendar calendar = Calendar.getInstance();

    // Read exactly four tokens (day name, day, month, year). Looping over all
    // tokens would fail on a trailing time part such as "- 15:00".
    stringTokenizer.nextToken(); // skip the first token, which is the name of the day

    int day = Integer.parseInt(stringTokenizer.nextToken().trim());

    String strMonth = stringTokenizer.nextToken().trim();
    Integer month = months.get(strMonth); // zero-based, as Calendar expects
    if (month == null) {
        throw new IllegalArgumentException("Unknown month: " + strMonth);
    }

    int year = Integer.parseInt(stringTokenizer.nextToken().trim());

    calendar.set(year, month, day); // the time-of-day fields keep the current time
    return calendar.getTime();

}

It gives this output:

Fri Oct 20 15:26:47 WEST 2017

Obenland

One solution could be to translate the date to English and then parse it:

private final static Map<String, Integer> monthMapping = new HashMap<>();
static {
    // list of all months (only April is filled in here)
    monthMapping.put("أبريل", 4);
}


public Date fromArabicToDate(String arabicInput) throws ParseException {
    // expected input: "<day name> <day> <month name> <year>"
    String[] parts = arabicInput.split(" ");
    if (parts.length != 4) 
        throw new IllegalArgumentException();

    // parts[0] is the day name and is ignored
    String dateInput = parts[3] + "-" + monthMapping.get(parts[2]) + "-" + parts[1];
    // lowercase yyyy and dd: uppercase YYYY is the week year and DD the day of the year
    SimpleDateFormat parser = new SimpleDateFormat("yyyy-MM-dd");
    return parser.parse(dateInput);
}

I tried to copy the month but I don't believe I have done it correctly. The arguments of put get switched when parsing.
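The month-mapping idea can also be sketched with java.time, which avoids the detour through a second date string. This is a minimal sketch, not a definitive implementation: the class name ArabicMonthParser and the array MONTH_NAMES are hypothetical, only one spelling per month is listed, and the input is assumed to have the four-part form "<day name> <day> <month name> <year>" without a time part.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeFormatterBuilder;
import java.time.temporal.ChronoField;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper: parses "<day name> <day> <month name> <year>" using its own month table.
class ArabicMonthParser {
    // One spelling per month, January to December (an assumption; extend as needed).
    private static final String[] MONTH_NAMES = {
        "يناير", "فبراير", "مارس", "أبريل", "مايو", "يونيو",
        "يوليو", "أغسطس", "سبتمبر", "أكتوبر", "نوفمبر", "ديسمبر"
    };

    private static final DateTimeFormatter FORMATTER = buildFormatter();

    private static DateTimeFormatter buildFormatter() {
        Map<Long, String> months = new HashMap<>();
        for (int i = 0; i < MONTH_NAMES.length; i++) {
            months.put((long) (i + 1), MONTH_NAMES[i]); // java.time months are 1-based
        }
        return new DateTimeFormatterBuilder()
                .appendValue(ChronoField.DAY_OF_MONTH)
                .appendLiteral(' ')
                .appendText(ChronoField.MONTH_OF_YEAR, months) // custom month names
                .appendLiteral(' ')
                .appendValue(ChronoField.YEAR, 4)
                .toFormatter(Locale.ROOT);
    }

    static LocalDate parse(String input) {
        // Drop the leading day name; the rest is "<day> <month name> <year>".
        String withoutDayName = input.substring(input.indexOf(' ') + 1);
        return LocalDate.parse(withoutDayName, FORMATTER);
    }

    public static void main(String[] args) {
        System.out.println(parse("الاثنين 24 أبريل 2017")); // 2017-04-24
    }
}
```

Because the month names are supplied explicitly, the result does not depend on the locale data of the JDK in use.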

Or have a look at Joda-Time; maybe it offers a solution. It was mentioned here.
