How to determine whether a string is English or Arabic?

我寻月下人不归 2021-01-31 17:47

Is there a way to determine whether a string is English or Arabic?

8 answers
  • 2021-01-31 18:20

    A minor change to cover the basic Arabic block together with Arabic Presentation Forms-B (contextual letter forms and ligatures):

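    private boolean isArabic(String text) {
        // Strip all whitespace so spaces, tabs and line breaks do not fail the check.
        String textWithoutSpace = text.replaceAll("\\s+", "");
        for (int i = 0; i < textWithoutSpace.length(); ) {
            int c = textWithoutSpace.codePointAt(i);
            // 0x0600-0x06FF is the basic Arabic block (letters, digits, punctuation).
            // 0xFE70-0xFEFF is Arabic Presentation Forms-B, which holds contextual
            // letter forms and the lam-alef ligatures (presentation forms of 'لا').
            boolean inArabicRange = (c >= 0x0600 && c <= 0x06FF) || (c >= 0xFE70 && c <= 0xFEFF);
            if (!inArabicRange) {
                return false;
            }
            i += Character.charCount(c);
        }
        // Note: an empty or whitespace-only string also returns true.
        return true;
    }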
    
  • 2021-01-31 18:27

    You can usually tell by the code points within the string itself. Arabic occupies certain blocks in the Unicode code space.

    It's a fairly safe bet that, if a substantial proportion of the characters fall in those blocks (such as بلدي الحوامات مليء الثعابينة), it's Arabic text.
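
    The proportion idea could be sketched roughly as follows. This is only an illustration, not a definitive detector: the class name `ArabicHeuristic`, the method names, and the 0.5 threshold are all assumptions made for the example. It uses `Character.UnicodeScript` rather than hard-coded block ranges, and it counts letters only, so digits, spaces and punctuation do not skew the ratio. Keep in mind that Arabic script is also used for Persian, Urdu and other languages, so this detects the script rather than the language.

    ```java
    import java.lang.Character.UnicodeScript;

    final class ArabicHeuristic {
        // Hypothetical cut-off: call the text Arabic when more than half of its letters are Arabic script.
        static final double THRESHOLD = 0.5;

        /** Fraction of letters whose Unicode script is Arabic; 0.0 when the text has no letters. */
        static double arabicRatio(String text) {
            int letters = 0;
            int arabic = 0;
            for (int i = 0; i < text.length(); ) {
                int cp = text.codePointAt(i);
                i += Character.charCount(cp);
                if (!Character.isLetter(cp)) {
                    continue; // ignore digits, whitespace and punctuation
                }
                letters++;
                if (UnicodeScript.of(cp) == UnicodeScript.ARABIC) {
                    arabic++;
                }
            }
            return letters == 0 ? 0.0 : (double) arabic / letters;
        }

        /** True when the Arabic-script share of the letters exceeds THRESHOLD. */
        static boolean isMostlyArabic(String text) {
            return arabicRatio(text) > THRESHOLD;
        }
    }
    ```

    With this sketch, `ArabicHeuristic.isMostlyArabic(input)` tolerates a few Latin words or numbers mixed into Arabic text, which a strict every-character check would reject. The threshold would need tuning for real data.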
