Ignoring diacritic characters when comparing words with special characters (é, è, …)

梦如初夏 2021-02-05 18:08

I have a list of Belgian cities whose names contain diacritic characters (Liège, Quiévrain, Franière, etc.), and I would like to transform these special characters to compare with a lis

8 Answers
  •  一生所求
    2021-02-05 18:30

    The Collator class is a good way to do it (see the corresponding Javadoc). Here is a unit test that shows how to use it:

    import static org.junit.Assert.assertEquals;
    import static org.junit.Assert.assertTrue;
    
    import java.text.Collator;
    import java.util.Locale;
    
    import org.junit.Test;
    
    public class CollatorTest {
        @Test public void liege() throws Exception {
            Collator compareOperator = Collator.getInstance(Locale.FRENCH);
            // PRIMARY strength compares base letters only, so accents and case are ignored
            compareOperator.setStrength(Collator.PRIMARY);
    
            assertEquals(0, compareOperator.compare("Liege", "Liege")); // no accent
            assertEquals(0, compareOperator.compare("Liège", "Liege")); // with accent
            assertEquals(0, compareOperator.compare("LIEGE", "Liege")); // case insensitive
            assertEquals(0, compareOperator.compare("LIEGE", "Liège")); // case insensitive with accent
    
            // compare() only guarantees the sign of the result, so check the sign rather than exactly 1 or -1
            assertTrue(compareOperator.compare("Liege", "Bruxelles") > 0);
            assertTrue(compareOperator.compare("Bruxelles", "Liege") < 0);
        }
    }
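
To make the role of the strength setting more concrete, here is a minimal sketch that compares two strings at a chosen strength. The class name `StrengthDemo` and the method `equalAt` are hypothetical names used only for illustration; the `Collator` calls are the same ones used in the test above.

```java
import java.text.Collator;
import java.util.Locale;

// Hypothetical helper, for illustration only.
final class StrengthDemo {
    // Returns true when the two strings are considered equal at the given
    // strength (Collator.PRIMARY, Collator.SECONDARY or Collator.TERTIARY).
    static boolean equalAt(int strength, String a, String b) {
        Collator collator = Collator.getInstance(Locale.FRENCH);
        collator.setStrength(strength);
        return collator.compare(a, b) == 0;
    }
}
```

With this sketch, `equalAt(Collator.PRIMARY, "Liège", "LIEGE")` should be true, because only base letters are compared. At `Collator.SECONDARY` accents start to matter while case is still ignored, and at `Collator.TERTIARY` case differences matter as well.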
    

    EDIT: sorry to see that my answer did not meet your needs; maybe it's because I presented it as a unit test? Is this OK for you? I personally find it better because it's short and it uses the SDK (no need for String replacement):

    Collator compareOperator = Collator.getInstance(Locale.FRENCH);
    // PRIMARY strength ignores accent and case differences
    compareOperator.setStrength(Collator.PRIMARY);
    if (compareOperator.compare("Liège", "Liege") == 0) {
        // if we are here, then it's the "same" String
    }
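
If the goal is to check a city name against a whole collection of names, the same comparison can be wrapped in a small helper. This is only a sketch under that assumption: `CityMatcher`, `sameCity` and `containsCity` are hypothetical names, not part of the SDK, and the linear scan is chosen for simplicity rather than speed.

```java
import java.text.Collator;
import java.util.List;
import java.util.Locale;

// Hypothetical helper, for illustration only.
final class CityMatcher {
    private final Collator collator;

    CityMatcher(Locale locale) {
        collator = Collator.getInstance(locale);
        // PRIMARY strength: accents and case are ignored
        collator.setStrength(Collator.PRIMARY);
    }

    // True when both names differ at most by accents or case.
    boolean sameCity(String a, String b) {
        return collator.compare(a, b) == 0;
    }

    // Linear scan over the list, comparing each entry with the collator.
    boolean containsCity(List<String> cities, String candidate) {
        for (String city : cities) {
            if (sameCity(city, candidate)) {
                return true;
            }
        }
        return false;
    }
}
```

For example, with a list containing "Liège", "Quiévrain" and "Franière", `new CityMatcher(Locale.FRENCH).containsCity(cities, "LIEGE")` should return true, while a lookup for "Bruxelles" should return false.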
    

    Hope this helps.
