Java: split a string on all non-alphanumeric characters except apostrophes

盖世英雄少女心 2020-11-30 10:02

I want to split a string in Java on any run of non-alphanumeric characters, except that apostrophes should stay inside words.

Currently I am doing it like this:

words = Str.split("\\W+");


        
2 Answers
  • 2020-11-30 10:23

    For plain ASCII letters and digits, use

    words = Str.split("[^a-zA-Z0-9']+");
    

    If you want to keep words with accented letters intact (such as fiancé), or handle languages written in other scripts, use the Unicode letter class \p{L} instead:

    words = Str.split("[^\\p{L}0-9']+");
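
To make the difference between the two patterns concrete, here is a minimal sketch that runs both on the same sentence. The class name, method names, and sample text are made up for illustration and are not part of the original answer.

```java
import java.util.Arrays;

// Hypothetical demo class: compares the ASCII-only and Unicode-aware patterns.
class SplitWordsDemo {
    // Anything other than a-z, A-Z, 0-9 or an apostrophe counts as a delimiter.
    static String[] splitAscii(String text) {
        return text.split("[^a-zA-Z0-9']+");
    }

    // \p{L} matches a letter in any script, so accented letters stay in the word.
    static String[] splitUnicode(String text) {
        return text.split("[^\\p{L}0-9']+");
    }

    public static void main(String[] args) {
        // \u00e9 is the precomposed "e with acute accent" character.
        String text = "Don't tell my fianc\u00e9, it's a surprise!";
        System.out.println(Arrays.toString(splitAscii(text)));
        System.out.println(Arrays.toString(splitUnicode(text)));
    }
}
```

With the ASCII pattern the accented letter is treated as a delimiter, so the word comes back as "fianc"; the \p{L} pattern keeps it whole. Both patterns keep "Don't" and "it's" in one piece.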
    
  • 2020-11-30 10:37
    words = Str.split("[^\\w']+");
    

    Just add the apostrophe to a negated character class. \W is equivalent to [^\w], so [^\w'] matches everything \W does except the apostrophe.

    Note, however, that \w also matches the underscore, so the pattern above keeps underscores inside words. If you want to split on underscores as well, use [^a-zA-Z0-9'] instead.
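
The underscore caveat is easy to check with a short sketch. The class name, method names, and sample string below are assumptions chosen for illustration only.

```java
import java.util.Arrays;

// Hypothetical demo class: shows how each pattern treats underscores.
class UnderscoreSplitDemo {
    // \w covers [a-zA-Z0-9_], so underscores are kept inside words.
    static String[] splitKeepingUnderscores(String text) {
        return text.split("[^\\w']+");
    }

    // The explicit class leaves out '_', so underscores act as delimiters.
    static String[] splitOnUnderscores(String text) {
        return text.split("[^a-zA-Z0-9']+");
    }

    public static void main(String[] args) {
        String text = "it's snake_case, isn't it?";
        System.out.println(Arrays.toString(splitKeepingUnderscores(text)));
        // prints [it's, snake_case, isn't, it]
        System.out.println(Arrays.toString(splitOnUnderscores(text)));
        // prints [it's, snake, case, isn't, it]
    }
}
```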
