Splitting a string with regex while treating "-" as part of one word

时光毁灭记忆、已成空白 posted on 2021-01-28 14:34:27

Question


I am trying to split a sentence into groups of at most 32 characters with a regex. If the 32nd character falls inside a word, the split should happen at a word boundary instead of in the middle of that word. When my input contains a word with "-", the regex splits that word too.

This is the regex I am using:

(\b.{1,32}\b\W?)

Input string:

Half Bone-in Spiral int with dark Packd Smithfield Half Bone-in Spiral Ham with Glaze Pack

Resulting groups:

  1. Half Bone-in Spiral int with
  2. dark Packd Smithfield Half Bone-
  3. in Spiral Ham with Glaze Pack

In the split above, "Bone-in" is one word, but the regex treats it as two separate words and splits it. How can I modify my regex so that a word containing "-" is treated as one word? In short, I want the split after "Bone-in".

Thank you.


Answer 1:


You may use

(\b.{1,32}(?![\w-])\W?)

Details

  • \b - a word boundary
  • .{1,32} - 1 to 32 chars other than line break chars, as many as possible
  • (?![\w-]) - a negative lookahead: the char immediately to the right of the current location cannot be a word char (letter, digit or _) or a - char
  • \W? - an optional non-word char.
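
To see what the lookahead changes, here is a minimal sketch that runs the question's pattern and the modified pattern on the same input. The class name LookaheadDemo and the helper chunks are made up for illustration, and each match is trimmed only to make the output easier to read.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LookaheadDemo {
    // Hypothetical helper: collects every match of the regex, trimmed of surrounding whitespace.
    static List<String> chunks(String regex, String text) {
        List<String> out = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(text);
        while (m.find()) {
            out.add(m.group().trim());
        }
        return out;
    }

    public static void main(String[] args) {
        String s = "Half Bone-in Spiral int with dark Packd Smithfield Half Bone-in Spiral Ham with Glaze Pack";

        // \b sees a boundary between "-" and "i", so a chunk may end inside "Bone-in".
        System.out.println(chunks("(\\b.{1,32}\\b\\W?)", s));
        // [Half Bone-in Spiral int with, dark Packd Smithfield Half Bone-, in Spiral Ham with Glaze Pack]

        // (?![\w-]) refuses to end a chunk when the next char is a word char or "-".
        System.out.println(chunks("(\\b.{1,32}(?![\\w-])\\W?)", s));
        // [Half Bone-in Spiral int with, dark Packd Smithfield Half, Bone-in Spiral Ham with Glaze, Pack]
    }
}
```

With the lookahead, the engine backtracks from "Bone-" all the way to the space before "Bone-in", so the hyphenated word moves whole into the next chunk.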

In Java, use the following method:

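import java.util.LinkedList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Splits text at the end of every regex match, keeping the matched text in each piece.
public static String[] splitIncludeDelimeter(String regex, String text){
    List<String> list = new LinkedList<>();
    Matcher matcher = Pattern.compile(regex).matcher(text);

    int now, old = 0;
    while(matcher.find()){
        now = matcher.end();
        // take everything from the previous cut up to the end of this match
        list.add(text.substring(old, now));
        old = now;
    }

    // no match at all: return the whole text as a single element
    if(list.isEmpty())
        return new String[]{text};

    // add the rest of the text as the last element
    // (an empty string if the last match ended at the end of the text)
    String finalElement = text.substring(old);
    list.add(finalElement);

    return list.toArray(new String[0]);
}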

Java example:

String s = "Half Bone-in Spiral int with dark Packd Smithfield Half Bone-in Spiral Ham with Glaze Pack";
String[] res = splitIncludeDelimeter("(\\b.{1,32}(?![\\w-])\\W?)", s);
System.out.println(Arrays.toString(res));
// => [Half Bone-in Spiral int with , dark Packd Smithfield Half , Bone-in Spiral Ham with Glaze , Pack, ]
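
The result above keeps the trailing space of each chunk and ends with an empty string, because the last match reaches the end of the text. If that is not wanted, one possible cleanup is sketched below. The class ChunkCleanup and the method clean are hypothetical names, not part of the answer, and the input array is simply the output shown above.

```java
import java.util.Arrays;

class ChunkCleanup {
    // Hypothetical post-processing step: trims each piece and drops pieces that end up empty.
    static String[] clean(String[] parts) {
        return Arrays.stream(parts)
                .map(String::trim)
                .filter(p -> !p.isEmpty())
                .toArray(String[]::new);
    }

    public static void main(String[] args) {
        // The array printed by the Java example above.
        String[] raw = {
            "Half Bone-in Spiral int with ",
            "dark Packd Smithfield Half ",
            "Bone-in Spiral Ham with Glaze ",
            "Pack",
            ""
        };
        System.out.println(Arrays.toString(clean(raw)));
        // [Half Bone-in Spiral int with, dark Packd Smithfield Half, Bone-in Spiral Ham with Glaze, Pack]
    }
}
```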


Source: https://stackoverflow.com/questions/53601310/splitting-string-in-regex-with-as-one-word
