Regex to replace a repeating string pattern

Submitted by 我与影子孤独终老i on 2019-12-22 16:13:10

Question


I need to replace a repeated pattern within a word with its basic unit. For example, I have the string "TATATATA" and I want to replace it with "TA". I would also probably only replace runs of more than 2 repetitions, to avoid altering normal words.

I am trying to do this in Java with the replaceAll method.


Answer 1:


I think you want this (works for any length of the repeated string):

String result = source.replaceAll("(.+)\\1+", "$1");

Or alternatively, to prioritize shorter matches:

String result = source.replaceAll("(.+?)\\1+", "$1");

The pattern first matches a group of characters and then requires that same group to appear again immediately afterward, one or more times (using a back-reference within the pattern itself). I tried it and it seems to do the trick.
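To make the difference between the two variants concrete, here is a minimal sketch that runs both on the string from the question. The class name GreedyVsLazy and its helper methods are made up for illustration; only the two replaceAll calls come from the answer.

```java
// Illustrative sketch: the class and method names are hypothetical.
class GreedyVsLazy {
    // Greedy group: (.+) settles on the longest unit that still repeats.
    static String collapseGreedy(String source) {
        return source.replaceAll("(.+)\\1+", "$1");
    }

    // Lazy group: (.+?) settles on the shortest unit that repeats.
    static String collapseLazy(String source) {
        return source.replaceAll("(.+?)\\1+", "$1");
    }

    public static void main(String[] args) {
        System.out.println(collapseGreedy("TATATATA")); // TATA
        System.out.println(collapseLazy("TATATATA"));   // TA
    }
}
```

For the question's goal of reducing "TATATATA" to "TA", the lazy form is the one that yields the basic unit.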


Example

String source = "HEY HEY duuuuuuude what'''s up? Trololololo yeye .0.0.0";

System.out.println(source.replaceAll("(.+?)\\1+", "$1"));

// HEY dude what's up? Trolo ye .0
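The question also mentions collapsing only runs with more than two repetitions, which the patterns above do not enforce: \\1+ already fires on a single repeat, so an ordinary word like "banana" would be rewritten. One possible adjustment, offered as a sketch rather than as part of the original answer, is to use \\1{2,} so that the unit must occur at least three times in a row. The class name MinRepeats and the sample input are assumptions for illustration.

```java
// Illustrative sketch: the class name and sample text are hypothetical.
class MinRepeats {
    // A unit followed by at least two more copies: three or more occurrences in total.
    static String collapse(String source) {
        return source.replaceAll("(.+?)\\1{2,}", "$1");
    }

    public static void main(String[] args) {
        String source = "TATATATA banana coffee";
        // The original pattern also rewrites ordinary words.
        System.out.println(source.replaceAll("(.+?)\\1+", "$1")); // TA bana cofe
        // The stricter quantifier leaves them alone.
        System.out.println(collapse(source));                     // TA banana coffee
    }
}
```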



Answer 2:


It is better to use a precompiled Pattern here than String.replaceAll(), which recompiles the regex on every call. For instance:

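import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Compiled once and reused; matches a whole word that consists only of
// a repeated unit of two or more uppercase letters.
private static final Pattern PATTERN
    = Pattern.compile("\\b([A-Z]{2,}?)\\1+\\b");

//...

final Matcher m = PATTERN.matcher(input);
final String ret = m.replaceAll("$1");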

Edit: an example (this one also accepts lowercase letters):

public static void main(final String... args)
{
    System.out.println("TATATA GHRGHRGHRGHR"
        .replaceAll("\\b([A-Za-z]{2,}?)\\1+\\b", "$1"));
}

This prints:

TA GHR
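For readers who want to reuse the compiled pattern across many inputs, the fragments above might be assembled along these lines. This is only a sketch: the class name WordCollapser and the method collapse are hypothetical, and the character class is widened to [A-Za-z] as in the example.

```java
import java.util.regex.Pattern;

// Illustrative sketch: the class and method names are hypothetical.
class WordCollapser {
    // Compiled once; \b on both sides means the whole word must consist
    // of the repeated unit, so repeats inside ordinary words are ignored.
    private static final Pattern PATTERN =
        Pattern.compile("\\b([A-Za-z]{2,}?)\\1+\\b");

    static String collapse(String input) {
        return PATTERN.matcher(input).replaceAll("$1");
    }

    public static void main(String[] args) {
        System.out.println(collapse("TATATA GHRGHRGHRGHR")); // TA GHR
        System.out.println(collapse("banana coffee"));       // banana coffee
    }
}
```

The word boundaries are what keep "banana" intact here, since its repeated "an" does not span the whole word.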



Answer 3:


Since you asked for a regex solution:

(\\w)(\\w)(\\1\\2){2,}

(\\w)(\\w) matches any pair of consecutive word characters and stores them in capturing groups 1 and 2 ((.)(.) would instead catch any pair of consecutive characters of any type). (\\1\\2) matches when the characters in those groups are repeated immediately afterward, and {2,} requires that repetition to occur two or more times, so the pair must appear at least three times in total ({2,10} would require between two and ten repetitions, inclusive).

String s = "hello TATATATA world";    
Pattern p = Pattern.compile("(\\w)(\\w)(\\1\\2){2,}");
Matcher m = p.matcher(s);
while (m.find()) System.out.println(m.group());
// prints "TATATATA"
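Since the original goal was replacement rather than just finding matches, the same pattern could also be passed to replaceAll, with "$1$2" putting the two captured characters back once. A rough sketch, with the class name PairCollapser and the sample text invented for illustration:

```java
import java.util.regex.Pattern;

// Illustrative sketch: the class name and sample text are hypothetical.
class PairCollapser {
    // A two-character unit followed by at least two more copies of itself.
    private static final Pattern PATTERN =
        Pattern.compile("(\\w)(\\w)(\\1\\2){2,}");

    static String collapse(String s) {
        // $1$2 re-emits the unit once in place of the whole run.
        return PATTERN.matcher(s).replaceAll("$1$2");
    }

    public static void main(String[] args) {
        // The trailing "TATA" has only one extra copy, so {2,} leaves it untouched.
        System.out.println(collapse("hello TATATATA world TATA"));
        // hello TA world TATA
    }
}
```

Note that this pattern only handles units of exactly two characters; longer units need a variable-length group as in the other answers.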


Source: https://stackoverflow.com/questions/24008657/regex-to-replace-a-repeating-string-pattern
