Replacing multiple substrings in Java when replacement text overlaps search text

囚心锁ツ 2021-01-12 20:54

Say you have the following string:

cat dog fish dog fish cat

You want to replace all cats with dogs, all do

5 Answers
  • 2021-01-12 21:24

    Here's a method to do it without regex.

    I noticed that whenever a substring a is replaced with b, that b is always part of the final string, so you can leave it out of all further processing.

    Not only that: the spot where b now sits acts as a gap. No later replacement can match across it.

    Together, these steps look a lot like split and join: split the string on a (which creates the gaps), apply the remaining replacements to each piece of the array, then join the pieces back together with b.

    For example:

    // Original
    "cat dog fish dog fish cat"
    
    // Replace cat with dog
    {"", " dog fish dog fish ", ""}.join("dog")
    
    // Replace dog with fish
    {
        "",
        {" ", " fish ", " fish "}.join("fish"),
        ""
    }.join("dog")
    
    // Replace fish with cat
    {
        "",
        {
            " ",
            {" ", " "}.join("cat"),
            {" ", " "}.join("cat")
        }.join("fish"),
        ""
    }.join("dog")
    

    So far, the most intuitive way (to me) to do this is recursively:

    // Needs java.util.HashMap, java.util.Map and java.util.regex.Pattern
    public static String replaceWithJointMap(String s, Map<String, String> map) {
        // Base case
        if (map.isEmpty()) {
            return s;
        }
    
        // Get some value in the map to replace
        Map.Entry<String, String> pair = map.entrySet().iterator().next();
        String replaceFrom = pair.getKey();
        String replaceTo = pair.getValue();
    
        // Split the current string with the replaceFrom string
        // Use split with -1 so that trailing empty strings are included
        String[] splitString = s.split(Pattern.quote(replaceFrom), -1);
    
        // Apply replacements for each of the strings in the splitString
        HashMap<String, String> replacementsLeft = new HashMap<>(map);
        replacementsLeft.remove(replaceFrom);
    
        for (int i=0; i<splitString.length; i++) {
            splitString[i] = replaceWithJointMap(splitString[i], replacementsLeft);
        }
    
        // Join back with the current replacements
        return String.join(replaceTo, splitString);
    }
    

    I don't think this is very efficient though.
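
    If the cost of re-splitting and copying the map at every level matters, one alternative that still avoids regex is to scan the string once and, at each position, check whether any search string starts there. Below is a minimal sketch of that idea, not the method above: the SinglePassReplacer class and the replaceAllOnce name are made up for illustration, and letting the longest matching key win at a given position is an assumption the recursive version does not make.

```java
import java.util.Map;

final class SinglePassReplacer {
    // Scans the input once; at each index, applies the longest key that matches there.
    static String replaceAllOnce(String input, Map<String, String> replacements) {
        StringBuilder out = new StringBuilder(input.length());
        int i = 0;
        while (i < input.length()) {
            String bestKey = null;
            for (String key : replacements.keySet()) {
                // Skip empty keys (they would match everywhere) and prefer the longest match
                if (!key.isEmpty() && input.startsWith(key, i)
                        && (bestKey == null || key.length() > bestKey.length())) {
                    bestKey = key;
                }
            }
            if (bestKey != null) {
                // The replacement text is appended to the output and never rescanned
                out.append(replacements.get(bestKey));
                i += bestKey.length();
            } else {
                out.append(input.charAt(i));
                i++;
            }
        }
        return out.toString();
    }
}
```

    Called with the cat/dog/fish map on "cat dog fish dog fish cat", this returns "dog fish cat fish cat dog". It still tests every key at every position, so treat it as a sketch of the single-pass idea rather than a tuned implementation.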

  • 2021-01-12 21:30

    I would create a StringBuilder and then parse the text once, one word at a time, transferring over unchanged words or changed words as I go. I wouldn't parse it for each swap as you're suggesting.

    So rather than doing something like:

    // pseudocode
    text is new text swapping cat with dog
    text is new text swapping dog with fish
    text is new text swapping fish with cat
    

    I'd do

    for each word in text
       if word is cat, swap with dog
       else if word is dog, swap with fish
       else if word is fish, swap with cat
       transfer new word (or unchanged word) into StringBuilder.
    

    I'd probably make a swap(...) method for this and use a HashMap for the swap.

    For example

    import java.util.HashMap;
    import java.util.Map;
    import java.util.Scanner;
    
    public class SwapWords {
       private static Map<String, String> myMap = new HashMap<String, String>();
    
       public static void main(String[] args) {
          // this would really be loaded using a file such as a text file or xml
          // or even a database:
          myMap.put("cat", "dog");
          myMap.put("dog", "fish");
          myMap.put("fish", "cat");
    
          String testString = "cat dog fish dog fish cat";
    
          StringBuilder sb = new StringBuilder();
          Scanner testScanner = new Scanner(testString);
          while (testScanner.hasNext()) {
             String text = testScanner.next();
             // Look the word up once; words without a mapping pass through unchanged
             text = myMap.getOrDefault(text, text);
             sb.append(text).append(' ');
          }
    
          System.out.println(sb.toString().trim());
       }
    }
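
    The swap(...) method mentioned above could look roughly like the sketch below. The WordSwapper class name and the method signature are assumptions for illustration. Unlike the Scanner loop, this version splits on single spaces only, so runs of spaces survive unchanged, but tabs, newlines, and punctuation attached to a word (such as "cat,") are not handled.

```java
import java.util.Map;

final class WordSwapper {
    // Replaces each space-separated word that appears as a key in swaps;
    // every other word passes through unchanged.
    static String swap(String text, Map<String, String> swaps) {
        // Limit -1 keeps trailing empty strings, so repeated and trailing spaces are preserved
        String[] words = text.split(" ", -1);
        for (int i = 0; i < words.length; i++) {
            words[i] = swaps.getOrDefault(words[i], words[i]);
        }
        return String.join(" ", words);
    }
}
```

    Because each word is looked up exactly once, a word that has just been swapped is never swapped again.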
    
  • 2021-01-12 21:42

    It seems StringUtils.replaceEach in Apache Commons Lang does what you want:

    StringUtils.replaceEach("abcdeab", new String[]{"ab", "cd"}, new String[]{"cd", "ab"});
    // returns "cdabecd"
    

    Note that the documentation at the above link seems to be in error. See the comments below for details.

  • 2021-01-12 21:42
    String rep = str.replace("cat","§1§").replace("dog","§2§")
                    .replace("fish","§3§").replace("§1§","dog")
                    .replace("§2§","fish").replace("§3§","cat");
    

    Ugly and inefficient as hell, but it works, as long as the placeholder strings (§1§ and so on) never occur in the input.
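
    The same two-phase placeholder trick can be written for an arbitrary map. This is a minimal sketch, not a robust implementation: the PlaceholderReplacer name is made up, and it assumes that neither the input nor the replacement values contain characters from the Unicode private use area starting at U+E000, that the map has at most 6400 entries, and that no key is empty or contains another key.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

final class PlaceholderReplacer {
    // ASSUMPTION: characters U+E000..U+F8FF appear nowhere in the input or the replacement values
    static String replace(String input, Map<String, String> replacements) {
        List<String> values = new ArrayList<>();
        String result = input;
        // Phase 1: swap every search string for a unique one-character placeholder
        for (Map.Entry<String, String> e : replacements.entrySet()) {
            String placeholder = String.valueOf((char) (0xE000 + values.size()));
            result = result.replace(e.getKey(), placeholder);
            values.add(e.getValue());
        }
        // Phase 2: swap each placeholder for its final replacement text
        for (int i = 0; i < values.size(); i++) {
            result = result.replace(String.valueOf((char) (0xE000 + i)), values.get(i));
        }
        return result;
    }
}
```

    It makes two full passes over the string per map entry, so it has the same efficiency problem as the chained version.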


    OK, here's a more elaborate and generic version. I prefer using a regular expression rather than a scanner. That way I can replace arbitrary Strings, not just words (which can be better or worse). Anyway, here goes:

    public static String replace(
        final String input, final Map<String, String> replacements) {
    
        if (input == null || "".equals(input) || replacements == null 
            || replacements.isEmpty()) {
            return input;
        }
        StringBuilder regexBuilder = new StringBuilder();
        Iterator<String> it = replacements.keySet().iterator();
        regexBuilder.append(Pattern.quote(it.next()));
        while (it.hasNext()) {
            regexBuilder.append('|').append(Pattern.quote(it.next()));
        }
        Matcher matcher = Pattern.compile(regexBuilder.toString()).matcher(input);
        StringBuffer out = new StringBuffer(input.length() + (input.length() / 10));
        while (matcher.find()) {
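            // quoteReplacement stops '$' and '\' in the replacement text from being read as group references
            matcher.appendReplacement(out, Matcher.quoteReplacement(replacements.get(matcher.group())));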
        }
        matcher.appendTail(out);
        return out.toString();
    }
    

    Test code (ImmutableMap is from Guava):

    System.out.println(replace("cat dog fish dog fish cat",
        ImmutableMap.of("cat", "dog", "dog", "fish", "fish", "cat")));
    

    Output:

    dog fish cat fish cat dog

    Obviously this solution only makes sense for many replacements; otherwise it's overkill.
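
    One detail worth checking with the alternation approach: when one key is a prefix of another (say "cat" and "catfish"), the regex engine takes whichever alternative is listed first, and that depends on the map's iteration order. The sketch below is a variant, not the code above. It sorts the keys longest-first and, assuming Java 9 or later, uses Matcher.replaceAll with a function instead of the appendReplacement loop. The RegexReplacer class name is made up for illustration.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

final class RegexReplacer {
    static String replace(String input, Map<String, String> replacements) {
        if (input == null || input.isEmpty() || replacements == null || replacements.isEmpty()) {
            return input;
        }
        // Longest keys first, so "catfish" is tried before "cat" at the same position
        String regex = replacements.keySet().stream()
                .sorted((a, b) -> Integer.compare(b.length(), a.length()))
                .map(Pattern::quote)
                .collect(Collectors.joining("|"));
        // Matcher.replaceAll(Function) requires Java 9 or later;
        // quoteReplacement keeps '$' and '\' in the replacement text literal
        return Pattern.compile(regex).matcher(input)
                .replaceAll(m -> Matcher.quoteReplacement(replacements.get(m.group())));
    }
}
```

    With the map {"cat" -> "dog", "catfish" -> "shark"}, the input "catfish cat" becomes "shark dog" regardless of the map's iteration order.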

  • 2021-01-12 21:43
    import java.util.HashMap;
    import java.util.Iterator;
    import java.util.Map;
    import java.util.regex.Matcher;
    import java.util.regex.Pattern;
    
    public class MyReplace {
        public Map<String, String> replacements;
    
        public MyReplace() {
            replacements = new HashMap<String, String>();
    
            replacements.put("a", "Apple");
            replacements.put("b", "Banana");
            replacements.put("c", "Cantaloupe");
            replacements.put("d", "Date");
            String word = "a b c d a b c d";
    
            // Build an alternation of the quoted keys (equivalent to a|b|c|d here)
            String ss = "";
            Iterator<String> i = replacements.keySet().iterator();
            while (i.hasNext()) {
                ss += Pattern.quote(i.next());
                if (i.hasNext()) {
                    ss += "|";
                }
            }
    
            Pattern pattern = Pattern.compile(ss);
            StringBuilder buffer = new StringBuilder();
            // Test one character at a time, so only single-character keys can ever match
            for (int j = 0, k = 1; j < word.length(); j++, k++) {
                String s = word.substring(j, k);
                Matcher matcher = pattern.matcher(s);
                if (matcher.find()) {
                    buffer.append(replacements.get(s));
                } else {
                    buffer.append(s);
                }
            }
            System.out.println(buffer.toString());
        }
    
        public static void main(String[] args) {
            new MyReplace();
        }
    }
    

    Output: Apple Banana Cantaloupe Date Apple Banana Cantaloupe Date
