What is an efficient way to replace many characters in a string?

前端 未结 9 1044
梦谈多话
梦谈多话 2020-12-28 14:31

String handling in Java is something I\'m trying to learn to do well. Currently I want to take in a string and replace any characters I find.

Here is my current inef

相关标签:
9条回答
  • 2020-12-28 14:42

    What i see being inefficient is that you are gonna check again characters that have already been replaced, which is useless.

    I would get the charArray of the String instance, iterate over it, and for each character spam a series of if-else like this:

    char[] array = word.toCharArray();
    for(int i=0; i<array.length; ++i){
        char currentChar = array[i];
        if(currentChar.equals('é'))
            array[i] = 'e';
        else if(currentChar.equals('ö'))
            array[i] = 'o';
        else if(//...
    }
    
    0 讨论(0)
  • 2020-12-28 14:43

    I doubt, that you can speed up the 'character replacement' at all really. As for the case of regular expression replacement, you may compile the regexs beforehand

    0 讨论(0)
  • 2020-12-28 14:45

    I just implemented this utility class that replaces a char or a group of chars of a String. It is equivalent to bash tr and perl tr///, aka, transliterate. I hope it helps someone!

    package your.package.name;
    
    /**
     * Utility class that replaces chars of a String, aka, transliterate.
     * 
     * It's equivalent to bash 'tr' and perl 'tr///'.
     *
     */
    public class ReplaceChars {
    
        public static String replace(String string, String from, String to) {
            return new String(replace(string.toCharArray(), from.toCharArray(), to.toCharArray()));
        }
    
        public static char[] replace(char[] chars, char[] from, char[] to) {
    
            char[] output = chars.clone();
            for (int i = 0; i < output.length; i++) {
                for (int j = 0; j < from.length; j++) {
                    if (output[i] == from[j]) {
                        output[i] = to[j];
                        break;
                    }
                }
            }
            return output;
        }
    
        /**
         * For tests!
         */
        public static void main(String[] args) {
    
            // Example from: https://en.wikipedia.org/wiki/Caesar_cipher
            String string = "THE QUICK BROWN FOX JUMPS OVER THE LAZY DOG";
            String from = "ABCDEFGHIJKLMNOPQRSTUVWXYZ";
            String to = "XYZABCDEFGHIJKLMNOPQRSTUVW";
    
            System.out.println();
            System.out.println("Cesar cypher: " + string);
            System.out.println("Result:       " + ReplaceChars.replace(string, from, to));
        }
    }
    

    This is the output:

    Cesar cypher: THE QUICK BROWN FOX JUMPS OVER THE LAZY DOG
    Result:       QEB NRFZH YOLTK CLU GRJMP LSBO QEB IXWV ALD
    
    0 讨论(0)
  • 2020-12-28 14:47

    You could create a table of String[] which is Character.MAX_VALUE in length. (Including the mapping to lower case)

    As the replacements got more complex, the time to perform them would remain the same.

    private static final String[] REPLACEMENT = new String[Character.MAX_VALUE+1];
    static {
        for(int i=Character.MIN_VALUE;i<=Character.MAX_VALUE;i++)
            REPLACEMENT[i] = Character.toString(Character.toLowerCase((char) i));
        // substitute
        REPLACEMENT['á'] =  "a";
        // remove
        REPLACEMENT['-'] =  "";
        // expand
        REPLACEMENT['æ'] = "ae";
    }
    
    public String convertWord(String word) {
        StringBuilder sb = new StringBuilder(word.length());
        for(int i=0;i<word.length();i++)
            sb.append(REPLACEMENT[word.charAt(i)]);
        return sb.toString();
    } 
    
    0 讨论(0)
  • 2020-12-28 14:49

    My suggestion would be:

    • Convert the String to a char[] array
    • Run through the array, testing each character one by one (e.g. with a switch statement) and replacing it if needed
    • Convert the char[] array back to a String

    I think this is probably the fastest performance you will get in pure Java.

    EDIT: I notice you are doing some changes that change the length of the string. In this case, the same principle applies, however you need to keep two arrays and increment both a source index and a destination index separately. You might also need to resize the destination array if you run out of target space (i.e. reallocate a larger array and arraycopy the existing destination array into it)

    0 讨论(0)
  • 2020-12-28 14:52

    My implementation is based on look up table.

    public static String convertWord(String str) {
        char[] words = str.toCharArray();
        char[] find = {'á','é','ú','ý','ð','ó','ö','æ','þ','-','.',
                '/'};
        String[] replace = {"a","e","u","y","d","o","o","ae","th"};
        StringBuilder out = new StringBuilder(str.length());
        for (int i = 0; i < words.length; i++) {
            boolean matchFailed = true;
            for(int w = 0; w < find.length; w++) {
                if(words[i] == find[w]) {
                    if(w < replace.length) {
                        out.append(replace[w]);
                    }
                    matchFailed = false;
                    break;
                }
            }
            if(matchFailed) out.append(words[i]);
        }
        return out.toString();
    }
    
    0 讨论(0)
提交回复
热议问题