How to trim a Java StringBuilder?

南方客 2021-01-11 11:18

I have a StringBuilder object that needs to be trimmed (i.e. all whitespace chars \u0020 and below removed from both ends).

I can't seem to find a method in string

8 answers
  • 2021-01-11 11:26

    Don't worry about having two strings. It's a micro-optimization.

    If you really have detected a bottleneck, you can trim in nearly constant time: just iterate over the first N chars for as long as Character.isWhitespace(c) is true, and do the same from the end.
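
    A minimal sketch of that idea, assuming it is acceptable to modify the builder in place. The class and method names (InPlaceTrim, trimInPlace) are made up for illustration; only standard StringBuilder and Character calls are used.

```java
final class InPlaceTrim {

    /** Trims whitespace from both ends of sb in place and returns the same builder. */
    static StringBuilder trimInPlace(StringBuilder sb) {
        // Drop trailing whitespace first by shortening the builder.
        int end = sb.length();
        while (end > 0 && Character.isWhitespace(sb.charAt(end - 1))) {
            end--;
        }
        sb.setLength(end);

        // Count leading whitespace, then remove it with a single delete call.
        int start = 0;
        while (start < end && Character.isWhitespace(sb.charAt(start))) {
            start++;
        }
        sb.delete(0, start);
        return sb;
    }

    public static void main(String[] args) {
        StringBuilder sb = new StringBuilder("  \t hello world \n ");
        System.out.println("[" + trimInPlace(sb) + "]"); // prints "[hello world]"
    }
}
```

    The scan itself only touches the whitespace at each end; the single delete still has to shift the remaining chars once, so this avoids new String objects rather than all copying.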

  • 2021-01-11 11:29

    Only one of you has taken into account that converting the StringBuilder to a String and then trimming it creates two immutable objects that have to be garbage collected, so the total allocation is:

    1. the StringBuilder object
    2. an immutable String copy of the StringBuilder's contents
    3. an immutable String holding the trimmed result

    So whilst it may appear that trim() is faster, in the real world, with memory under load, it will in fact be worse.
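
    The object count can be checked with a small sketch. The names (TrimAllocations, trimCreatesSecondString) are hypothetical. It relies on the documented behaviour that String.trim() returns the same String object when there is nothing to trim, so the second String is only allocated when the builder really has leading or trailing whitespace.

```java
final class TrimAllocations {

    /** Returns true if toString().trim() produced a second String object. */
    static boolean trimCreatesSecondString(StringBuilder sb) {
        String copy = sb.toString();   // first String: a copy of the builder's contents
        String trimmed = copy.trim();  // either a new, shorter String or the very same object
        return trimmed != copy;        // identity comparison on purpose
    }

    public static void main(String[] args) {
        System.out.println(trimCreatesSecondString(new StringBuilder("  abc  "))); // prints "true"
        System.out.println(trimCreatesSecondString(new StringBuilder("abc")));     // prints "false"
    }
}
```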

  • 2021-01-11 11:35

    I wrote some code. It works, and the test cases are included so you can check. Let me know if this is okay.

    Main code -

    public static StringBuilder trimStringBuilderSpaces(StringBuilder sb) {

        final char space = ' ';

        // Remove spaces at start: find the first non-space char
        int start = 0;
        while (start < sb.length() && sb.charAt(start) == space) {
            start++;
        }
        sb.delete(0, start);

        // Remove the ending spaces: find the index just past the last non-space char.
        // Checking charAt(end - 1) down to index 0 keeps inputs such as "1 " from being emptied.
        int end = sb.length();
        while (end > 0 && sb.charAt(end - 1) == space) {
            end--;
        }
        sb.setLength(end);

        return sb;
    }
    

    The full code with test -

    package source;
    
    import java.io.PrintWriter;
    import java.io.StringWriter;
    import java.util.ArrayList;
    
    public class StringBuilderTrim {
    
        public static void main(String[] args) {
            testCode();
        }
    
        public static void testCode() {
    
            StringBuilder s1 = new StringBuilder("");
            StringBuilder s2 = new StringBuilder(" ");
            StringBuilder s3 = new StringBuilder("  ");
            StringBuilder s4 = new StringBuilder(" 123");
            StringBuilder s5 = new StringBuilder("  123");
            StringBuilder s6 = new StringBuilder("1");
            StringBuilder s7 = new StringBuilder("123 ");
            StringBuilder s8 = new StringBuilder("123  ");
            StringBuilder s9 = new StringBuilder(" 123 ");
            StringBuilder s10 = new StringBuilder("  123  ");
    
            /*
             * Using a rough form of TDD here. Initially, only one test input
             * ("test case") was added and the rest were commented out. Write no
             * code for the method being tested, so the test fails. Write just
             * enough code to make it pass. Then enable the next test. Repeat!
             */
            ArrayList<StringBuilder> ins = new ArrayList<StringBuilder>();
            ins.add(s1);
            ins.add(s2);
            ins.add(s3);
            ins.add(s4);
            ins.add(s5);
            ins.add(s6);
            ins.add(s7);
            ins.add(s8);
            ins.add(s9);
            ins.add(s10);
    
            // Run test
            for (StringBuilder sb : ins) {
                System.out
                        .println("\n\n---------------------------------------------");
                String expected = sb.toString().trim();
                String result = trimStringBuilderSpaces(sb).toString();
                System.out.println("In [" + sb + "]" + ", Expected [" + expected
                        + "]" + ", Out [" + result + "]");
                if (result.equals(expected)) {
                    System.out.println("Success!");
                } else {
                    System.out.println("FAILED!");
                }
                System.out.println("---------------------------------------------");
            }
    
        }
    
        public static StringBuilder trimStringBuilderSpaces(StringBuilder inputSb) {
    
            StringBuilder sb = new StringBuilder(inputSb);
            int len = sb.length();
    
            if (len > 0) {
    
                try {
    
                    int start = 0;
                    int end = 1;
                    char space = ' ';
                    int i = 0;
    
                    // Remove spaces at start
                    for (i = 0; i < len; i++) {
                        if (sb.charAt(i) != space) {
                            break;
                        }
                    }
    
                    end = i;
                    //System.out.println("s = " + start + ", e = " + end);
                    sb.delete(start, end);
    
                    // Remove the ending spaces
                    len = sb.length();
    
                    if (len > 0) {
    
                        // Walk back to just past the last non-space char.
                        // Looking at charAt(i - 1) also covers index 0, so "1 " is not emptied.
                        for (i = len; i > 0; i--) {
                            if (sb.charAt(i - 1) != space) {
                                break;
                            }
                        }
    
                        start = i;
                        end = len;
    
                        //System.out.println("s = " + start + ", e = " + end);
                        sb.delete(start, end);
    
                    }
    
                } catch (Exception ex) {
    
                    StringWriter sw = new StringWriter();
                    PrintWriter pw = new PrintWriter(sw);
                    ex.printStackTrace(pw);
                    sw.toString(); // stack trace as a string
    
                    sb = new StringBuilder("\nNo Out due to error:\n" + "\n" + sw);
                    return sb;
                }
    
            }
    
            return sb;
        }
    }
    
  • 2021-01-11 11:37

    I had exactly the same question at first. However, after thinking it over for five minutes, I realized that you never actually need to trim the StringBuffer! You only need to trim the strings you append to it.

    If you want to trim an initial StringBuffer, you can do this:

    StringBuffer sb = new StringBuffer(initialStr.trim());
    

    If you want to trim StringBuffer on-the-fly, you can do this during append:

    sb.append(addOnStr.trim());
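
    One thing to watch with trimming on append, shown as a rough sketch below (TrimOnAppend and joinTrimmed are hypothetical names): every piece is trimmed, so whitespace between pieces disappears too, not only at the two ends of the final result.

```java
final class TrimOnAppend {

    /** Builds a string from pieces, trimming each piece as it is appended. */
    static String joinTrimmed(String... pieces) {
        StringBuilder sb = new StringBuilder();
        for (String piece : pieces) {
            sb.append(piece.trim());
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // The outer whitespace is gone, but so is the space between the two words.
        System.out.println("[" + joinTrimmed("  hello ", " world  ") + "]"); // prints "[helloworld]"
    }
}
```

    If the inner whitespace matters, trimming only the first and last pieces, or trimming the finished builder, is probably closer to what you want.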
    
  • 2021-01-11 11:40
    strBuilder.replace(0,strBuilder.length(),strBuilder.toString().trim());
    
  • 2021-01-11 11:41

    You should not use the deleteCharAt approach.

    As Boris pointed out, the deleteCharAt method copies the array over every time. The Java 5 source for it looks like this:

    public AbstractStringBuilder deleteCharAt(int index) {
        if ((index < 0) || (index >= count))
            throw new StringIndexOutOfBoundsException(index);
        System.arraycopy(value, index+1, value, index, count-index-1);
        count--;
        return this;
    }
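
    To make the cost concrete, here is a small sketch (LeadingDelete, dropOneByOne and dropAtOnce are made-up names). Both helpers give the same result, but the first triggers one array shift per removed char, while the second calls delete(0, n) once, so the remaining chars are shifted a single time.

```java
final class LeadingDelete {

    /** Removes n leading chars one at a time: each call shifts the rest of the buffer. */
    static StringBuilder dropOneByOne(StringBuilder sb, int n) {
        for (int i = 0; i < n; i++) {
            sb.deleteCharAt(0);
        }
        return sb;
    }

    /** Removes n leading chars with a single call, so the remainder is shifted once. */
    static StringBuilder dropAtOnce(StringBuilder sb, int n) {
        return sb.delete(0, n);
    }

    public static void main(String[] args) {
        String input = "    abc";
        System.out.println(dropOneByOne(new StringBuilder(input), 4)); // prints "abc"
        System.out.println(dropAtOnce(new StringBuilder(input), 4));   // prints "abc"
    }
}
```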
    

    Of course, speculation alone is not enough to choose one method of optimization over another, so I decided to time the 3 approaches in this thread: the original, the delete approach, and the substring approach.

    Here is the code I tested for the original:

    public static String trimOriginal(StringBuilder sb) {
        return sb.toString().trim();
    }
    

    The delete approach:

    public static String trimDelete(StringBuilder sb) {
        while (sb.length() > 0 && Character.isWhitespace(sb.charAt(0))) {
            sb.deleteCharAt(0);
        }
        while (sb.length() > 0 && Character.isWhitespace(sb.charAt(sb.length() - 1))) {
            sb.deleteCharAt(sb.length() - 1);
        }
        return sb.toString();
    }
    

    And the substring approach:

    public static String trimSubstring(StringBuilder sb) {
        int first, last;
    
        for (first=0; first<sb.length(); first++)
            if (!Character.isWhitespace(sb.charAt(first)))
                break;
    
        for (last=sb.length(); last>first; last--)
            if (!Character.isWhitespace(sb.charAt(last-1)))
                break;
    
        return sb.substring(first, last);
    }
    

    I performed 100 tests, each time generating a million-character StringBuilder with ten thousand leading and ten thousand trailing spaces. The testing itself is very basic, but it gives a good idea of how long the methods take.

    Here is the code to time the 3 approaches:

    public static void main(String[] args) {
    
        long originalTime = 0;
        long deleteTime = 0;
        long substringTime = 0;
    
        for (int i=0; i<100; i++) {
    
            StringBuilder sb1 = new StringBuilder();
            StringBuilder sb2 = new StringBuilder();
            StringBuilder sb3 = new StringBuilder();
    
            for (int j=0; j<10000; j++) {
                sb1.append(" ");
                sb2.append(" ");
                sb3.append(" ");
            }
            for (int j=0; j<980000; j++) {
                sb1.append("a");
                sb2.append("a");
                sb3.append("a");
            }
            for (int j=0; j<10000; j++) {
                sb1.append(" ");
                sb2.append(" ");
                sb3.append(" ");
            }
    
            long timer1 = System.currentTimeMillis();
            trimOriginal(sb1);
            originalTime += System.currentTimeMillis() - timer1;
    
            long timer2 = System.currentTimeMillis();
            trimDelete(sb2);
            deleteTime += System.currentTimeMillis() - timer2;
    
            long timer3 = System.currentTimeMillis();
            trimSubstring(sb3);
            substringTime += System.currentTimeMillis() - timer3;
        }
    
        System.out.println("original:  " + originalTime + " ms");
        System.out.println("delete:    " + deleteTime + " ms");
        System.out.println("substring: " + substringTime + " ms");
    }
    

    I got the following output:

    original:  176 ms
    delete:    179242 ms
    substring: 154 ms
    

    As we see, the substring approach provides a very slight optimization over the original "two String" approach. However, the delete approach is extremely slow and should be avoided.

    So to answer your question: you are fine trimming your StringBuilder the way you suggested in the question. The very slight optimization that the substring method offers probably does not justify the excess code.
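
    One caveat if you do go with the substring approach: Character.isWhitespace and String.trim() do not treat exactly the same characters as trimmable. The sketch below (TrimLikeString and trimToString are hypothetical names) adapts the substring idea to the rule from the question, stripping every char at or below U+0020, which is what trim() does.

```java
final class TrimLikeString {

    /** Returns the contents of sb without leading or trailing chars at or below U+0020. */
    static String trimToString(StringBuilder sb) {
        int first = 0;
        int last = sb.length();
        while (first < last && sb.charAt(first) <= ' ') {
            first++;
        }
        while (last > first && sb.charAt(last - 1) <= ' ') {
            last--;
        }
        return sb.substring(first, last);
    }

    public static void main(String[] args) {
        // U+0001 is a control char that trim() strips but Character.isWhitespace rejects.
        StringBuilder sb = new StringBuilder("\u0001 abc \u0001");
        System.out.println(trimToString(sb).equals(sb.toString().trim())); // prints "true"
        System.out.println(Character.isWhitespace('\u0001'));              // prints "false"
    }
}
```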
