Split string to equal length substrings in Java

日久生厌 2020-11-22 02:56

How can I split the string "Thequickbrownfoxjumps" into substrings of equal size in Java? E.g. "Thequickbrownfoxjumps" of 4 equal size should give th

21 answers
  •  伪装坚强ぢ
    2020-11-22 03:20

    I asked @Alan Moore, in a comment on the accepted solution, how strings containing newlines could be handled. He suggested using DOTALL.

    Using his suggestion, I created a small sample showing how that works:

    // Requires: import java.util.regex.Pattern;
    public void regexDotAllExample() {
        final String input = "The\nquick\nbrown\r\nfox\rjumps";
        // Split after every 4 characters; \G anchors at the end of the previous match.
        final String regex = "(?<=\\G.{4})";

        Pattern splitByLengthPattern;
        String[] split;

        // Without DOTALL, '.' does not match line terminators.
        splitByLengthPattern = Pattern.compile(regex);
        split = splitByLengthPattern.split(input);
        System.out.println("---- Without DOTALL ----");
        for (int i = 0; i < split.length; i++) {
            // Escape line terminators so that each piece prints on a single line.
            String s = split[i].replace("\n", "\\n").replace("\r", "\\r");
            System.out.println("[Idx: " + i + ", length: " + split[i].length() + "] - " + s);
        }
        /* Output is a single entry longer than the desired split size:
        ---- Without DOTALL ----
        [Idx: 0, length: 26] - The\nquick\nbrown\r\nfox\rjumps
         */

        // DOTALL suggested in Alan Moore's comment on SO: https://stackoverflow.com/a/3761521/1237974
        splitByLengthPattern = Pattern.compile(regex, Pattern.DOTALL);
        split = splitByLengthPattern.split(input);
        System.out.println("---- With DOTALL ----");
        for (int i = 0; i < split.length; i++) {
            String s = split[i].replace("\n", "\\n").replace("\r", "\\r");
            System.out.println("[Idx: " + i + ", length: " + split[i].length() + "] - " + s);
        }
        /* Output is as desired: 7 entries, each with a max length of 4:
        ---- With DOTALL ----
        [Idx: 0, length: 4] - The\n
        [Idx: 1, length: 4] - quic
        [Idx: 2, length: 4] - k\nbr
        [Idx: 3, length: 4] - own\r
        [Idx: 4, length: 4] - \nfox
        [Idx: 5, length: 4] - \rjum
        [Idx: 6, length: 2] - ps
         */
    }
    

    But I also like @Jon Skeet's solution in https://stackoverflow.com/a/3760193/1237974. For maintainability in larger projects, where not everyone is equally experienced with regular expressions, I would probably use Jon's solution.
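
    For comparison, here is a minimal sketch of a regex-free split based on a plain loop and substring(). It is only an illustration of that general idea, not a copy of the linked answer; the class name FixedLengthSplitter and the method name splitEqually are my own assumptions. Since it never uses '.', line terminators need no special handling.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class, named here only for illustration.
final class FixedLengthSplitter {

    // Splits text into consecutive pieces of at most `size` chars.
    // The last piece may be shorter when the length is not a multiple of size.
    static List<String> splitEqually(String text, int size) {
        if (size <= 0) {
            throw new IllegalArgumentException("size must be positive");
        }
        // Pre-size the list with the rounded-up number of pieces.
        List<String> parts = new ArrayList<>((text.length() + size - 1) / size);
        for (int start = 0; start < text.length(); start += size) {
            int end = Math.min(text.length(), start + size);
            parts.add(text.substring(start, end));
        }
        return parts;
    }

    public static void main(String[] args) {
        // Line terminators are ordinary characters for substring().
        List<String> parts = splitEqually("The\nquick\nbrown\r\nfox\rjumps", 4);
        System.out.println(parts.size()); // prints 7
    }
}
```

    Like the regex version, this counts Java chars (UTF-16 code units), so a character outside the Basic Multilingual Plane could be cut in half at a piece boundary.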
