Truncating Strings by Bytes

醉酒成梦 2021-02-06 04:21

I wrote the following to truncate a string in Java to a new string with a given number of bytes.

        String truncatedValue = "";
        String curren         


        
13 Answers
  • 2021-02-06 04:28

    I've improved on Peter Lawrey's solution so that it handles surrogate pairs correctly. I also optimized it using the fact that a Java char (a UTF-16 code unit) never needs more than 3 bytes in UTF-8: a supplementary character takes 4 bytes, but it spans two chars.

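    public static String substring(String text, int maxBytes) {
        // Only inspect characters while the rest could still exceed the budget (at most 3 bytes per char).
        for (int i = 0, len = text.length(); (len - i) * 3 > maxBytes;) {
            int j = text.offsetByCodePoints(i, 1);  // end of the code point that starts at i
            // Subtract this code point's UTF-8 size from the budget; stop before it if it does not fit.
            if ((maxBytes -= text.substring(i, j).getBytes(StandardCharsets.UTF_8).length) < 0)
                return text.substring(0, i);
            i = j;
        }
        return text;  // whatever is left is guaranteed to fit
    }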
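
    To make the 3-bytes-per-char bound concrete, here is a small sketch that prints the UTF-8 size of a few sample strings. The class name `Utf8Widths` and the helper `utf8Length` are made up for illustration and are not part of the answer above.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of the original answer.
final class Utf8Widths {
    // Number of bytes the string occupies when encoded as UTF-8.
    static int utf8Length(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        // 'a', e-acute, euro sign, and an emoji (U+1F600, a surrogate pair).
        String[] samples = {"a", "\u00e9", "\u20ac", "\uD83D\uDE00"};
        for (String s : samples) {
            System.out.println(s.length() + " char(s) -> " + utf8Length(s) + " byte(s)");
        }
    }
}
```

    The first three samples are one char each and take 1, 2 and 3 bytes; the emoji takes 4 bytes but is two chars, so the average per char is still below 3.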
    
  • 2021-02-06 04:29

    Use the UTF-8 CharsetEncoder, and encode until the output ByteBuffer contains as many bytes as you are willing to take, by looking for CoderResult.OVERFLOW.
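
    A minimal sketch of this idea, assuming UTF-8 and made-up names (`Utf8EncoderTruncate`, `truncate`) that do not come from the answer. It encodes into a ByteBuffer of exactly maxBytes and, on CoderResult.OVERFLOW, cuts the string at the number of chars the encoder consumed. It relies on the JDK's UTF-8 encoder stopping before a character that does not fit entirely.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.CoderResult;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, not part of the original answer.
final class Utf8EncoderTruncate {
    static String truncate(String text, int maxBytes) {
        CharsetEncoder encoder = StandardCharsets.UTF_8.newEncoder()
                // Replace unpaired surrogates instead of stopping on them.
                .onMalformedInput(CodingErrorAction.REPLACE)
                .onUnmappableCharacter(CodingErrorAction.REPLACE);
        CharBuffer in = CharBuffer.wrap(text);
        ByteBuffer out = ByteBuffer.allocate(Math.max(0, maxBytes));

        // Encode until the input is exhausted or the output buffer is full.
        CoderResult result = encoder.encode(in, out, true);

        // OVERFLOW: the next character did not fit; in.position() is the
        // number of chars that were fully encoded.
        return result.isOverflow() ? text.substring(0, in.position()) : text;
    }

    public static void main(String[] args) {
        System.out.println(truncate("FOOBAR", 3));     // prints FOO
        System.out.println(truncate("h\u00e9llo", 2)); // prints h (the 2-byte e-acute does not fit)
    }
}
```

    The string is encoded only once and no per-character substrings are created, which is the main appeal of this variant.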

  • 2021-02-06 04:29
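    String s = "FOOBAR";
    
    int limit = 3;
    // Only safe when every character encodes to a single byte (e.g. ASCII);
    // a multi-byte character can otherwise be cut in half.
    s = new String(s.getBytes(StandardCharsets.UTF_8), 0, limit, StandardCharsets.UTF_8);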
    

    Result value of s:

    FOO
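
    Slicing the byte array at a fixed offset can land in the middle of a multi-byte character. One possible way around that, sketched here under the assumption that the encoding is UTF-8, is to back up from the cut point while it sits on a continuation byte (bit pattern 10xxxxxx). The names `Utf8BoundaryTruncate` and `truncate` are made up for this illustration.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, not part of the original answer.
final class Utf8BoundaryTruncate {
    static String truncate(String s, int maxBytes) {
        byte[] bytes = s.getBytes(StandardCharsets.UTF_8);
        if (bytes.length <= maxBytes) {
            return s; // already fits
        }
        int end = Math.max(0, maxBytes);
        // bytes[end] is the first byte that would be dropped. If it is a
        // continuation byte, the cut is inside a character, so move back
        // to that character's lead byte and drop the whole character.
        while (end > 0 && (bytes[end] & 0xC0) == 0x80) {
            end--;
        }
        return new String(bytes, 0, end, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        System.out.println(truncate("FOOBAR", 3));     // prints FOO
        System.out.println(truncate("h\u00e9llo", 2)); // prints h
    }
}
```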
    
  • 2021-02-06 04:33

    The second approach here works well: http://www.jroller.com/holy/entry/truncating_utf_string_to_the

  • 2021-02-06 04:35

    This is mine:

    private static final int FIELD_MAX = 2000;
    private static final Charset CHARSET = Charset.forName("UTF-8");
    
    public String trancStatus(String status) {
    
        if (status == null || status.getBytes(CHARSET).length <= FIELD_MAX) {
            return status;
        }
    
        // Binary search for the longest prefix whose encoding fits in FIELD_MAX bytes.
        // Invariant: the prefix of length `left` always fits.
        int left = 0, right = status.length();
    
        while (left < right) {
            // Round up so that the loop always makes progress.
            int index = left + (right - left + 1) / 2;
    
            if (status.substring(0, index).getBytes(CHARSET).length <= FIELD_MAX) {
                left = index;
            } else {
                right = index - 1;
            }
        }
    
        // Do not cut a surrogate pair in half.
        if (left > 0 && Character.isHighSurrogate(status.charAt(left - 1))) {
            left--;
        }
    
        return status.substring(0, left);
    }
    
  • 2021-02-06 04:37

    You could convert the string to bytes and convert just those bytes back to a string.

    public static String substring(String text, int maxBytes) {
       StringBuilder ret = new StringBuilder();
       for(int i = 0;i < text.length(); i++) {
           // works out how many bytes a character takes, 
           // and removes these from the total allowed.
           // Note: the two halves of a surrogate pair are measured separately here.
           if((maxBytes -= text.substring(i, i+1).getBytes(StandardCharsets.UTF_8).length) < 0) break;
           ret.append(text.charAt(i));
       }
       return ret.toString();
    }
    