Truncating Strings by Bytes

醉酒成梦 2021-02-06 04:21

I created the following for truncating a string in Java to a new string with a given number of bytes.

        String truncatedValue = "";
        String curren         


        
13 answers
  •  春和景丽
    2021-02-06 04:44

    As noted, Peter Lawrey's solution has a major performance disadvantage (~3,500 ms for 10,000 calls). Rex Kerr's was much faster (~500 ms for 10,000 calls), but its result was not accurate: it cut much more than it needed to (in one example it left 3,500 bytes instead of the 4,000 allowed). Here is my solution (~250 ms for 10,000 calls), which assumes that the maximum length of a UTF-8 character is 4 bytes (thanks, Wikipedia):

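    // Requires: import java.nio.charset.StandardCharsets;
    public static String cutWord(String word, int dbLimit) {
        // Upper bound on the number of UTF-8 bytes a single char can need.
        final double MAX_UTF8_CHAR_LENGTH = 4.0;
        // Every char encodes to at least one byte, so anything past dbLimit chars cannot fit.
        if (word.length() > dbLimit) {
            word = word.substring(0, dbLimit);
        }
        // With at most dbLimit / 4 chars the string always fits, so the encoding step is skipped.
        if (word.length() > dbLimit / MAX_UTF8_CHAR_LENGTH) {
            int residual = word.getBytes(StandardCharsets.UTF_8).length - dbLimit;
            if (residual > 0) {
                int tempResidual = residual, start, end = word.length();
                // Cut chunks off the tail until the excess bytes are gone.
                // Note: substring works on UTF-16 code units, so a cut can fall inside a surrogate pair.
                while (tempResidual > 0) {
                    start = end - ((int) Math.ceil(tempResidual / MAX_UTF8_CHAR_LENGTH));
                    tempResidual -= word.substring(start, end).getBytes(StandardCharsets.UTF_8).length;
                    end = start;
                }
                word = word.substring(0, end);
            }
        }
        return word;
    }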
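
For comparison, here is a minimal sketch of a different technique from the one above: let `java.nio`'s `CharsetEncoder` encode into a buffer of exactly the allowed size, then check how many chars it consumed. The class and method names (`Utf8Truncate`, `truncateToBytes`) are hypothetical, the sketch assumes well-formed input, and it has not been benchmarked against the timings quoted above.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, for illustration only.
final class Utf8Truncate {

    // Returns the longest prefix of s whose UTF-8 encoding fits in maxBytes.
    // Assumes s is well-formed UTF-16 (no unpaired surrogates).
    static String truncateToBytes(String s, int maxBytes) {
        CharsetEncoder encoder = StandardCharsets.UTF_8.newEncoder();
        ByteBuffer out = ByteBuffer.allocate(maxBytes); // room for exactly maxBytes
        CharBuffer in = CharBuffer.wrap(s);
        // Encoding stops as soon as the next character no longer fits in 'out';
        // 'in.position()' then counts only the chars that were fully encoded,
        // so a surrogate pair is either kept whole or dropped whole.
        encoder.encode(in, out, true);
        return s.substring(0, in.position());
    }
}
```

For example, `truncateToBytes("héllo", 2)` returns `"h"`, because `é` needs two bytes and only one is left after the `h`.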
    
