Java String encoding (UTF-8)

猫巷女王i 2020-12-05 10:46

I have come across this line of legacy code, which I am trying to figure out:

String newString = new String(oldString.getBytes("UTF-8"), "UTF-8");


        
2 Answers
  • 2020-12-05 11:47

    How is this different from the following?

    This line of code here:

    String newString = new String(oldString.getBytes("UTF-8"), "UTF-8");
    

    constructs a new String object (i.e. a copy of oldString), while this line of code:

    String newString = oldString;
    

    declares a new variable of type java.lang.String and initializes it to refer to the same String object as the variable oldString.

    Is there any scenario in which the two lines will have different outputs?

    Absolutely:

    String newString = oldString;
    boolean isSameInstance = newString == oldString; // isSameInstance == true
    

    vs.

    String newString = new String(oldString.getBytes("UTF-8"), "UTF-8");
    // isSameInstance == false: new always creates a distinct object
    boolean isSameInstance = newString == oldString;
    

    a_horse_with_no_name (see comment) is right of course. The equivalent of

    String newString = new String(oldString.getBytes("UTF-8"), "UTF-8");
    

    is

    String newString = new String(oldString);
    

    minus the subtle difference with respect to the encoding that Peter Lawrey explains in his answer.
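
To make the comparison concrete, here is a minimal sketch that checks both forms against a well-formed string. The class name, the helper roundTrip and the sample text are made up for illustration. It uses StandardCharsets.UTF_8 instead of the "UTF-8" charset name only to avoid the checked UnsupportedEncodingException; that is a substitution, not what the legacy code does.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; the name and the sample text are illustrative only.
final class CopyVersusRoundTrip {

    // Encode to UTF-8 bytes and decode them again, as the legacy line does.
    static String roundTrip(String s) {
        return new String(s.getBytes(StandardCharsets.UTF_8), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Well-formed text with non-ASCII characters (written as escapes).
        String oldString = "h\u00e9llo, \u4e16\u754c";

        String copied = new String(oldString);
        String roundTripped = roundTrip(oldString);

        System.out.println(copied.equals(oldString));       // true
        System.out.println(copied == oldString);            // false
        System.out.println(roundTripped.equals(oldString)); // true
        System.out.println(roundTripped == oldString);      // false
    }
}
```

For well-formed input both forms behave the same: equal content, different instance.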

  • 2020-12-05 11:53

    This could be a complicated way of doing

    String newString = new String(oldString);
    

    This shortens the String if the underlying char[] is much longer than the string needs, as could happen with substring() in older Java versions.

    More specifically, however, it checks that every character can be UTF-8 encoded.

    There are some "characters" you can have in a String which cannot be encoded, and these are turned into '?'.

    Any unpaired surrogate, i.e. a char between \uD800 and \uDFFF that is not part of a valid high/low surrogate pair, cannot be encoded and will be turned into '?'.

    String oldString = "\uD800";
    String newString = new String(oldString.getBytes("UTF-8"), "UTF-8");
    System.out.println(newString.equals(oldString));
    

    prints

    false
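
The difference between a lone surrogate and a valid surrogate pair can be sketched as follows. The class and method names are hypothetical. StandardCharsets.UTF_8 is used instead of the "UTF-8" name to avoid the checked exception, and CharsetEncoder.canEncode is shown as one possible way to detect the problem up front instead of silently getting a '?'.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; names are illustrative only.
final class SurrogateRoundTrip {

    // Encode to UTF-8 bytes and decode them again.
    static String roundTrip(String s) {
        return new String(s.getBytes(StandardCharsets.UTF_8), StandardCharsets.UTF_8);
    }

    // Reports whether the whole string can be encoded as UTF-8 without replacement.
    static boolean isUtf8Encodable(String s) {
        return StandardCharsets.UTF_8.newEncoder().canEncode(s);
    }

    public static void main(String[] args) {
        String paired = "\uD83D\uDE00"; // valid high/low pair, code point U+1F600
        String lone = "\uD800";         // high surrogate with no partner

        System.out.println(roundTrip(paired).equals(paired)); // true
        System.out.println(roundTrip(lone));                  // ?
        System.out.println(isUtf8Encodable(paired));          // true
        System.out.println(isUtf8Encodable(lone));            // false
    }
}
```

So the round trip only changes strings that contain malformed UTF-16, which well-formed text never does.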
    