Question
I am trying to compare 2 byte[] arrays which are the results of serialization of the same object:
- one byte[] is created by serializing the object
- the other by deserializing the 1st byte[] and then serializing it again.
I do not understand how these 2 arrays can be different. Deserializing the first byte[] should reconstruct the original object, and serializing that object is the same as serializing the original one. So, the 2 byte[] should be the same. However, under certain circumstances they can be different, apparently.
The object I am serializing (State) holds a list of another object (MapWrapper), which in turn holds a single collection. Depending on the collection, I get different results from my comparison code.
Here is the MCVE:
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
public class Test {

    public static void main(String[] args) {
        State state = new State();
        state.maps.add(new MapWrapper());

        byte[] pBA = stateToByteArray(state);
        State pC = byteArrayToState(pBA);
        byte[] zero = stateToByteArray(pC);
        System.out.println(Arrays.equals(pBA, zero)); // see output below

        State pC2 = byteArrayToState(pBA);
        byte[] zero2 = stateToByteArray(pC2);
        System.out.println(Arrays.equals(zero2, zero)); // always true
    }

    public static byte[] stateToByteArray(State s) {
        try {
            ByteArrayOutputStream bos = new ByteArrayOutputStream();
            ObjectOutputStream oos = new ObjectOutputStream(bos);
            oos.writeObject(s);
            return bos.toByteArray();
        } catch (IOException e) {
            e.printStackTrace();
        }
        return null;
    }

    public static State byteArrayToState(byte[] bytes) {
        ObjectInputStream ois;
        try {
            ois = new ObjectInputStream(new ByteArrayInputStream(bytes));
            return (State) ois.readObject();
        } catch (IOException | ClassNotFoundException e) {
            e.printStackTrace();
        }
        return null;
    }
}

class State implements Serializable {
    private static final long serialVersionUID = 1L;
    List<MapWrapper> maps = new ArrayList<>();
}

class MapWrapper implements Serializable {
    private static final long serialVersionUID = 1L;
    // Different options, choose one! The trailing comment is the first printed result.
    // List<Integer> ints = new ArrayList<>();         // true
    // List<Integer> ints = new ArrayList<>(3);        // true
    // Map<String, Integer> map = new HashMap<>();     // true
    // Map<String, Integer> map = new HashMap<>(2);    // false
}
For some reason, if MapWrapper contains a HashMap (or LinkedHashMap) that is initialized with an initial capacity, the serialization gives a different result than a serialization-deserialization-serialization.
I added a 2nd iteration of deserialization-serialization and compared to the 1st. They are always equal. The difference manifests only after the first iteration.
Note that I must create a MapWrapper and add it to the list in State, as done at the start of main, to cause this.
As far as I know, the initial capacity is a performance parameter only. Using the default one or a specified one should not change behavior or functionality.
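That expectation can be checked in isolation. The following is only a minimal sketch: the class name CapacityVsEquality and the serialize helper are made up for illustration and are not part of the MCVE. It builds two empty maps that differ only in their requested initial capacity and compares them twice, once with equals() and once byte by byte after serialization.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.UncheckedIOException;
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper class, for illustration only.
class CapacityVsEquality {

    // Serializes any object to a byte array; closing the stream flushes it.
    static byte[] serialize(Object o) {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(bos)) {
            oos.writeObject(o);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bos.toByteArray();
    }

    public static void main(String[] args) {
        Map<String, Integer> defaultCapacity = new HashMap<>();
        Map<String, Integer> smallCapacity = new HashMap<>(2);

        // Logical comparison: looks only at the mappings.
        System.out.println("equals:     " + defaultCapacity.equals(smallCapacity));

        // Byte comparison: also sees whatever HashMap chooses to write out.
        System.out.println("same bytes: "
                + Arrays.equals(serialize(defaultCapacity), serialize(smallCapacity)));
    }
}
```

With the HashMap implementation discussed in this thread, the two maps should compare equal while their serialized forms do not match, which suggests that byte equality is a stricter test than the behavioral equivalence described above.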
I am using jdk1.8.0_25 on Windows 7.
Why does this happen?
Answer 1:
The following line and comment in HashMap's readObject source code explain the difference:
s.readInt(); // Read and ignore number of buckets
Indeed, looking at the hex of the bytes, the difference is between a number 2 (your configured number of buckets) and a number 16 (the default number of buckets). I haven't checked that's what this particular byte means; but it'd be quite a coincidence if it's something else, considering that's the only difference.
<snip> 08 00 00 00 02 00 00 00 00 78 78 // Original
<snip> 08 00 00 00 10 00 00 00 00 78 78 // Deserialized+serialized.
                   ^
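The comparison above can be reproduced without reading a hex dump. This is a rough sketch rather than a definitive tool: the class BucketCountDiff and its helper methods are hypothetical names. It serializes a bare HashMap created with capacity 2, round-trips it once, and lists every byte position where the two arrays disagree. Only the 2 versus 16 bucket count is taken from the dump above; other serialized fields may show up in the list as well.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;

// Hypothetical helper class, for illustration only.
class BucketCountDiff {

    static byte[] serialize(Object o) {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(bos)) {
            oos.writeObject(o);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bos.toByteArray();
    }

    static Object deserialize(byte[] bytes) {
        try (ObjectInputStream ois = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return ois.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    // Returns one entry per differing position, formatted as "index: a -> b"
    // with byte values shown as unsigned decimals.
    static List<String> differences(byte[] a, byte[] b) {
        List<String> out = new ArrayList<>();
        int common = Math.min(a.length, b.length);
        for (int i = 0; i < common; i++) {
            if (a[i] != b[i]) {
                out.add(i + ": " + (a[i] & 0xFF) + " -> " + (b[i] & 0xFF));
            }
        }
        if (a.length != b.length) {
            out.add("length: " + a.length + " -> " + b.length);
        }
        return out;
    }

    public static void main(String[] args) {
        byte[] original = serialize(new HashMap<String, Integer>(2));
        byte[] roundTripped = serialize(deserialize(original));
        differences(original, roundTripped).forEach(System.out::println);
    }
}
```

Because readObject discards the stored bucket count, the deserialized empty map falls back to the default capacity, so a line ending in "2 -> 16" is what this answer predicts for the re-serialized bytes.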
Source: https://stackoverflow.com/questions/38635375/why-does-specifying-maps-initial-capacity-cause-subsequent-serializations-to-gi