Java Object Serialization Performance tips

别那么骄傲 2021-02-01 11:15

I must serialize a huge tree of objects (7,000) to disk. Originally we kept this tree in a database with Kodo, but it would make thousands upon thousands of queries to load t

9 answers
  • 2021-02-01 11:32

    You can use Colfer to generate the beans, and Java's standard serialization will get a 10x to 1000x performance boost. Unless the data exceeds a gigabyte, chances are serialization will take well under a second.

  • 2021-02-01 11:33

    One optimization is customizing the class descriptors, so that you store the class descriptors in a different database and in the object stream you only refer to them by ID. This reduces the space needed by the serialized data. See for example how in one project the classes SerialUtil and ClassesTable do it.
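
    A minimal sketch of that idea, assuming a shared list of class names stands in for the separate descriptor database. The class names IdObjectOutputStream and IdObjectInputStream are hypothetical and are not the SerialUtil or ClassesTable classes mentioned above; the sketch only relies on the writeClassDescriptor and readClassDescriptor hooks that java.io provides for this purpose.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.ObjectStreamClass;
import java.io.OutputStream;
import java.util.List;

/** Writes a small integer ID in place of each full class descriptor. */
class IdObjectOutputStream extends ObjectOutputStream {
    private final List<String> classTable; // assumed to be persisted elsewhere

    IdObjectOutputStream(OutputStream out, List<String> classTable) throws IOException {
        super(out);
        this.classTable = classTable;
    }

    @Override
    protected void writeClassDescriptor(ObjectStreamClass desc) throws IOException {
        // Register the class name on first use, then write only its index.
        int id = classTable.indexOf(desc.getName());
        if (id < 0) {
            id = classTable.size();
            classTable.add(desc.getName());
        }
        writeInt(id);
    }
}

/** Resolves the integer IDs back to local class descriptors. */
class IdObjectInputStream extends ObjectInputStream {
    private final List<String> classTable; // must match the writer's table

    IdObjectInputStream(InputStream in, List<String> classTable) throws IOException {
        super(in);
        this.classTable = classTable;
    }

    @Override
    protected ObjectStreamClass readClassDescriptor()
            throws IOException, ClassNotFoundException {
        String name = classTable.get(readInt());
        return ObjectStreamClass.lookupAny(Class.forName(name));
    }
}
```

    The reader must be given the same table the writer used, so the table has to be stored and versioned alongside the serialized data.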

    Making classes Externalizable instead of Serializable can give some performance benefits. The downside is that it requires lots of manual work.
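
    As a rough illustration of the manual work involved, here is a sketch of an Externalizable class. NodeData and its two fields are hypothetical and only stand in for whatever a tree node actually holds.

```java
import java.io.Externalizable;
import java.io.IOException;
import java.io.ObjectInput;
import java.io.ObjectOutput;

/** Hypothetical node payload; the class and its fields are illustrative only. */
class NodeData implements Externalizable {
    private static final long serialVersionUID = 1L;

    private long id;
    private String name;

    /** Externalizable requires a public no-arg constructor. */
    public NodeData() {}

    NodeData(long id, String name) {
        this.id = id;
        this.name = name;
    }

    long getId() { return id; }
    String getName() { return name; }

    @Override
    public void writeExternal(ObjectOutput out) throws IOException {
        // Write exactly the fields we need, in a fixed order.
        out.writeLong(id);
        out.writeUTF(name); // assumes name is non-null
    }

    @Override
    public void readExternal(ObjectInput in) throws IOException {
        // Read the fields back in the same order they were written.
        id = in.readLong();
        name = in.readUTF();
    }
}
```

    Every field added later has to be handled by hand in both methods, which is the maintenance cost mentioned above.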

    Then there are other serialization libraries, for example jserial, which can give better performance than Java's default serialization. Also, if the object graph does not include cycles, then it can be serialized a little bit faster, because the serializer does not need to keep track of objects it has seen (see "How does it work?" in jserial's FAQ).

  • 2021-02-01 11:37

    Don't forget to use the 'transient' keyword for instance variables that don't have to be serialized. This gives you a performance boost because you are no longer reading or writing unnecessary data.
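
    A small sketch of the idea, assuming a hypothetical CachedNode class whose derived value can be recomputed after loading and so does not need to be written:

```java
import java.io.Serializable;
import java.util.Locale;

/** Hypothetical class; only 'name' is written to the stream. */
class CachedNode implements Serializable {
    private static final long serialVersionUID = 1L;

    private final String name;

    // Derived data: skipped by serialization, null again after deserialization.
    private transient String cachedUpper;

    CachedNode(String name) {
        this.name = name;
    }

    /** Recomputes the cached value lazily if it is missing. */
    String upper() {
        if (cachedUpper == null) {
            cachedUpper = name.toUpperCase(Locale.ROOT);
        }
        return cachedUpper;
    }

    boolean isCached() {
        return cachedUpper != null;
    }
}
```

    Code that reads a transient field has to cope with it being null (or zero) after deserialization, as upper() does here.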

  • 2021-02-01 11:40

    This is how I would do it, off the top of my head:

    Serialization

    1. Serialize each object individually
    2. Assign each object a unique key
    3. When an object holds a reference to another object, put the unique key for that object in its place in the serialization. (I would use a UUID converted to binary.)
    4. Save each object into a file/database/storage using the unique key

    Deserialization

    1. Start from an arbitrary object (usually the root, I suspect), deserialize it, put it in a map indexed by its unique key, and return it.
    2. When you come across an object key in the serialization stream, first check whether that object has already been deserialized by looking up its unique key in the map. If it has, take it from there. If not, put a lazy-loading proxy in place of the real object; the proxy repeats these two steps for that object and has hooks to load it when you need it.

    Edit: you might need two-pass serialization and deserialization if you have circular references in there. It complicates things a bit, but not that much.
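
    The steps above could be sketched roughly as follows. Item and ItemStore are hypothetical names, an in-memory map stands in for the file or database, and on-demand lookup by key replaces the lazy-loading proxy to keep the example short.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.NoSuchElementException;
import java.util.UUID;

/** Hypothetical object that refers to other objects by key only. */
class Item implements Serializable {
    private static final long serialVersionUID = 1L;

    final UUID id = UUID.randomUUID();
    final String payload;
    final List<UUID> childIds = new ArrayList<>(); // keys, not object references

    Item(String payload) {
        this.payload = payload;
    }
}

/** Hypothetical store; the 'storage' map stands in for a file or database. */
class ItemStore {
    private final Map<UUID, byte[]> storage = new HashMap<>();
    private final Map<UUID, Item> loaded = new HashMap<>();

    /** Serializes one object on its own and files it under its key. */
    void save(Item item) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(item);
        }
        storage.put(item.id, bytes.toByteArray());
    }

    /** Returns the already-loaded instance if there is one, else deserializes it. */
    Item load(UUID id) throws IOException, ClassNotFoundException {
        Item cached = loaded.get(id);
        if (cached != null) {
            return cached;
        }
        byte[] data = storage.get(id);
        if (data == null) {
            throw new NoSuchElementException("No item stored for key " + id);
        }
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            Item item = (Item) in.readObject();
            loaded.put(id, item);
            return item;
        }
    }
}
```

    Because each stored object holds only keys, a cycle between two objects does not cause trouble in this variant: following a reference means calling load(childId) when the referenced object is actually needed.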

  • 2021-02-01 11:51

    For performance, I'd suggest not using java.io serialisation at all. Instead, get down to the bytes yourself.
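
    One way to read "getting down to the bytes" is to write the tree with DataOutputStream in pre-order, storing each node's child count. This is only a sketch; RawNode and its single name field are hypothetical.

```java
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical tree node written by hand, without java.io object serialisation. */
class RawNode {
    final String name;
    final List<RawNode> children = new ArrayList<>();

    RawNode(String name) {
        this.name = name;
    }

    /** Writes this node, its child count, then each child (pre-order). */
    void writeTo(DataOutputStream out) throws IOException {
        out.writeUTF(name);
        out.writeInt(children.size());
        for (RawNode child : children) {
            child.writeTo(out);
        }
    }

    /** Reads a node and, recursively, as many children as were recorded. */
    static RawNode readFrom(DataInputStream in) throws IOException {
        RawNode node = new RawNode(in.readUTF());
        int count = in.readInt();
        for (int i = 0; i < count; i++) {
            node.children.add(readFrom(in));
        }
        return node;
    }
}
```

    The recursion here is as deep as the tree, so a very deep tree would call for an explicit stack instead. Wrapping the underlying file stream in a BufferedOutputStream or BufferedInputStream is usually worthwhile as well.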

    If you do use java.io serialisation for the tree, you may need to make sure your recursion doesn't get too deep, either by flattening (as, say, TreeSet does) or by arranging to serialise the deepest nodes first (so you have back references rather than nested readObject calls).

    I would be surprised if there wasn't a way in Kodo to read the entire tree in one (or a few) goes.

  • 2021-02-01 11:53

    I would recommend implementing custom writeObject() and readObject() methods. That way you can avoid writing the child nodes for each node in the tree. With default serialization, each node is serialized together with all its children.

    For example, writeObject() of a Tree class should iterate through all the nodes of the tree and write only the nodes' data (not the Node objects themselves), together with markers that identify the tree level.

    You can look at LinkedList to see how these methods are implemented there. It uses the same approach to avoid writing the prev and next entries for every single entry.
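
    A sketch of what such a Tree class might look like, assuming string values and using the depth of each node as the level marker. The Tree and Node classes here are hypothetical, and -1 is an arbitrary end marker chosen for the example.

```java
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

/** Hypothetical tree that writes (depth, value) pairs instead of Node objects. */
class Tree implements Serializable {
    private static final long serialVersionUID = 1L;

    /** Plain node; deliberately not Serializable. */
    static final class Node {
        final String value;
        final List<Node> children = new ArrayList<>();

        Node(String value) {
            this.value = value;
        }
    }

    transient Node root; // written by hand in writeObject below

    Tree(Node root) {
        this.root = root;
    }

    private void writeObject(ObjectOutputStream out) throws IOException {
        out.defaultWriteObject();
        // Pre-order walk with explicit stacks, so deep trees do not recurse.
        Deque<Node> nodes = new ArrayDeque<>();
        Deque<Integer> depths = new ArrayDeque<>();
        if (root != null) {
            nodes.push(root);
            depths.push(0);
        }
        while (!nodes.isEmpty()) {
            Node node = nodes.pop();
            int depth = depths.pop();
            out.writeInt(depth);       // level marker
            out.writeUTF(node.value);  // node data only
            // Push children in reverse so they are popped in original order.
            for (int i = node.children.size() - 1; i >= 0; i--) {
                nodes.push(node.children.get(i));
                depths.push(depth + 1);
            }
        }
        out.writeInt(-1); // end marker
    }

    private void readObject(ObjectInputStream in)
            throws IOException, ClassNotFoundException {
        in.defaultReadObject();
        List<Node> path = new ArrayList<>(); // path.get(d): latest node at depth d
        for (int depth = in.readInt(); depth >= 0; depth = in.readInt()) {
            Node node = new Node(in.readUTF());
            if (depth == 0) {
                root = node;
            } else {
                path.get(depth - 1).children.add(node);
            }
            while (path.size() > depth) {
                path.remove(path.size() - 1);
            }
            path.add(node);
        }
    }
}
```

    Since both methods iterate rather than recurse, the depth of the tree does not translate into nested writeObject or readObject calls.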
