What is the best way to fully read a stream of objects from a file in Java?

星月不相逢 2021-01-14 17:33

I'm creating a potentially long log of objects and do not want to keep them all in memory before writing to a file, so I can't write a serialized collection of the objects to a file.

4 answers
  • 2021-01-14 17:58

    Write a boolean flag before each object, and write a final false after the last object. The stream you write out then looks like this:

    true
    <object>
    true
    <object>
    true
    <object>
    false
    

    Then, when reading them back in, you check the flag (you know there will always be one after each object) to decide whether or not to read another one.

    A boolean is stored very compactly in a serialization stream, so the flags shouldn't add much to the file size.
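
    The flag layout above can be sketched as follows. This is a minimal illustration, not a definitive implementation: the class name FlaggedObjectLog, its method signatures, and the Consumer callback are assumptions made for the example. The reader hands each object to a callback instead of collecting them, so nothing forces the whole log into memory.

```java
import java.io.*;
import java.util.function.Consumer;

// Hypothetical helper; the class and method names are illustrative only.
class FlaggedObjectLog {

    // Writes `true` before every object and a single `false` after the last one.
    static void writeAll(OutputStream out, Iterable<? extends Serializable> items) throws IOException {
        try (ObjectOutputStream oos = new ObjectOutputStream(out)) {
            for (Serializable item : items) {
                oos.writeBoolean(true);
                oos.writeObject(item);
            }
            oos.writeBoolean(false); // terminator: no more objects follow
        }
    }

    // Reads a flag, then an object, until the flag is `false`.
    static void readAll(InputStream in, Consumer<Object> handler)
            throws IOException, ClassNotFoundException {
        try (ObjectInputStream ois = new ObjectInputStream(in)) {
            while (ois.readBoolean()) {
                handler.accept(ois.readObject());
            }
        }
    }
}
```

    Because the flag comes first, an empty log is simply a single false, and the reader never has to rely on an exception to find the end.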

  • 2021-01-14 18:00

    Your code is incorrect. readObject() doesn't return null at end of stream; it throws EOFException, so catch that. It returns null only if you wrote a null. You don't need the booleans or marker objects suggested in the other answers.
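
    A read loop along these lines might look like the sketch below. The class name EofTerminatedLog and the callback parameter are assumptions for illustration; only ObjectInputStream.readObject() and EOFException come from the standard library.

```java
import java.io.*;
import java.util.function.Consumer;

// Hypothetical helper; the class and method names are illustrative only.
class EofTerminatedLog {

    // Reads objects until the stream is exhausted and returns how many were read.
    static int readAll(InputStream in, Consumer<Object> handler)
            throws IOException, ClassNotFoundException {
        int count = 0;
        try (ObjectInputStream ois = new ObjectInputStream(in)) {
            while (true) {
                Object obj;
                try {
                    obj = ois.readObject();
                } catch (EOFException endOfStream) {
                    break; // normal termination: nothing left to read
                }
                handler.accept(obj); // obj may be null if a null was written
                count++;
            }
        }
        return count;
    }
}
```

    Keeping the catch narrowly around readObject() means an EOFException thrown by the handler itself is not mistaken for the end of the log.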

  • 2021-01-14 18:15

    I'm creating a potentially long log of objects and do not want to keep them all in memory before writing to a file, so I can't write a serialized collection of the objects to a file

    Java serialization does not meet this requirement as-is: the output stream keeps strong references to every object it has written, so that it can emit a back reference if the same object is serialized again. You can verify this by running the following program, which eventually fails with an OutOfMemoryError:

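    import java.io.*;

    public class WithoutReset {
        public static void main(String[] args) throws Exception {
            OutputStream os = new FileOutputStream("C:\\test");
            // try-with-resources closes the stream even if writing fails
            try (ObjectOutputStream oos = new ObjectOutputStream(os)) {
                for (int i = 0; i < 1_000_000_000; i++) {
                    // each boxed Integer stays referenced by the stream after it is written
                    oos.writeObject(i);
                }
            }
        }
    }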
    

    A similar problem exists when deserializing the file: to resolve back references, the input stream very likely keeps every previously read object alive.

    If you really need to release these objects before the stream is fully written, you can call ObjectOutputStream.reset() after each object (or batch of objects), at the cost of losing the ability to resolve back references to objects written before the reset. That is, the following program will not throw an OutOfMemoryError:

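    import java.io.*;

    public class WithReset {
        public static void main(String[] args) throws Exception {
            OutputStream os = new FileOutputStream("C:\\test");
            // try-with-resources closes the stream even if writing fails
            try (ObjectOutputStream oos = new ObjectOutputStream(os)) {
                for (int i = 0; i < 1_000_000_000; i++) {
                    oos.writeObject(i);
                    // discard the stream's record of written objects so they can be collected
                    oos.reset();
                }
            }
        }
    }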
    

    Note that the metadata about the classes being serialized is written anew after each reset, which is quite wasteful (the program above writes about 80 bytes per Integer), so you should not reset too often; perhaps once every 100 objects.
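
    Resetting once per batch rather than once per object could be sketched like this. The class name BatchResetWriter and the RESET_INTERVAL constant are assumptions for illustration, and the value 100 is only the rough guess mentioned above, not a measured optimum.

```java
import java.io.*;
import java.util.Iterator;

// Hypothetical helper; the class name and the interval are illustrative only.
class BatchResetWriter {

    // Assumed batch size; tune it against real object sizes and heap limits.
    static final int RESET_INTERVAL = 100;

    // Streams objects from an iterator so the caller never needs the whole log in memory.
    static void writeAll(OutputStream out, Iterator<? extends Serializable> items) throws IOException {
        try (ObjectOutputStream oos = new ObjectOutputStream(out)) {
            long written = 0;
            while (items.hasNext()) {
                oos.writeObject(items.next());
                if (++written % RESET_INTERVAL == 0) {
                    // Forget previously written objects; class metadata is re-sent afterwards.
                    oos.reset();
                }
            }
        }
    }
}
```

    The reader needs no special handling: ObjectInputStream processes the reset markers transparently while reading objects.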

    For detecting the end of stream, I find bozho's suggestion of an EOF object best.

  • 2021-01-14 18:19

    Your solution seems fine. Just make sure you close your stream in a finally clause.

    Alternatively, you can define your own EOF object and write it at the end. When reading, check whether the object you just read is the EofObject, and break at that point.
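
    Both suggestions can be sketched together as below. The names SentinelTerminatedLog and EofObject, and the callback-based reader, are assumptions for illustration. The check uses instanceof rather than ==, because a deserialized marker is a new instance, not the one that was written.

```java
import java.io.*;
import java.util.function.Consumer;

// Hypothetical helper; the class and method names are illustrative only.
class SentinelTerminatedLog {

    // Marker written after the last real object.
    static final class EofObject implements Serializable {
        private static final long serialVersionUID = 1L;
    }

    static void writeAll(OutputStream out, Iterable<? extends Serializable> items) throws IOException {
        ObjectOutputStream oos = new ObjectOutputStream(out);
        try {
            for (Serializable item : items) {
                oos.writeObject(item);
            }
            oos.writeObject(new EofObject()); // end-of-log marker
        } finally {
            oos.close(); // runs even if a write fails
        }
    }

    static void readAll(InputStream in, Consumer<Object> handler)
            throws IOException, ClassNotFoundException {
        ObjectInputStream ois = new ObjectInputStream(in);
        try {
            Object obj;
            // Stop as soon as the marker is read.
            while (!((obj = ois.readObject()) instanceof EofObject)) {
                handler.accept(obj);
            }
        } finally {
            ois.close();
        }
    }
}
```

    If the writer crashes before the marker is written, this reader ends with an EOFException instead, which distinguishes a truncated log from a complete one.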
