Question
I am writing a Java application that is a document store. I create my objects and save them to disk using serialization.
I came across an error when loading my objects from disk, because I had changed the base class of the objects that were serialized.
This seems like a bad way to manage stored objects: if I updated my software with changes to my base class, all of my objects on disk would become invalid.
Is there any guidance or best practice for dealing with this issue? Or is there a better way for me to save my data?
Answer 1:
You’ll want to read the Java Object Serialization Specification, specifically the Compatible Java Type Evolution section and the section immediately following it, Type Changes Affecting Serialization.
Section 1.10 states:
For serializable objects, sufficient information is kept to restore those objects even if a different (but compatible) version of the implementation of the class is present.
As a developer, you are responsible for making sure that changes to your classes do not conflict with earlier serialized versions. It’s not as hard as you might think. Mostly, you need to avoid incompatible changes:
- Do not delete a field. If it is no longer used, deprecate it. (This includes making an instance field a static field; static fields are not serialized, so this is equivalent to removing it as far as serialization is concerned.)
- Do not change a field’s type.
You can save additional data by adding a private void writeObject(ObjectOutputStream) method to your class, and you can perform additional initialization by adding a private void readObject(ObjectInputStream) method. These are described in detail in the documentation for Serializable. Note that the first line of code in those methods should be stream.defaultWriteObject() and stream.defaultReadObject(), respectively.
readObject is important when you add fields to a class, if you want those fields to be initialized. For instance, if you have a new field which you always want to be non-null:
private List<String> names = new ArrayList<>();
Any older instance that was serialized without that field will be deserialized with the field unset. It will remain null, because deserialization does not run the constructors or field initializers of a Serializable class, and every object field starts out as null. You can use readObject to account for this:
private void readObject(ObjectInputStream stream)
        throws IOException,
               ClassNotFoundException {
    // First, do default deserialization
    stream.defaultReadObject();
    // Older streams lack this field, so supply the default here
    if (this.names == null) {
        this.names = new ArrayList<>();
    }
}
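To see the readObject repair in a full round trip, here is a minimal sketch. The ReadObjectDemo and Document names, the title field, and the toBytes/fromBytes helpers are made up for illustration. Setting names to null before saving is only a stand-in for data written before the field existed. The explicit serialVersionUID is an addition not mentioned above: without it, the JVM derives an ID from the class structure, so adding a field would typically cause an InvalidClassException on load.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;

class ReadObjectDemo {
    // Hypothetical stored type; the class and field names are illustrative.
    static class Document implements Serializable {
        // Assumption: a pinned ID so that later field additions stay compatible.
        private static final long serialVersionUID = 1L;

        String title;
        List<String> names = new ArrayList<>();

        Document(String title) {
            this.title = title;
        }

        private void writeObject(ObjectOutputStream stream) throws IOException {
            stream.defaultWriteObject(); // write the regular fields first
        }

        private void readObject(ObjectInputStream stream)
                throws IOException, ClassNotFoundException {
            stream.defaultReadObject(); // read the regular fields first
            if (names == null) {        // field missing (or null) in the stream
                names = new ArrayList<>();
            }
        }
    }

    // Serializes a value into an in-memory byte array.
    static byte[] toBytes(Serializable value) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(value);
        }
        return buffer.toByteArray();
    }

    // Deserializes a value from a byte array produced by toBytes.
    static Object fromBytes(byte[] bytes) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        Document doc = new Document("notes");
        doc.names = null; // stand-in for data saved before the field existed
        Document restored = (Document) fromBytes(toBytes(doc));
        System.out.println(restored.title + " " + restored.names); // prints "notes []"
    }
}
```

The same helpers work with FileOutputStream and FileInputStream in place of the byte array streams when the objects are stored on disk.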
Answer 2:
Depending on your storage requirements, your options include, but are not limited to, the following (hardest to easiest, in my opinion):
- Use an embedded database and store your data on disk through it instead. There are plenty of options to choose from: Derby, HSQLDB, H2, SQLite, and many others. The advantage is that you can use a migration tool like Flyway: whenever your schema changes, you write a migration script and make sure the tool runs when your application starts up. It's actually very simple, but if you don't already know these tools, the learning curve is a bit steeper. You have a lot of flexibility here too.
- Serialize to JSON or XML. Again, there are plenty of tools to choose from, but for generic JSON serialization I recommend Jackson. JSON can be minified (written on one line with no extra whitespace) so that it uses less disk space. This is very easy to work with but less flexible. For example, you don't need to worry much when you only add fields to your classes, but at some point you will change the class hierarchy or extract smaller classes. Then you're on your own: you need to work out a way to migrate your data, and that can be overwhelming.
- Very similar to the previous option, you can use a more disk-efficient format such as Avro. This has all the drawbacks already mentioned.
- Anything you come up with on your own. I would not recommend this. Chances are you will be happy with your solution for a while, because it's simpler or easier to understand, but only for a short while. You then have to maintain more code, worry about efficiency and performance (if you have to), and deal with many more challenges you will inevitably encounter. Don't do it; go for a solution you can get help with.
Answer 3:
You probably need to store a "version" field with your object.
When you serialize your object, write this field with a specific value, such as "1" for the first version.
If you change the serialization format, increment the version number.
When you deserialize objects, read the version number and deserialize using the format expected for that version. If you have added a field since the saved version, handle that field correctly (for example, by giving it a default value). If you have removed a field, ignore it.
Your source code needs to keep the ability to deserialize objects from all previous versions.
You can also use a library for this, such as protobuf or Gson, which let you describe a format made up of fields with a specific version number and automatically handle the expected fields.
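The version-number approach can be sketched with plain DataOutputStream and DataInputStream. Everything here is hypothetical: the VersionedFormatDemo and Document names, the two-version history (version 1 stored only a title, version 2 added an author), and the "unknown" default. It is one possible layout, not a prescribed format.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

class VersionedFormatDemo {
    // Increment this whenever the on-disk layout changes.
    static final int CURRENT_VERSION = 2;

    // Hypothetical record: version 1 had only a title, version 2 added an author.
    static class Document {
        final String title;
        final String author;

        Document(String title, String author) {
            this.title = title;
            this.author = author;
        }
    }

    // Always writes the newest layout, with the version number first.
    static byte[] write(Document doc) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            out.writeInt(CURRENT_VERSION);
            out.writeUTF(doc.title);
            out.writeUTF(doc.author);
        }
        return buffer.toByteArray();
    }

    // Reads the version number, then decodes with the layout for that version.
    static Document read(byte[] bytes) throws IOException {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes))) {
            int version = in.readInt();
            switch (version) {
                case 1: // no author in this layout, so supply a default
                    return new Document(in.readUTF(), "unknown");
                case 2:
                    return new Document(in.readUTF(), in.readUTF());
                default:
                    throw new IOException("Unsupported format version: " + version);
            }
        }
    }

    // Produces bytes in the old version 1 layout, for demonstration only.
    static byte[] writeVersion1(String title) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            out.writeInt(1);
            out.writeUTF(title);
        }
        return buffer.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        Document old = read(writeVersion1("draft"));
        Document current = read(write(new Document("report", "alice")));
        System.out.println(old.title + " by " + old.author);         // prints "draft by unknown"
        System.out.println(current.title + " by " + current.author); // prints "report by alice"
    }
}
```

Each format change adds one case to the switch in read, so the reader for every earlier version stays in the code base, while write only ever produces the newest layout.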
Answer 4:
You can try using XML. JDOM gives you easy tools for converting Java to XML and XML to Java.
Source: https://stackoverflow.com/questions/39393648/what-is-the-best-practice-for-saving-and-retrieving-a-java-object