Best approach to serialize XML to stream with Java?

Submitted by 不想你离开。 on 2019-12-25 03:22:09

Question


We serialize/deserialize XML using XStream and just got an OutOfMemoryError.

Firstly, I don't understand why we're getting the error, as we have 500 MB allocated to the server.

The question is: what changes should we make to stay out of trouble? We want to ensure this implementation scales.

Currently we have ~60K objects, each ~50 bytes. We load the 60K POJOs in memory and serialize them to a String, which we send to a web service using HttpClient. When receiving, we get the entire String, then convert it to POJOs. The XML/object hierarchy looks like:

<root>
    <meta>
       <date>10/10/2009</date>
       <type>abc</type>
    </meta>

    <data>
        <field>x</field>
    </data>

    [thousands of <data>]
</root>

I gather the best approach is to not store the POJOs in memory and not write the contents to a single String. Instead we should write the individual <data> POJOs to a stream. XStream supports this, but it seems the <meta> element wouldn't be supported. The data would need to be in the form:

<root> 
    <data>
        <field>x</field>
    </data>

    [thousands of <data>]
</root>

So what is the easiest approach to stream the entire tree?


Answer 1:


You definitely want to avoid serializing your POJOs into a humongous String and then writing that String out. Use the XStream APIs to serialize the POJOs directly to your OutputStream. I ran into the same situation earlier this year when I found that I was generating 200-300 MB XML documents and getting OutOfMemoryErrors. It was very easy to make the switch.

And ditto of course for the reading side. Don't read the XML into a String and ask XStream to deserialize from that String: deserialize directly from the InputStream.

You mention a second issue regarding not being able to serialize the <meta> element and the <data> elements. I don't think this is an XStream problem or limitation as I routinely serialize much more complex structures on the order of:

<myobject>
    <item>foo</item>
    <anotheritem>foo</anotheritem>
    <alist>
        <alistitem>
            <value1>v1</value1>
            <value2>v2</value2>
            <value3>v3</value3>
            ...
        </alistitem>
        ...
        <alistitem>
            <value1>v1</value1>
            <value2>v2</value2>
            <value3>v3</value3>
            ...
        </alistitem>
    </alist>
    <anotherlist>
        <anotherlistitem>
            <valA>A</valA>
            <valB>B</valB>
            <valC>C</valC>
            ...
        </anotherlistitem>
        ...
    </anotherlist>
</myobject>

I've successfully serialized and deserialized nested lists too.




Answer 2:


Not sure what the problem is here...you've found your answer on that webpage.

The example code on the link you provided suggests:

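// Sketch: assumes an XStream instance `xstream` and an Iterable<DataObject> `dataObjects` (placeholder name).
try (Writer someWriter = new FileWriter("filename.xml");
     ObjectOutputStream out = xstream.createObjectOutputStream(someWriter, "root")) {
    // iterate over your objects; each call appends one element under <root>
    for (DataObject dataObject : dataObjects) {
        out.writeObject(dataObject);
    }
} // closing `out` completes the document by writing </root>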

and for reading, nearly identical code but with Reader in place of Writer and Input in place of Output:

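// Sketch: assumes the same `xstream` instance; readObject() throws EOFException once no objects remain.
try (Reader someReader = new FileReader("filename.xml");
     ObjectInputStream in = xstream.createObjectInputStream(someReader)) {
    while (true) {
        DataObject foo = (DataObject) in.readObject();
        // do some stuff with foo here...
    }
} catch (EOFException endOfStream) {
    // all objects have been read
}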



Answer 3:


I'd suggest using tools like VisualVM or Eclipse Memory Analyzer to make sure you don't have a memory leak or a similar problem.

Also, how do you know each object is 50 bytes? That doesn't sound likely.




Answer 4:


Use XMLStreamWriter (or XStream) to serialize it; you can write whatever you want to it. If you have the option of getting the input stream instead of the entire string, use a SAXParser. It is event based and, although the implementation may be a little clumsy, you will be able to read any XML that is thrown at you, even if the XML is huge (I have parsed XML files of more than 2 GB with SAXParser).
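To make the XMLStreamWriter suggestion concrete, here is a minimal sketch using only the JDK's StAX API. It writes the <meta> element once and then emits one <data> element per record pulled from an iterator, so only the current record needs to be in memory. The class name StreamingXmlWriter, the method signature, and the idea of passing the field values as an Iterator<String> are assumptions made for illustration; they are not from the question.

```java
import java.io.ByteArrayOutputStream;
import java.io.OutputStream;
import java.util.Arrays;
import java.util.Iterator;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;

// Hypothetical helper class, named for this example only.
class StreamingXmlWriter {

    /** Writes <root><meta>...</meta><data>...</data>...</root> one record at a time. */
    static void write(OutputStream out, String date, String type, Iterator<String> fields)
            throws XMLStreamException {
        XMLStreamWriter w = XMLOutputFactory.newInstance().createXMLStreamWriter(out, "UTF-8");
        try {
            w.writeStartDocument("UTF-8", "1.0");
            w.writeStartElement("root");

            // The metadata block is written once, before any records.
            w.writeStartElement("meta");
            writeSimple(w, "date", date);
            writeSimple(w, "type", type);
            w.writeEndElement(); // </meta>

            // Only the current record is held here; earlier ones are already on the stream.
            while (fields.hasNext()) {
                w.writeStartElement("data");
                writeSimple(w, "field", fields.next());
                w.writeEndElement(); // </data>
            }

            w.writeEndElement(); // </root>
            w.writeEndDocument();
            w.flush();
        } finally {
            w.close(); // releases the writer; the underlying OutputStream stays open
        }
    }

    private static void writeSimple(XMLStreamWriter w, String name, String text)
            throws XMLStreamException {
        w.writeStartElement(name);
        w.writeCharacters(text); // escapes characters such as < and &
        w.writeEndElement();
    }

    public static void main(String[] args) throws Exception {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        write(out, "10/10/2009", "abc", Arrays.asList("x", "y").iterator());
        System.out.println(out.toString("UTF-8"));
    }
}
```

In a real service the OutputStream would presumably be the HTTP request body or a file rather than a ByteArrayOutputStream, which is used here only to keep the demo self-contained.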

Just as a side note, you should send the binary data, not the string, to an XML parser. XML parsers determine the encoding of the bytes that follow from the XML declaration at the beginning of the document:

<?xml version="1.0" encoding="UTF-8" standalone="no" ?>

A String has already been decoded using some encoding. It's better practice to let the XML parser read the original stream rather than creating a String from it first.
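As a rough illustration of both points (event-based reading and handing the parser raw bytes), the sketch below feeds an InputStream straight to the JDK's SAXParser and passes each <field> value to a callback as soon as its element closes. The sample document declares ISO-8859-1, and the parser picks that up from the declaration without any String conversion on our side. The class name StreamingXmlReader and the Consumer-based callback are assumptions for this example, not part of the original answer.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper class, named for this example only.
class StreamingXmlReader {

    /** Parses raw bytes and hands each <field> value to the callback once its element ends. */
    static void readFields(InputStream in, Consumer<String> onField) throws Exception {
        DefaultHandler handler = new DefaultHandler() {
            private final StringBuilder text = new StringBuilder();
            private boolean inField;

            @Override
            public void startElement(String uri, String localName, String qName, Attributes atts) {
                if ("field".equals(qName)) {
                    inField = true;
                    text.setLength(0);
                }
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                // SAX may deliver one element's text in several chunks, so accumulate.
                if (inField) {
                    text.append(ch, start, length);
                }
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                if ("field".equals(qName)) {
                    inField = false;
                    onField.accept(text.toString());
                }
            }
        };
        // The parser reads the encoding from the XML declaration in the byte stream itself.
        SAXParserFactory.newInstance().newSAXParser().parse(in, handler);
    }

    public static void main(String[] args) throws Exception {
        String xml = "<?xml version=\"1.0\" encoding=\"ISO-8859-1\"?>"
                + "<root><meta><type>abc</type></meta>"
                + "<data><field>caf\u00e9</field></data>"
                + "<data><field>x</field></data></root>";
        // Encode the sample exactly as its declaration claims.
        byte[] bytes = xml.getBytes(StandardCharsets.ISO_8859_1);

        List<String> fields = new ArrayList<>();
        readFields(new ByteArrayInputStream(bytes), fields::add);
        System.out.println(fields);
    }
}
```

The demo collects the values into a list only to print them; with a large document the callback would process each record and discard it, so memory use does not grow with the document size.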



Source: https://stackoverflow.com/questions/1911773/best-approach-to-serialize-xml-to-stream-with-java
