Question
Hello fellow developers. Just to make sure, I want to ask this question:
How does an XML SAX parser access the .xml file it is parsing? Does it download the whole file from the given URL?
Is there any point in stopping the parse early so that we can save some kilobytes of data?
Imagine a large .xml file with ordered items. We need only a few items from the top; the other items may already have been processed and stored. If I stop the parsing at a specific point, will I save some data? (I will surely save some time.)
Thanks for any answers.
Answer 1:
SAX parser implementations exist in many languages, and the answer may be implementation-specific. But at least the common Java implementations can read the XML from a stream and need not download the entire document before parsing starts.
An invocation of a Java SAX parser to parse from a URL usually looks something like this:
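import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;

SAXParserFactory spf = SAXParserFactory.newInstance();
SAXParser sp = spf.newSAXParser();
XMLReader xr = sp.getXMLReader();
MyHandler handler = new MyHandler();
xr.setContentHandler(handler);
// The parser pulls data from the stream as it goes and calls the handler for each event
xr.parse(new InputSource(sourceUrl.openStream()));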
where the handler MyHandler is a class you define implementing org.xml.sax.ContentHandler (most easily by extending org.xml.sax.helpers.DefaultHandler) and sourceUrl is a java.net.URL for the URL.
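Of course, all of this has to be enclosed in a try-catch: newSAXParser() can throw ParserConfigurationException and SAXException, openStream() can throw IOException, and parse() can throw IOException and SAXException.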
Your handler can throw an exception signaling that it has reached the end of what you want to parse, and by catching this exception, your program can cleanly finish without reading the entire stream.
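As a rough sketch of that idea (not a definitive implementation), the handler below throws a marker exception once it has collected enough items, and the caller treats that exception as a normal stop. The names EarlyStopDemo, StopParsingException, FirstItemsHandler and CountingInputStream, as well as the <items>/<item> document layout, are assumptions made up for this illustration. An in-memory stream stands in for sourceUrl.openStream() so the example runs without a network; the counting wrapper is only there to show how much of the stream the parser actually consumed.

```java
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

class EarlyStopDemo {

    /** Hypothetical marker exception: thrown by the handler once it has enough items. */
    static class StopParsingException extends SAXException {
        StopParsingException() { super("enough items collected"); }
    }

    /** Collects the text of the first `limit` <item> elements (limit >= 1), then aborts. */
    static class FirstItemsHandler extends DefaultHandler {
        final List<String> items = new ArrayList<>();
        private final int limit;
        private StringBuilder text; // non-null only while inside an <item>

        FirstItemsHandler(int limit) { this.limit = limit; }

        @Override
        public void startElement(String uri, String localName, String qName, Attributes atts) {
            if ("item".equals(qName)) text = new StringBuilder();
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            // May be called several times per element, so append rather than assign
            if (text != null) text.append(ch, start, length);
        }

        @Override
        public void endElement(String uri, String localName, String qName) throws SAXException {
            if (!"item".equals(qName)) return;
            items.add(text.toString());
            text = null;
            if (items.size() >= limit) throw new StopParsingException();
        }
    }

    /** Counts the bytes the parser pulls from the underlying stream. */
    static class CountingInputStream extends FilterInputStream {
        long bytesRead = 0;
        CountingInputStream(InputStream in) { super(in); }

        @Override
        public int read() throws IOException {
            int b = super.read();
            if (b >= 0) bytesRead++;
            return b;
        }

        @Override
        public int read(byte[] buf, int off, int len) throws IOException {
            int n = super.read(buf, off, len);
            if (n > 0) bytesRead += n;
            return n;
        }
    }

    static List<String> firstItems(InputStream in, int limit)
            throws IOException, SAXException, ParserConfigurationException {
        XMLReader xr = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        FirstItemsHandler handler = new FirstItemsHandler(limit);
        xr.setContentHandler(handler);
        try {
            xr.parse(new InputSource(in));
        } catch (StopParsingException e) {
            // Expected: the handler asked to stop. Assumes the parser rethrows the
            // handler's SAXException unchanged, as the JDK's built-in parser does.
        } finally {
            in.close(); // stop reading; for a URL stream this ends the transfer
        }
        return handler.items;
    }

    public static void main(String[] args) throws Exception {
        StringBuilder xml = new StringBuilder("<items>");
        for (int i = 0; i < 50_000; i++) xml.append("<item>entry ").append(i).append("</item>");
        xml.append("</items>");
        byte[] data = xml.toString().getBytes(StandardCharsets.UTF_8);

        CountingInputStream in = new CountingInputStream(new ByteArrayInputStream(data));
        List<String> items = firstItems(in, 3);
        System.out.println(items + " after reading " + in.bytesRead + " of " + data.length + " bytes");
    }
}
```

The parser reads in buffered chunks, so it will typically have consumed somewhat more than the bytes of the first three items, but far less than the whole document. Over a real network connection the operating system and HTTP stack may also have buffered additional data before the stream is closed, so the exact saving depends on the environment.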
Source: https://stackoverflow.com/questions/9176082/sax-parser-and-a-file-from-the-nework