There's an XML stream which I need to parse. Since I only need to do it once and build my Java objects, SAX looks like the natural choice. I'm extending DefaultHandler and
I do something very similar, but instead of having boolean flags to tell me what state I'm in, I test for player or team being non-null. Makes things a bit neater. This requires you to set them to null when you detect the end of each element, after you've added it to the relevant list.
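To illustrate the null-check idea, here is a minimal sketch. The XML layout (team elements containing a name and player elements, each player with its own name) and the Team and Player classes are assumptions made for the example, since the original document structure isn't shown.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical model classes, reduced to what the sketch needs.
class Player {
    String name;
}

class Team {
    String name;
    final List<Player> players = new ArrayList<>();
}

class NullStateHandler extends DefaultHandler {
    final List<Team> teams = new ArrayList<>();
    private final StringBuilder text = new StringBuilder();
    private Team team;      // non-null only while inside <team>
    private Player player;  // non-null only while inside <player>

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        text.setLength(0); // start collecting text for the new element
        if (qName.equals("team")) {
            team = new Team();
        } else if (qName.equals("player")) {
            player = new Player();
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // May be called several times per element, so accumulate.
        text.append(ch, start, length);
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if (qName.equals("name")) {
            // The innermost open object owns this <name>.
            if (player != null) {
                player.name = text.toString().trim();
            } else if (team != null) {
                team.name = text.toString().trim();
            }
        } else if (qName.equals("player")) {
            team.players.add(player);
            player = null; // reset state once the object is stored
        } else if (qName.equals("team")) {
            teams.add(team);
            team = null;
        }
    }

    // Convenience entry point for the sketch: parse an XML string.
    static List<Team> parse(String xml) throws Exception {
        NullStateHandler handler = new NullStateHandler();
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return handler.teams;
    }
}
```

The object references double as the state machine: whichever of player or team is non-null tells you where you are, so no separate boolean flags need to be kept in sync.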
If you need prettier code, please use StAX. This comparison of all XML parsing APIs suggests that StAX is a much better option.
StAX also performs better than any other API implementation in most tests.
So I personally don't see any reason to go on with SAX unless you're doing some legacy-related programming.
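For comparison, a rough StAX sketch using the standard javax.xml.stream cursor API. The element names (team, name, player) and the StaxTeamReader class are assumptions for the example. Because your code pulls events instead of receiving callbacks, each element can be read by an ordinary method and no handler state is needed.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

class StaxTeamReader {

    // Collects the name of every <team> element in the document.
    static List<String> readTeamNames(String xml) throws XMLStreamException {
        XMLStreamReader reader = XMLInputFactory.newInstance()
                .createXMLStreamReader(new StringReader(xml));
        List<String> names = new ArrayList<>();
        try {
            while (reader.hasNext()) {
                if (reader.next() == XMLStreamConstants.START_ELEMENT
                        && reader.getLocalName().equals("team")) {
                    names.add(readTeamName(reader));
                }
            }
        } finally {
            reader.close();
        }
        return names;
    }

    // Expects the reader to be positioned on <team>; consumes through </team>.
    private static String readTeamName(XMLStreamReader reader) throws XMLStreamException {
        String name = null;
        int depth = 1; // 1 = directly inside <team>
        while (depth > 0) {
            int event = reader.next();
            if (event == XMLStreamConstants.START_ELEMENT) {
                depth++;
                // Only a direct child <name> belongs to the team itself.
                if (depth == 2 && reader.getLocalName().equals("name")) {
                    name = reader.getElementText(); // leaves the reader on </name>
                    depth--;
                }
            } else if (event == XMLStreamConstants.END_ELEMENT) {
                depth--;
            }
        }
        return name;
    }
}
```

The depth counter is only there so that a name nested inside a player is not mistaken for the team's name.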
There is one neat trick when writing a SAX parser: you are allowed to change the
ContentHandler of an XMLReader while parsing. This lets you separate the
parsing logic for different elements into multiple classes, which makes the
parsing more modular and reusable. When a handler sees its end element, it
switches back to its parent. How many handlers you implement is up to
you. The code would look like this:
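import java.util.LinkedList;
import java.util.List;
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

public class RootHandler extends DefaultHandler {
    private XMLReader reader;
    private List<Team> teams;
    public RootHandler(XMLReader reader) {
        this.reader = reader;
        this.teams = new LinkedList<Team>();
    }

    // Called by TeamHandler once a complete team element has been read
    public void addTeam(Team team) {
        teams.add(team);
    }

    public void startElement(String uri, String localName, String name, Attributes attributes) throws SAXException {
        if (name.equals("team")) {
            // Switch handler to parse the team element
            reader.setContentHandler(new TeamHandler(reader, this));
        }
    }
}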
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

public class TeamHandler extends DefaultHandler {
    private XMLReader reader;
    private RootHandler parent;
    private Team team;
    private StringBuilder content;

    public TeamHandler(XMLReader reader, RootHandler parent) {
        this.reader = reader;
        this.parent = parent;
        this.content = new StringBuilder();
        this.team = new Team();
    }

    // characters can be called multiple times per element so aggregate the content in a StringBuilder
    public void characters(char[] ch, int start, int length) throws SAXException {
        content.append(ch, start, length);
    }

    // A child element starts: discard any text collected so far
    public void startElement(String uri, String localName, String name, Attributes attributes) throws SAXException {
        content.setLength(0);
    }

    public void endElement(String uri, String localName, String name) throws SAXException {
        if (name.equals("name")) {
            team.setName(content.toString());
        } else if (name.equals("team")) {
            parent.addTeam(team);
            // Switch handler back to our parent
            reader.setContentHandler(parent);
        }
    }
}
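For completeness, here is a compact, self-contained variant of the same handler-switching idea, including the wiring that installs the first handler and starts the parse. The names HandlerSwitchDemo, Root and TeamChild are made up for this sketch, and it collects plain team-name strings instead of Team objects to stay short. It assumes the JDK's default, non-namespace-aware parser, so the qName argument carries the element name.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

class HandlerSwitchDemo {

    // Parent handler: delegates each <team> element to a child handler.
    static class Root extends DefaultHandler {
        final List<String> teamNames = new ArrayList<>();
        private final XMLReader reader;

        Root(XMLReader reader) {
            this.reader = reader;
        }

        @Override
        public void startElement(String uri, String localName, String qName, Attributes atts) {
            if (qName.equals("team")) {
                // From now on, events go to the child handler.
                reader.setContentHandler(new TeamChild(reader, this));
            }
        }
    }

    // Child handler: reads one <team>, then hands control back to the parent.
    static class TeamChild extends DefaultHandler {
        private final XMLReader reader;
        private final Root parent;
        private final StringBuilder content = new StringBuilder();
        private String name;

        TeamChild(XMLReader reader, Root parent) {
            this.reader = reader;
            this.parent = parent;
        }

        @Override
        public void startElement(String uri, String localName, String qName, Attributes atts) {
            content.setLength(0);
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            content.append(ch, start, length);
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            if (qName.equals("name")) {
                name = content.toString();
            } else if (qName.equals("team")) {
                parent.teamNames.add(name);
                reader.setContentHandler(parent); // switch back to the parent
            }
        }
    }

    static List<String> parse(String xml) throws Exception {
        XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        Root root = new Root(reader);
        reader.setContentHandler(root); // install the top-level handler
        reader.parse(new InputSource(new StringReader(xml)));
        return root.teamNames;
    }
}
```

Note that the parse has to be driven through the XMLReader itself rather than through SAXParser.parse(source, handler), because the handlers need a reference to the reader in order to swap themselves in and out.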
I strongly recommend that you stop parsing by hand and grab a good XML data-binding library. XStream (http://x-stream.github.io/) is my personal favorite, but there are many different libraries. It may even be able to map the XML onto your POJOs on the spot, without any configuration required (if you use property names and pluralisation to match the XML structure).
It's difficult to advise without knowing more about your requirements, but the fact that you are surprised that "my code got quite complex" suggests that you were not well informed when you chose SAX. SAX is a low-level programming interface capable of very high performance, but that's because the parser is doing far less work for you, and you therefore need to do a lot more work yourself.