Suggestion to parse this XML in Java

长发绾君心 2021-01-25 11:00

I'm not new to Java, but I am relatively new to XML parsing. I know a tiny bit about a lot of the XML tools out there, but not much about any one of them, and I am no XML pro.

3 Answers
  • 2021-01-25 11:49

    If your XML documents are relatively small (as appears to be the case here), I would use the DOM API together with the XPath API (javax.xml.xpath). Here is some boilerplate DOM/XPath code from one of my tutorials:

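    import java.io.File;
    import javax.xml.parsers.DocumentBuilder;
    import javax.xml.parsers.DocumentBuilderFactory;
    import javax.xml.xpath.XPath;
    import javax.xml.xpath.XPathFactory;
    import org.w3c.dom.Document;

    // Parse the file into an in-memory DOM tree
    File xmlFile = ...
    DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
    DocumentBuilder db = dbf.newDocumentBuilder();
    Document doc = db.parse(xmlFile);

    XPath xp = XPathFactory.newInstance().newXPath();
    String value = xp.evaluate("/path/to/element/text()", doc);
    // .. reuse xp to get other values as required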
    

    In other words, basically you:

    • get your XML into a Document object, via a DocumentBuilder;

    • create an XPath object;

    • repeatedly call XPath.evaluate(), passing in the path of the element(s) required and your Document.

    As you see, there's a little bit of fiddliness in getting hold of your Document object and, like all good XML APIs, it throws a plethora of silly, pointless checked exceptions (ParserConfigurationException, SAXException, IOException, XPathExpressionException). But apart from that, it's a fairly no-nonsense way to parse small to medium XML documents whose structure is relatively fixed.
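    The three steps above can be sketched as a complete program. The sample document and its element names are assumptions made for illustration (the real file is not shown in the question), and the class name DomXPathSketch is hypothetical. It also shows how to ask for a node set when a path can match more than one element:

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;

class DomXPathSketch {

    // Hypothetical sample document; the real structure is an assumption.
    static final String SAMPLE =
        "<xml><item id=\"42\"><rating average=\"4.5\"/>"
        + "<aliases><alias>foo</alias><alias>bar</alias></aliases></item></xml>";

    // Step 1: get the XML into a Document via a DocumentBuilder.
    static Document parse(String xml) throws Exception {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        DocumentBuilder db = dbf.newDocumentBuilder();
        return db.parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
    }

    public static void main(String[] args) throws Exception {
        Document doc = parse(SAMPLE);

        // Step 2: create one XPath object and reuse it.
        XPath xp = XPathFactory.newInstance().newXPath();

        // Step 3: evaluate paths; the two-argument form returns a String.
        String id = xp.evaluate("/xml/item/@id", doc);
        String average = xp.evaluate("/xml/item/rating/@average", doc);
        System.out.println("id=" + id + ", average=" + average);

        // Request a NODESET when the path may match several elements.
        NodeList aliases = (NodeList) xp.evaluate(
            "/xml/item/aliases/alias", doc, XPathConstants.NODESET);
        for (int i = 0; i < aliases.getLength(); i++) {
            System.out.println("alias=" + aliases.item(i).getTextContent());
        }
    }
}
```

    For brevity the sketch declares `throws Exception`; real code would probably catch the individual checked exceptions where it can do something useful with them.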

  • 2021-01-25 11:51

    OK, so I settled on a solution that (to me) seemed to address my needs in the most reasonable way. My apologies to those who made the other suggestions, but I liked this route better because it keeps most of the parsing rules in annotations, and the little procedural code I had to write was minimal.

    I ended up going with JAXB. Initially I thought JAXB could create XML from a Java class or parse XML into a Java class only when an XSD was available. Then I discovered that JAXB has annotations that can parse XML into a Java class without an XSD.

    The XML file I'm working with is huge and very deep, but I only need bits and pieces of it here and there, and I was worried that keeping track of what maps to where would become very difficult in the future. So I chose to structure a tree of folders modeled after the XML: each folder maps to an element, and in each folder is a POJO representing that element.

    The problem is that sometimes an element has a child element several levels down with a single property I care about. It would be a pain to create 4 nested folders and a POJO for each just to get access to a single property. But that's how you do it with JAXB (at least, from what I can tell); once again I was in a corner.

    Then I stumbled on EclipseLink's JAXB implementation: MOXy. MOXy has an @XmlPath annotation that I could place in the parent POJO and use to navigate several levels down to a single property without creating all those folders and element POJOs. Nice.
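    One detail the answer does not spell out: MOXy extensions such as @XmlPath only take effect when MOXy is the active JAXB provider. A common way to select it, assuming the classic javax.xml.bind API, is a jaxb.properties file placed in the same package as the annotated classes. The package name com.example.xml and the Maven-style path below are hypothetical:

```properties
# src/main/resources/com/example/xml/jaxb.properties  (hypothetical location)
javax.xml.bind.context.factory=org.eclipse.persistence.jaxb.JAXBContextFactory
```

    Without this file (or an equivalent way of selecting the provider), the default JAXB implementation is used and the @XmlPath mappings will probably be ignored.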

    So I created something like this: (note: I chose to use getters for cases where I need to massage the value)

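    // JAXB 2.x package names; MOXy supplies @XmlPath
    import java.util.Date;
    import java.util.List;
    import javax.xml.bind.annotation.*;
    import org.eclipse.persistence.oxm.annotations.XmlPath;

    // maps to the root-"xml" element in the file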
    @XmlRootElement( name="xml" )
    @XmlAccessorType( XmlAccessType.FIELD )
    public class Xml {
    
        // this is standard JAXB
        @XmlElement
        private Item item;
        public Item getItem() {    
            return this.item;
        }
    
        ...
    }
    
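    // maps to the "<xml><item>"-element in the file
    // field access, so the annotated fields and their getters are not both bound
    @XmlAccessorType( XmlAccessType.FIELD )
    public class Item {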
    
        // standard JAXB; maps to "<xml><item id="...">"
        @XmlAttribute              
        private String id;
        public String getId() {
            return this.id;
        }
    
        // getting an attribute buried deep down
        // MOXY; maps to "<xml><item><rating average="...">"
        @XmlPath( "rating/@average" )    
        private Double averageRating;
        public Double getAverageRating() {
            return this.averageRating;
        }
    
        // getting a list buried deep down
        // MOXY; maps to "<xml><item><service><identification><aliases><alias.../><alias.../>"
        @XmlPath( "service/identification/aliases/alias/text()" )
        private List<String> aliases;
        public List<String> getAliases() {
            return this.aliases;
        }
    
        // using a getter to massage the value
        @XmlElement(name="dateforindex")
        private String dateForIndex;
        public Date getDateForIndex() {
            // logic to parse the string-value into a Date
        }
    
    }
    

    Also note that I keep the XML-mapped objects separate from the model objects I actually use in the app. A factory transforms these crude objects into the much more robust objects the rest of the app works with.
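    A minimal sketch of that separation, using plain Java so it runs without JAXB on the classpath. All names here (RawItem, ItemModel, ItemModelFactory) are hypothetical, and the yyyy-MM-dd date format is an assumption; the answer shows neither its factory nor the real format. RawItem stands in for the annotated Item class above:

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;

// XML-side object: holds values exactly as they appear in the file.
class RawItem {
    final String id;
    final String dateForIndex;

    RawItem(String id, String dateForIndex) {
        this.id = id;
        this.dateForIndex = dateForIndex;
    }
}

// App-side model object: exposes typed values only.
class ItemModel {
    private final String id;
    private final Date dateForIndex;

    ItemModel(String id, Date dateForIndex) {
        this.id = id;
        this.dateForIndex = dateForIndex;
    }

    String getId() {
        return id;
    }

    Date getDateForIndex() {
        // Defensive copy, because java.util.Date is mutable.
        return new Date(dateForIndex.getTime());
    }
}

// Factory that turns the crude XML-side object into the model object.
class ItemModelFactory {
    // ASSUMPTION: dates look like 2021-01-25; adjust to the real format.
    private static final String DATE_PATTERN = "yyyy-MM-dd";

    static ItemModel fromRaw(RawItem raw) throws ParseException {
        // SimpleDateFormat is not thread-safe, so create one per call.
        SimpleDateFormat fmt = new SimpleDateFormat(DATE_PATTERN);
        fmt.setLenient(false); // reject impossible dates such as 2021-02-30
        return new ItemModel(raw.id, fmt.parse(raw.dateForIndex));
    }
}
```

    Doing the conversion in the factory keeps the mapped class a dumb mirror of the file, so a change in the XML layout only touches the annotations and the factory, not the rest of the app.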

  • 2021-01-25 11:57

    You can use a SAX parser or a StAX parser. If you can afford some more memory, you can also use a DOM parser. I would advise that StAX would be the best fit for you.
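    For comparison, a minimal StAX sketch using the JDK's javax.xml.stream API. It pulls events one at a time, so the whole document never has to sit in memory. The element name alias and the sample XML are assumptions borrowed from the JAXB answer above, and the class name StaxSketch is hypothetical:

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

class StaxSketch {

    // Collects the text of every <alias> element and skips everything else.
    static List<String> readAliases(String xml) throws XMLStreamException {
        List<String> aliases = new ArrayList<>();
        XMLStreamReader reader = XMLInputFactory.newInstance()
                .createXMLStreamReader(new StringReader(xml));
        try {
            while (reader.hasNext()) {
                int event = reader.next();
                if (event == XMLStreamConstants.START_ELEMENT
                        && "alias".equals(reader.getLocalName())) {
                    // getElementText() reads up to the matching end tag;
                    // it only works for elements that contain text alone.
                    aliases.add(reader.getElementText());
                }
            }
        } finally {
            reader.close();
        }
        return aliases;
    }

    public static void main(String[] args) throws XMLStreamException {
        // Hypothetical sample input.
        String sample = "<xml><item id=\"42\"><aliases>"
                + "<alias>foo</alias><alias>bar</alias></aliases></item></xml>";
        System.out.println(readAliases(sample));
    }
}
```

    The trade-off against DOM/XPath is that the code has to track where it is in the document itself, which gets tedious when the values of interest are scattered across a deep tree.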
