Parsing a multilevel XML file using Java (DOM parser)

做~自己de王妃 posted on 2019-12-08 09:39:43

Question


Here is an example of my XML file:

    <?xml version="1.0"?>
    <?xml-stylesheet type="text/xsl" href="xslt/options.xsl"?>
    <options>
      <version>0001</version>
      <title>ConfigData</title>
      <category>
        <name>GConfigData</name>
        <option>
          <name>String_name</name>
          <value>350.16.01a</value>
          <control>
            <type>TextBox2</type>
            <caption> String Name</caption>
            <left>0</left>
            <top>0</top>
            <width>2600</width>
            <height>900</height>
            <font>Courier</font>
            <scroll_bar>0</scroll_bar>
          </control>
        </option>
        <option>
          <name>FileID</name>
          <value>1601</value>
          <control>
            <type>TextBox2</type>
            <caption>file version</caption>
            <left>0</left>
            <top>900</top>
            <width>2600</width>
            <height>900</height>
            <font>Courier</font>
            <scroll_bar>0</scroll_bar>
          </control>
        </option>
        <option>
          <name>systemID</name>
          <value>0</value>
          <control>
            <type>TextBox2</type>
            <caption>System ID</caption>
            <left>0</left>
            <top>1800</top>
            <width>2400</width>
            <height>900</height>
            <font>Courier</font>
            <scroll_bar>0</scroll_bar>
          </control>
        </option>
        <option>
          <name>SyncTime</name>
          <value>2</value>
          <control>
            <type>TextBox2</type>
            <caption>Sync Time</caption>
            <left>0</left>
            <top>2700</top>
            <width>2400</width>
            <height>900</height>
            <font>Courier</font>
            <scroll_bar>0</scroll_bar>
          </control>
        </option>
        <option>
          <name>UseServer</name>
          <value>0</value>
          <control>
            <type>TextBox2</type>
            <caption>Use Server</caption>
            <left>0</left>
            <top>3600</top>
            <width>2400</width>
            <height>900</height>
            <font>Courier</font>
            <scroll_bar>0</scroll_bar>
          </control>
        </option>
        <option>
          <name>CommType</name>
          <value>0</value>
          <control>
            <type>FixedList</type>
            <caption>Comm Type</caption>
            <left>0</left>
            <top>4500</top>
            <width>2400</width>
            <height>900</height>
            <list>
              <item>
                <text>Parellel</text>
                <value>0</value>
              </item>
              <item>
                <text>Simple Serial</text>
                <value>1</value>
              </item>
              <item>
                <text>Complex Serial</text>
                <value>2</value>
              </item>
            </list>
          </control>
        </option>
        <option>
          <name>YYBasis</name>
          <value>70</value>
          <control>
            <type>TextBox2</type>
            <caption>Set YY Basis</caption>
            <left>0</left>
            <top>5400</top>
            <width>2400</width>
            <height>900</height>
            <font>Courier</font>
            <scroll_bar>0</scroll_bar>
          </control>
        </option>
        <option>
          <name>Separator</name>
          <value>46</value>
          <control>
            <type>TextBox2</type>
            <caption>Separator</caption>
            <left>0</left>
            <top>6300</top>
            <width>2400</width>
            <height>900</height>
            <font>Courier</font>
            <scroll_bar>0</scroll_bar>
          </control>
        </option>
        <option>
          <name>WholeSeparator</name>
          <value>44</value>
          <control>
            <type>TextBox2</type>
            <caption>Whole Separator</caption>
            <left>0</left>
            <top>7200</top>
            <width>2400</width>
            <height>900</height>
            <font>Courier</font>
            <scroll_bar>0</scroll_bar>
          </control>
        </option>
        <option>
          <name>DateFormat</name>
          <value>0</value>
          <control>
            <type>FixedList</type>
            <caption>Date Format</caption>
            <left>2600</left>
            <top>0</top>
            <width>2400</width>
            <height>900</height>
            <list>
              <item>
                <text>MM/DD/YY</text>
                <value>0</value>
              </item>
              <item>
                <text>MM/DD/YYYY</text>
                <value>1</value>
              </item>
              <item>
                <text>DD/MM/YY</text>
                <value>2</value>
              </item>
              <item>
                <text>DD/MM/YYYY</text>
                <value>3</value>
              </item>
              <item>
                <text>YY/MM/DD</text>
                <value>4</value>
              </item>
              <item>
                <text>MM.DD.YY</text>
                <value>6</value>
              </item>
              <item>
                <text>MM.DD.YYYY</text>
                <value>7</value>
              </item>
              <item>
                <text>DD.MM.YY</text>
                <value>8</value>
              </item>
              <item>
                <text>DD.MM.YYYY</text>
                <value>9</value>
              </item>
              <item>
                <text>YY.MM.DD</text>
                <value>10</value>
              </item>
              <item>
                <text>YYYY.MM.DD</text>
                <value>11</value>
              </item>
            </list>
          </control>
        </option>
      </category>
    </options>

I wrote Java code to parse the name, caption, and value of each option. Here is the code:

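import java.io.File;
import java.io.IOException;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.SAXException;

public class XMLParsingSingleFileFinal {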



    public static void main(String[] args) throws ParserConfigurationException, SAXException, IOException
       {
          //Get Document Builder
          DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
          DocumentBuilder builder = factory.newDocumentBuilder();

          //Build Document
          Document document = builder.parse(new File("options.xml"));

          //Normalize the XML Structure; It's just too important !!
          document.getDocumentElement().normalize();
          XPath xPath =  XPathFactory.newInstance().newXPath();

          //Here comes the root node
          Element root = document.getDocumentElement();
          System.out.println(root.getNodeName());

          //Get all options
          NodeList nList = document.getElementsByTagName("options");
          System.out.println("Total Options = " + nList.getLength());
          System.out.println("TITLE = " + document.getElementsByTagName("title").item(0).getTextContent());
          System.out.println("VERSION = " + document.getElementsByTagName("version").item(0).getTextContent());

          System.out.println("===================================");

          //Get all category
          NodeList nList1 = document.getElementsByTagName("category");
          System.out.println("Total Category inside options = " + nList1.getLength());
          //int count1 = nList1.getLength();


          for (int temp = 0; temp < nList1.getLength(); temp++)
          {
             Node node = nList1.item(temp);
             if (node.getNodeType() == Node.ELEMENT_NODE)
             {
                 Element mElement = (Element) node;
                 System.out.println("\nCategory Name = " + mElement.getElementsByTagName("name").item(0).getTextContent());
                 NodeList nList2 = mElement.getElementsByTagName("option");
                 System.out.println("option inside category = " + nList2.getLength());
                 System.out.println("\n\t");
                // int count = nList2.getLength();


                 for (int temp1 = 0; temp1 < nList2.getLength()/2; temp1++) 
                {

                    Node nNode = nList2.item(temp1);
                    if (nNode.getNodeType() == Node.ELEMENT_NODE)
                    {

                    Element nElement = (Element) nNode;

                 System.out.println("\tOption Name = " + mElement.getElementsByTagName("name").item(temp1+1).getTextContent());
                 System.out.println("\t\tCaption Name = " + mElement.getElementsByTagName("caption").item(temp1).getTextContent());

                 System.out.println("\t\tValue = " + mElement.getElementsByTagName("value").item(temp1).getTextContent());



                 System.out.println("\n\t");

            }

              }  
                 System.out.println("\n\t");
             }   
          }   
       }
}

My main aim is to parse the "value" of each "option" node.

As you can see, the option "CommType" contains "item" elements, and each of them also has a child node named "value".

While parsing, the output is correct up to the option "CommType". From the next option onward, the parser picks up the "value" of the "item" child nodes of the previous option instead.

Example (parse result):

options
Total Options = 1
TITLE = ConfigData
VERSION = 0001
===================================
Total Category inside options = 23

Category Name = GConfigData
option inside category = 38


    Option Name = String_name
        Caption Name = String Name
        Value = 350.16.01a


    Option Name = FileID
        Caption Name =  file version
        Value = 1601


    Option Name = SystemID
        Caption Name = System ID
        Value = 0


    Option Name = SyncTime
        Caption Name = Sync Time
        Value = 2


    Option Name = UseServer
        Caption Name = Use Server
        Value = 0


    Option Name = CommType
        Caption Name = Comm Type
        Value = 0


    Option Name = YYBasis
        Caption Name = Set YY Basis
        Value = 0        /* Should be 70 as in the XML file, but it takes the value of option(Name:CommType)/control/list/item(text:Parellel)/value */


    Option Name = Separator
        Caption Name =  Separator
        Value = 1       /* Should be 46 as in the XML file, but it takes the value of option(Name:CommType)/control/list/item(text:Simple Serial)/value */


    Option Name =WholeSeparator
        Caption Name = Whole Separator
        Value = 2     /* Should be 44 as in the XML file, but it takes the value of option(Name:CommType)/control/list/item(text:Complex Serial)/value */


    Option Name = DateFormat
        Caption Name = Date Format
        Value = 70    /* Should be 0 */

After the option named CommType, the value of every option is parsed incorrectly.

What could be the solution to this? I am new to Java as well as XML.

PS: This is my first question on this forum. I apologize for any spelling mistakes and if the question is badly phrased. Please help in any way you can.


Answer 1:


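Do not address nodes by index or offset (a hardcoding anti-pattern); it makes your code brittle. Select each option node and read its children relative to that node instead, for example with dom4j and XPath: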

SAXReader reader = new SAXReader();
Document document = reader.read(file);
List<Node> nodes = document.selectNodes("/options/category/option");

for (Node node : nodes) {
    System.out.println("caption: " + node.selectSingleNode("control/caption").getText());
    System.out.println("value : " + node.selectSingleNode("value").getText());
}

Example output (truncated):

caption:  String Name
value : 350.16.01a
caption: file version
value : 1601
caption: System ID
value : 0

Required dependencies:

<dependency>
    <groupId>jaxen</groupId>
    <artifactId>jaxen</artifactId>
    <version>1.1.6</version>
</dependency>

<dependency>
    <groupId>dom4j</groupId>
    <artifactId>dom4j</artifactId>
    <version>1.6.1</version>
</dependency>
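
If adding dom4j and jaxen is not an option, the same relative-path idea should also work with the javax.xml.xpath package that ships with the JDK. The following is only a minimal sketch under a few assumptions: the class name OptionXPathReader and the method readOptions are made up for illustration, and the embedded XML is a shortened stand-in for the file in the question.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class; the names are illustrative only.
class OptionXPathReader {

    // Returns one "name | caption | value" row per <option> element.
    static List<String> readOptions(String xml) {
        XPath xpath = XPathFactory.newInstance().newXPath();
        List<String> rows = new ArrayList<>();
        try {
            NodeList options = (NodeList) xpath.evaluate(
                    "/options/category/option",
                    new InputSource(new StringReader(xml)),
                    XPathConstants.NODESET);
            for (int i = 0; i < options.getLength(); i++) {
                Node option = options.item(i);
                // Relative paths are resolved against the current <option>,
                // so the nested list/item/value elements are not matched.
                String name = xpath.evaluate("name", option).trim();
                String caption = xpath.evaluate("control/caption", option).trim();
                String value = xpath.evaluate("value", option).trim();
                rows.add(name + " | " + caption + " | " + value);
            }
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("Could not evaluate XPath", e);
        }
        return rows;
    }

    public static void main(String[] args) {
        // Shortened sample: one option with nested item values, one without.
        String xml = "<options><category><name>GConfigData</name>"
                + "<option><name>CommType</name><value>0</value>"
                + "<control><caption>Comm Type</caption><list>"
                + "<item><text>Simple Serial</text><value>1</value></item>"
                + "</list></control></option>"
                + "<option><name>YYBasis</name><value>70</value>"
                + "<control><caption>Set YY Basis</caption></control></option>"
                + "</category></options>";
        readOptions(xml).forEach(System.out::println);
    }
}
```

With this sample, YYBasis is reported with its own value 70 even though the preceding option contains an item with a value element of its own.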



Answer 2:


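The method getElementsByTagName() returns every matching descendant of the element it is called on, not only its direct children. Because you always search from the category node instead of from the current option (or item) node, the "value" list also contains the values of the nested item elements, so its indexes no longer line up with the options and you get unexpected results.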
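
One way to apply this with the plain DOM API is to walk only the direct children of each option element. The sketch below is an illustration under stated assumptions, not the asker's final code: the class OptionDomReader and the helpers directChild and text are hypothetical names, and the XML string is a shortened stand-in for options.xml.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical helper class; the names are illustrative only.
class OptionDomReader {

    // Returns the first direct child element with the given tag, or null.
    static Element directChild(Element parent, String tag) {
        for (Node n = parent.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n.getNodeType() == Node.ELEMENT_NODE && tag.equals(n.getNodeName())) {
                return (Element) n;
            }
        }
        return null;
    }

    // Returns the trimmed text of an element, or "" if it is missing.
    static String text(Element e) {
        return e == null ? "" : e.getTextContent().trim();
    }

    // Returns one "name | caption | value" row per <option> element.
    static List<String> readOptions(String xml) {
        Document doc;
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            doc = builder.parse(new InputSource(new StringReader(xml)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
        List<String> rows = new ArrayList<>();
        NodeList options = doc.getElementsByTagName("option");
        for (int i = 0; i < options.getLength(); i++) {
            Element option = (Element) options.item(i);
            // Search relative to this option, and only among direct children,
            // so list/item/value is never mistaken for the option's value.
            Element control = directChild(option, "control");
            String caption = control == null ? "" : text(directChild(control, "caption"));
            rows.add(text(directChild(option, "name")) + " | " + caption
                    + " | " + text(directChild(option, "value")));
        }
        return rows;
    }

    public static void main(String[] args) {
        String xml = "<options><category><name>GConfigData</name>"
                + "<option><name>CommType</name><value>0</value>"
                + "<control><caption>Comm Type</caption><list>"
                + "<item><text>Simple Serial</text><value>1</value></item>"
                + "</list></control></option>"
                + "<option><name>YYBasis</name><value>70</value>"
                + "<control><caption>Set YY Basis</caption></control></option>"
                + "</category></options>";
        readOptions(xml).forEach(System.out::println);
    }
}
```

Using getElementsByTagName("option") to find the options is still fine here because option elements are not nested inside each other in this file; only the lookups for name, caption, and value need to be restricted to the current option.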



Source: https://stackoverflow.com/questions/36003863/parsing-the-multilevel-xml-file-using-java-dom-parser
