Use of text() function when using xPath in dom4j

Posted by 故事扮演 on 2019-12-11 19:53:10

Question


I have inherited an application that parses XML using dom4j and XPath.

The XML being parsed is similar to the following:

<cache>
  <content>
    <transaction>
      <page>
        <widget name="PAGE_ID">WRK_REGISTRATION</widget>
        <widget name="TRANS_DETAIL_ID">77145</widget>
        <widget name="GRD_ERRORS" />
      </page>
      <page>
        <widget name="PAGE_ID">WRK_REGISTRATION</widget>
        <widget name="TRANS_DETAIL_ID">77147</widget>
        <widget name="GRD_ERRORS" />
      </page>
      <page>
        <widget name="PAGE_ID">WRK_PROCESSING</widget>
        <widget name="TRANS_DETAIL_ID">77152</widget>
        <widget name="GRD_ERRORS" />
      </page>
    </transaction>
  </content>
</cache>

Individual nodes are looked up using the following:

String xPathToGridErrorNode = "//cache/content/transaction/page/widget[@name='PAGE_ID'][text()='WRK_DNA_REGISTRATION']/../widget[@name='TRANS_DETAIL_ID'][text()='77147']/../widget[@name='GRD_ERRORS_TEMP']";

org.dom4j.Element root = null;

SAXReader reader = new SAXReader();
Document document = reader.read(new BufferedInputStream(new ByteArrayInputStream(xmlToParse.getBytes())));
root = document.getRootElement();

Node gridNode = root.selectSingleNode(xPathToGridErrorNode);

where xmlToParse is a String of XML similar to the excerpt provided above.

The code is trying to obtain the GRD_ERRORS node for the page with the PAGE_ID and TRANS_DETAIL_ID given in the XPath.

I am seeing an intermittent failure (roughly 1-2% of calls) where selectSingleNode returns null even though the requested node is present in the XML being searched.

I know there are some gotchas associated with using text()= in XPath and was wondering whether there is a better way to write the XPath string for this type of search.
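
For illustration, one commonly cited gotcha is that text()='77147' compares each text-node child verbatim, so whitespace inside the element defeats the match, whereas normalize-space(.) trims the element's string value first. The sketch below shows the difference using the JDK's built-in javax.xml.xpath engine rather than dom4j, on the assumption that both follow XPath 1.0 semantics here. The padded sample and the class name TextPredicateDemo are hypothetical, and nothing in the question confirms that the real XML contains such whitespace.

```java
import java.io.StringReader;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class TextPredicateDemo {
    // Same widget twice: once compact, once with whitespace around the value (hypothetical input).
    static final String COMPACT = "<page><widget name=\"TRANS_DETAIL_ID\">77147</widget></page>";
    static final String PADDED = "<page><widget name=\"TRANS_DETAIL_ID\">\n  77147\n</widget></page>";

    // Verbatim comparison against each text-node child.
    static final String TEXT_EQUALS = "//widget[@name='TRANS_DETAIL_ID'][text()='77147']";
    // Comparison against the element's whitespace-normalized string value.
    static final String NORMALIZED = "//widget[@name='TRANS_DETAIL_ID'][normalize-space(.)='77147']";

    // Counts the nodes that an XPath 1.0 expression selects in the given XML string.
    static int countMatches(String xml, String xpath) {
        try {
            NodeList hits = (NodeList) XPathFactory.newInstance().newXPath()
                .evaluate(xpath, new InputSource(new StringReader(xml)), XPathConstants.NODESET);
            return hits.getLength();
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("Could not evaluate: " + xpath, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(countMatches(COMPACT, TEXT_EQUALS)); // 1
        System.out.println(countMatches(PADDED, TEXT_EQUALS));  // 0, the text node is "\n  77147\n"
        System.out.println(countMatches(PADDED, NORMALIZED));   // 1
    }
}
```

If the incoming XML is sometimes pretty-printed or otherwise padded, a normalize-space(.) comparison would tolerate that; if it never is, this particular gotcha does not apply.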


Answer 1:


Your snippets do not agree with each other: the XML uses GRD_ERRORS and WRK_REGISTRATION, while the XPath looks for GRD_ERRORS_TEMP and WRK_DNA_REGISTRATION. As posted, the expression cannot match the sample.
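
The mismatch can be checked with a small sketch. It uses the JDK's built-in javax.xml.xpath engine instead of dom4j (an assumption that the two agree on these XPath 1.0 expressions), a one-page cut of the sample XML, and a hypothetical class name NameMismatchDemo.

```java
import java.io.StringReader;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class NameMismatchDemo {
    // One page taken from the sample XML in the question.
    static final String XML =
        "<cache><content><transaction><page>"
        + "<widget name=\"PAGE_ID\">WRK_REGISTRATION</widget>"
        + "<widget name=\"TRANS_DETAIL_ID\">77147</widget>"
        + "<widget name=\"GRD_ERRORS\" />"
        + "</page></transaction></content></cache>";

    // The XPath exactly as posted in the question.
    static final String AS_POSTED =
        "//cache/content/transaction/page/widget[@name='PAGE_ID'][text()='WRK_DNA_REGISTRATION']"
        + "/../widget[@name='TRANS_DETAIL_ID'][text()='77147']"
        + "/../widget[@name='GRD_ERRORS_TEMP']";

    // Same structure, with the names changed to those used in the sample XML.
    static final String NAMES_ALIGNED =
        "//cache/content/transaction/page/widget[@name='PAGE_ID'][text()='WRK_REGISTRATION']"
        + "/../widget[@name='TRANS_DETAIL_ID'][text()='77147']"
        + "/../widget[@name='GRD_ERRORS']";

    // Counts the nodes that an XPath 1.0 expression selects in XML.
    static int countMatches(String xpath) {
        try {
            NodeList hits = (NodeList) XPathFactory.newInstance().newXPath()
                .evaluate(xpath, new InputSource(new StringReader(XML)), XPathConstants.NODESET);
            return hits.getLength();
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("Could not evaluate: " + xpath, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(countMatches(AS_POSTED));     // 0
        System.out.println(countMatches(NAMES_ALIGNED)); // 1
    }
}
```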

Ignoring that, I would suggest rewriting

//cache/content/transaction/page
  /widget[@name='PAGE_ID'][text()='WRK_DNA_REGISTRATION']
  /../widget[@name='TRANS_DETAIL_ID'][text()='77147']
  /../widget[@name='GRD_ERRORS_TEMP']

as

//cache/content/transaction/page
  [widget[@name='PAGE_ID'][text()='WRK_REGISTRATION']]
  [widget[@name='TRANS_DETAIL_ID'][text()='77147']]
  /widget[@name='GRD_ERRORS']

This is, to my eye, easier to read, and it expresses what you seem to mean more clearly: "the page element that has children meeting these conditions, and then take the widget with this @name." Or, if that is closer to how you think about it,

//cache/content/transaction/page/widget[@name='GRD_ERRORS']
  [preceding-sibling::widget[@name='PAGE_ID'][text()='WRK_REGISTRATION']]
  [preceding-sibling::widget[@name='TRANS_DETAIL_ID'][text()='77147']]
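
A short sketch comparing the two rewritten forms follows. It assumes the JDK's javax.xml.xpath engine as a stand-in for dom4j, and the class name EquivalentXPathDemo and the gridFirst switch are hypothetical. On XML laid out like the sample, both forms select the same single node. They differ when GRD_ERRORS comes before the other widgets, because preceding-sibling only looks at earlier siblings.

```java
import java.io.StringReader;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class EquivalentXPathDemo {
    // Form 1: predicates on the page element, then step down to the widget.
    static final String PREDICATE_ON_PAGE =
        "//cache/content/transaction/page"
        + "[widget[@name='PAGE_ID'][text()='WRK_REGISTRATION']]"
        + "[widget[@name='TRANS_DETAIL_ID'][text()='77147']]"
        + "/widget[@name='GRD_ERRORS']";

    // Form 2: start at the widget and test its earlier siblings.
    static final String PRECEDING_SIBLING =
        "//cache/content/transaction/page/widget[@name='GRD_ERRORS']"
        + "[preceding-sibling::widget[@name='PAGE_ID'][text()='WRK_REGISTRATION']]"
        + "[preceding-sibling::widget[@name='TRANS_DETAIL_ID'][text()='77147']]";

    // Builds one page; gridFirst puts GRD_ERRORS ahead of the two id widgets.
    static String page(String detailId, boolean gridFirst) {
        String grid = "<widget name=\"GRD_ERRORS\" />";
        String ids = "<widget name=\"PAGE_ID\">WRK_REGISTRATION</widget>"
            + "<widget name=\"TRANS_DETAIL_ID\">" + detailId + "</widget>";
        return "<page>" + (gridFirst ? grid + ids : ids + grid) + "</page>";
    }

    // Builds a two-page document shaped like the sample in the question.
    static String cache(boolean gridFirst) {
        return "<cache><content><transaction>"
            + page("77145", gridFirst) + page("77147", gridFirst)
            + "</transaction></content></cache>";
    }

    // Counts the nodes that an XPath 1.0 expression selects in the given XML string.
    static int countMatches(String xml, String xpath) {
        try {
            NodeList hits = (NodeList) XPathFactory.newInstance().newXPath()
                .evaluate(xpath, new InputSource(new StringReader(xml)), XPathConstants.NODESET);
            return hits.getLength();
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("Could not evaluate: " + xpath, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(countMatches(cache(false), PREDICATE_ON_PAGE)); // 1
        System.out.println(countMatches(cache(false), PRECEDING_SIBLING)); // 1
        System.out.println(countMatches(cache(true), PREDICATE_ON_PAGE));  // 1
        System.out.println(countMatches(cache(true), PRECEDING_SIBLING));  // 0
    }
}
```

If the widget order inside a page is not guaranteed, the form with predicates on page is the safer of the two.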


Source: https://stackoverflow.com/questions/9914486/use-of-text-function-when-using-xpath-in-dom4j
