Question
I have inherited an application that parses XML using dom4j and XPath.
The XML being parsed is similar to the following:
<cache>
<content>
<transaction>
<page>
<widget name="PAGE_ID">WRK_REGISTRATION</widget>
<widget name="TRANS_DETAIL_ID">77145</widget>
<widget name="GRD_ERRORS" />
</page>
<page>
<widget name="PAGE_ID">WRK_REGISTRATION</widget>
<widget name="TRANS_DETAIL_ID">77147</widget>
<widget name="GRD_ERRORS" />
</page>
<page>
<widget name="PAGE_ID">WRK_PROCESSING</widget>
<widget name="TRANS_DETAIL_ID">77152</widget>
<widget name="GRD_ERRORS" />
</page>
</transaction>
</content>
</cache>
Individual nodes are looked up using the following:
String xPathToGridErrorNode = "//cache/content/transaction/page/widget[@name='PAGE_ID'][text()='WRK_DNA_REGISTRATION']/../widget[@name='TRANS_DETAIL_ID'][text()='77147']/../widget[@name='GRD_ERRORS_TEMP']";
org.dom4j.Element root = null;
SAXReader reader = new SAXReader();
Document document = reader.read(new BufferedInputStream(new ByteArrayInputStream(xmlToParse.getBytes())));
root = document.getRootElement();
Node gridNode = root.selectSingleNode(xPathToGridErrorNode);
where xmlToParse is a String of XML similar to the excerpt provided above.
The code is trying to obtain the GRD_ERRORS node for the page with the PAGE_ID and TRANS_DETAIL_ID given in the XPath.
I am seeing an intermittent failure (roughly 1-2% of requests) in which selectSingleNode returns null even though the requested node is present in the XML being searched.
I know there are some gotchas associated with using text()= in XPath, and I was wondering whether there is a better way to write the XPath string for this type of search.
Answer 1:
From your snippets, the XPath and the XML do not match: the XPath asks for GRD_ERRORS_TEMP where the XML has GRD_ERRORS, and for WRK_DNA_REGISTRATION where the XML has WRK_REGISTRATION.
Ignoring that, I would suggest rewriting
//cache/content/transaction/page
/widget[@name='PAGE_ID'][text()='WRK_DNA_REGISTRATION']
/../widget[@name='TRANS_DETAIL_ID'][text()='77147']
/../widget[@name='GRD_ERRORS_TEMP']
as
//cache/content/transaction/page
[widget[@name='PAGE_ID'][text()='WRK_REGISTRATION']]
[widget[@name='TRANS_DETAIL_ID'][text()='77147']]
/widget[@name='GRD_ERRORS']
Just because it makes the code, in my eyes, easier to read, and expresses what you seem to mean more clearly: "the page element that has children matching these conditions, and then take the widget with this @name." Or, if that is closer to how you think about it,
//cache/content/transaction/page/widget[@name='GRD_ERRORS']
[preceding-sibling::widget[@name='PAGE_ID'][text()='WRK_REGISTRATION']]
[preceding-sibling::widget[@name='TRANS_DETAIL_ID'][text()='77147']]
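To make the first rewritten form concrete, here is a minimal sketch that runs it against a trimmed copy of the sample XML. It deliberately uses the JDK's built-in javax.xml.xpath API instead of dom4j so that it runs without extra libraries; the XPath string itself should carry over unchanged to dom4j's selectSingleNode. The class name GridErrorLookup, the method findGridErrors, and the SAMPLE_XML constant are hypothetical names introduced for illustration, and the sketch assumes the two values never contain a single quote.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class GridErrorLookup {

    // Trimmed copy of the sample XML from the question (two of the three pages).
    static final String SAMPLE_XML =
            "<cache><content><transaction>"
            + "<page>"
            + "<widget name=\"PAGE_ID\">WRK_REGISTRATION</widget>"
            + "<widget name=\"TRANS_DETAIL_ID\">77147</widget>"
            + "<widget name=\"GRD_ERRORS\" />"
            + "</page>"
            + "<page>"
            + "<widget name=\"PAGE_ID\">WRK_PROCESSING</widget>"
            + "<widget name=\"TRANS_DETAIL_ID\">77152</widget>"
            + "<widget name=\"GRD_ERRORS\" />"
            + "</page>"
            + "</transaction></content></cache>";

    // Returns the GRD_ERRORS widget of the page whose PAGE_ID and
    // TRANS_DETAIL_ID widgets have the given text, or null if no page matches.
    static Node findGridErrors(String xml, String pageId, String transDetailId)
            throws Exception {
        // Parsing from a Reader avoids any dependence on the platform charset.
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));

        // Predicates on the page element select the page first,
        // then the final step picks the widget inside it.
        String expr = String.format(
                "//cache/content/transaction/page"
                + "[widget[@name='PAGE_ID'][text()='%s']]"
                + "[widget[@name='TRANS_DETAIL_ID'][text()='%s']]"
                + "/widget[@name='GRD_ERRORS']",
                pageId, transDetailId);

        return (Node) XPathFactory.newInstance().newXPath()
                .evaluate(expr, doc, XPathConstants.NODE);
    }

    public static void main(String[] args) throws Exception {
        // Values that exist in the sample: a node is found.
        Node hit = findGridErrors(SAMPLE_XML, "WRK_REGISTRATION", "77147");
        System.out.println("matching page found: " + (hit != null));

        // The PAGE_ID from the original XPath does not occur in the sample.
        Node miss = findGridErrors(SAMPLE_XML, "WRK_DNA_REGISTRATION", "77147");
        System.out.println("mismatched PAGE_ID found: " + (miss != null));
    }
}
```

The second call shows why the name mismatch noted at the top of this answer matters: with a PAGE_ID that does not occur in the XML, the lookup returns null regardless of how the expression is structured.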
Source: https://stackoverflow.com/questions/9914486/use-of-text-function-when-using-xpath-in-dom4j