Question
I need to find attribute values based on other values pulled from parent's/grand-parent's sibling's children. I think it's going to take 2 different expressions.
So given the following XML (which is derived from a log file that can be thousands of lines long):
<p:log xmlns:p="urn:NamespaceInfo">
<p:entries>
<p:entry timestamp="2012-12-31T09:39:25">
<p:attributes>
<p:attrib name="Position" value="1B2" />
<p:attrib name="Something" value="Something_else" />
</p:attributes>
<p:msg>
</p:msg>
</p:entry>
<p:entry timestamp="2012-12-31T09:39:25">
<p:attributes>
<p:attrib name="Form" value="FormA" />
</p:attributes>
<p:msg>
</p:msg>
</p:entry>
<p:entry timestamp="2012-12-31T09:39:25">
<p:msg>Successful....</p:msg>
</p:entry>
<p:entry timestamp="2012-12-31T12:12:12">
<p:attributes>
<p:attrib name="Position" value="1B3" />
<p:attrib name="Something" value="Something_else" />
</p:attributes>
<p:msg>
</p:msg>
</p:entry>
<p:entry timestamp="2012-12-31T09:39:25">
<p:attributes>
<p:attrib name="Form" value="FormB" />
</p:attributes>
<p:msg>
</p:msg>
</p:entry>
<p:entry timestamp="2012-12-31T09:39:25">
<p:msg>Processing....</p:msg>
</p:entry>
<p:entry timestamp="2012-12-31T09:39:25">
<p:msg>Error1</p:msg>
</p:entry>
<p:entry timestamp="2012-12-31T09:39:25">
<p:msg>Error1</p:msg>
</p:entry>
</p:entries>
...
</p:log>
- (<p:attributes> parent tags can have multiple <p:attrib> child tags)
- (<p:entry> tags can only have one <p:msg> tag)
First, I need to grab the value of the value attribute whose corresponding name attribute is Position, but only if the grand-parent's sibling p:entry has a child p:msg with the text Error1. It also needs to stay within that section. For instance, I don't want the first occurrence of the Position/value pair, because a new Position/value pair appears before the Error1, even though technically the p:msg with Error1 is a sibling of both grand-parents.
Next, I need the value of the timestamp attribute on the p:entry that contains the Position/value pair I just grabbed. So: find the position, then find the timestamp attribute value of its grand-parent p:entry tag.
So for this example, I should be able to retrieve the following values only:
1B3
2012-12-31T12:12:12
(the date/time stamps given are arbitrary values. This one is different so you know which one I was referencing).
Kind of confusing, I know. I also need to make sure I grab just one instance, because I am using XQuery to get the data out of a database and each expression has to result in a single value.
I can get to the first timestamp associated with the p:msg with Error1 using the following:
//p:entry[descendant::p:msg='Error1.'][1]/@timestamp
but I can't seem to get back up the tree to get the other values.
I can get the timestamps of the p:entry elements that have Position p:attrib grand-children (here limited to the first one) with:
(//p:entry[descendant::p:attrib[@name='Position']]/@timestamp)[1]
but I can't seem to limit it to just the one that has the 'Error1' following it. I can't base my selection on position; I have to base it first on content.
BONUS QUESTION
How could I do this again on the next instance down the log file? (Not just the second Error1 message, but the next time down the log file where the Error1 msg shows up for the next 'parent/sibling' match.) This may be obvious once I get the answer to the questions above.
Answer 1:
UPDATED:
OK I think I got this. Here's the answer to the first one:
//p:msg[text()="Error1"]/../preceding-sibling::p:entry[./*/p:attrib[@name="Position"]][1]/*/p:attrib[@name="Position"]/@value
This works back from the p:msg tag, which makes it easier to select the first (that's the [1] in there) of the preceding p:entry tags that satisfy the condition of having a grandchild p:attrib with the name Position. Because preceding-sibling is a reverse axis, [1] picks the nearest such entry, which is what keeps the match within its section.
Getting the timestamp is just a tad simpler:
//p:msg[text()="Error1"]/../preceding-sibling::p:entry[./*/p:attrib[@name="Position"]][1]/@timestamp
Try that out and see what you think.
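For anyone who wants to check the "nearest preceding Position entry" logic outside the database, here is a minimal procedural sketch in Python. It is not the XPath above: the standard library's ElementTree does not support the preceding-sibling axis, so the sketch walks the entries in document order instead. The function name error_sections, the abridged sample, and the whitespace stripping of the message text are my own assumptions. It returns one pair per section, which also covers the bonus question.

```python
import xml.etree.ElementTree as ET

NS = {"p": "urn:NamespaceInfo"}

# Abridged copy of the log from the question (assumption: same structure).
SAMPLE = """\
<p:log xmlns:p="urn:NamespaceInfo">
  <p:entries>
    <p:entry timestamp="2012-12-31T09:39:25">
      <p:attributes>
        <p:attrib name="Position" value="1B2" />
      </p:attributes>
      <p:msg></p:msg>
    </p:entry>
    <p:entry timestamp="2012-12-31T09:39:25">
      <p:msg>Successful....</p:msg>
    </p:entry>
    <p:entry timestamp="2012-12-31T12:12:12">
      <p:attributes>
        <p:attrib name="Position" value="1B3" />
      </p:attributes>
      <p:msg></p:msg>
    </p:entry>
    <p:entry timestamp="2012-12-31T09:39:25">
      <p:msg>Error1</p:msg>
    </p:entry>
    <p:entry timestamp="2012-12-31T09:39:25">
      <p:msg>Error1</p:msg>
    </p:entry>
  </p:entries>
</p:log>
"""


def error_sections(xml_text, error_text="Error1"):
    """Return (position, timestamp) once for each section containing error_text.

    A "section" is assumed to start at an entry with a Position attrib and to
    run until the next such entry.
    """
    root = ET.fromstring(xml_text)
    results = []
    current = None     # (position, timestamp) of the most recent Position entry
    reported = False   # True once the current section has been recorded
    for entry in root.findall("p:entries/p:entry", NS):
        attrib = entry.find("p:attributes/p:attrib[@name='Position']", NS)
        if attrib is not None:
            current = (attrib.get("value"), entry.get("timestamp"))
            reported = False
        msg = entry.find("p:msg", NS)
        text = (msg.text or "").strip() if msg is not None else ""
        if text == error_text and current is not None and not reported:
            results.append(current)
            reported = True
    return results


matches = error_sections(SAMPLE)
print(matches)
```

On the sample, the two consecutive Error1 messages belong to the same section, so only one pair is reported; the next section further down a longer log would show up as the second element of the list.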
ORIGINAL ANSWER:
Normally I don't post half-finished answers, but my guess is that you won't get anything else since this question is so complicated, so here's the xpath for what you describe in the first paragraph:
//p:entry[following-sibling::p:entry/p:msg/text()="Error1"]/*/p:attrib[@name="Position"]/@value
This will get "the value of the value attribute that has a corresponding name attribute of Position, but only if the grand-parent's sibling p:entry has a child p:msg with the text of Error1."
However, I don't know what you mean when you say "it needs to stay within that section". Can you clarify? This will return both 1B2 and 1B3.
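To see why both values come back, here is a rough Python sketch that mimics the condition procedurally (an illustration under assumptions, not the XPath engine itself): a Position entry qualifies as soon as any later sibling entry carries the Error1 message, no matter how many other Position entries sit in between. The function name positions_before_any_error and the abridged sample are hypothetical.

```python
import xml.etree.ElementTree as ET

NS = {"p": "urn:NamespaceInfo"}

# Abridged copy of the log from the question (assumption: same structure).
SAMPLE = """\
<p:log xmlns:p="urn:NamespaceInfo">
  <p:entries>
    <p:entry timestamp="2012-12-31T09:39:25">
      <p:attributes>
        <p:attrib name="Position" value="1B2" />
      </p:attributes>
      <p:msg></p:msg>
    </p:entry>
    <p:entry timestamp="2012-12-31T12:12:12">
      <p:attributes>
        <p:attrib name="Position" value="1B3" />
      </p:attributes>
      <p:msg></p:msg>
    </p:entry>
    <p:entry timestamp="2012-12-31T09:39:25">
      <p:msg>Error1</p:msg>
    </p:entry>
  </p:entries>
</p:log>
"""


def positions_before_any_error(xml_text, error_text="Error1"):
    """Collect Position values of entries that have ANY later Error1 sibling."""
    root = ET.fromstring(xml_text)
    entries = root.findall("p:entries/p:entry", NS)
    values = []
    for index, entry in enumerate(entries):
        attrib = entry.find("p:attributes/p:attrib[@name='Position']", NS)
        if attrib is None:
            continue
        # Mirrors following-sibling::p:entry/p:msg/text()="Error1":
        # every later entry is considered, not just those in the same section.
        later_msgs = (
            later.findtext("p:msg", default="", namespaces=NS)
            for later in entries[index + 1:]
        )
        if any(text == error_text for text in later_msgs):
            values.append(attrib.get("value"))
    return values


values = positions_before_any_error(SAMPLE)
print(values)
```

Since the check never stops at the next Position entry, the 1B2 entry qualifies along with 1B3, which is the missing "section" boundary.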
For the second part of your question, you can get the timestamp for the entries above with this:
//p:entry[following-sibling::p:entry/p:msg/text()="Error1" and ./*/p:attrib[@name="Position"]]/@timestamp
Again though, this won't do the "section" thing you mentioned. That's a bit more tricky, beyond my (current) knowledge of xpath unfortunately.
Source: https://stackoverflow.com/questions/12589059/need-a-complex-xpath-using-sibling-children-ancestors