Question
Basically, I want to extract the absolute path from the root down to a given node and report it to the console or a file. Below is my current attempt:
require "rexml/document"
include REXML
def get_path(xml_doc, key)
  XPath.each(xml_doc, key) do |node|
    puts "\"#{node}\""
    XPath.each(node, '(ancestor::#node)') do |el|
      # puts el
    end
  end
end
test_doc = Document.new <<EOF
<root>
<level1 key="1" value="B">
<level2 key="12" value="B" />
<level2 key="13" value="B" />
</level1>
</root>
EOF
get_path test_doc, "//*/[@key='12']"
The issue is that it gives me "<level2 value='B' key='12'/>" as output. The desired output is <root><level1><level2 value='B' key='12'/> (the format could be different; the main goal is to have the full path). I have only basic knowledge of XPath and would appreciate any guidance on where to look and how to achieve this.
Answer 1:
If you're set on REXML, here's a REXML solution:
require 'rexml/document'
test_doc = REXML::Document.new <<EOF
<root>
<level1 key="1" value="B">
<level2 key="12" value="B" />
<level2 key="13" value="B" />
</level1>
</root>
EOF
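def get_path(xml_doc, key)
  node = REXML::XPath.first( xml_doc, key )
  path = []
  # Walk up until we reach the document, which has no parent.
  # If nothing matched, node is nil and the path stays empty.
  while node && node.parent
    path << node
    node = node.parent
  end
  path.reverse
end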
path = get_path( test_doc, "//*[@key='12']" )
p path.map{ |el| el.name }.join("/")
#=> "root/level1/level2"
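To get closer to the output format asked for in the question, and to cope with expressions that match more than one node, the same parent walk can be wrapped in a small formatter. This is only a minimal sketch: `element_chain` and `format_path` are hypothetical helper names, not part of REXML, and the attribute order and quoting in the final tag depend on how REXML serializes the element.

```ruby
require 'rexml/document'

# Hypothetical helper: returns the elements from the root element down to
# +node+ (inclusive), stopping before the owning document.
def element_chain(node)
  chain = []
  while node && node.parent
    chain << node
    node = node.parent
  end
  chain.reverse
end

# Hypothetical helper: an opening tag for each ancestor, followed by the
# matched node serialized in full.
def format_path(node)
  *ancestors, target = element_chain(node)
  ancestors.map { |el| "<#{el.name}>" }.join + target.to_s
end

doc = REXML::Document.new <<EOF
<root>
<level1 key="1" value="B">
<level2 key="12" value="B" />
<level2 key="13" value="B" />
</level1>
</root>
EOF

# XPath.match returns every matching node, so each match gets its own line.
paths = REXML::XPath.match(doc, "//level2").map { |node| format_path(node) }
puts paths
```

Replacing `puts paths` with something like `File.write('paths.txt', paths.join("\n"))` would send the same lines to a file; the filename is just a placeholder.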
Or, if you want to use the same get_path implementation from the other answer, you can monkeypatch REXML to add an ancestors method:
class REXML::Child
  def ancestors
    ancestors = []
    # Presumably you don't want the node included in its list of ancestors
    # If you do, change the following line to node = self
    node = self.parent
    # Presumably you want to stop at the root node, and not its owning document
    # If you want the document included in the ancestors, change the following
    # line to just while node
    # (the nil check keeps this safe when called on the document itself)
    while node && node.parent
      ancestors << node
      node = node.parent
    end
    ancestors.reverse
  end
end
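If an XPath-style string is all that is needed, REXML elements also expose an `xpath` method that walks up through the parents in much the same way. A short sketch using the document from the question; the variable names are only illustrative, and the exact form of the string (for instance, whether repeated siblings get a positional index) is best checked against the installed REXML version.

```ruby
require 'rexml/document'

doc = REXML::Document.new <<EOF
<root>
<level1 key="1" value="B">
<level2 key="12" value="B" />
<level2 key="13" value="B" />
</level1>
</root>
EOF

node = REXML::XPath.first(doc, "//*[@key='12']")
# Element#xpath builds a location path from the document root down to the node.
# The guard skips the call when nothing matched.
puts node.xpath if node
```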
Answer 2:
This should get you started:
require 'nokogiri'
test_doc = Nokogiri::XML <<EOF
<root>
<level1 key="1" value="B">
<level2 key="12" value="B" />
<level2 key="13" value="B" />
</level1>
</root>
EOF
node = test_doc.at('//level2')
puts [*node.ancestors.reverse, node][1..-1].map{ |n| "<#{ n.name }>" }
# >> <root>
# >> <level1>
# >> <level2>
Nokogiri is really nice because it lets you use CSS selectors instead of XPath, if you choose. CSS is more intuitive to some people, and can be cleaner than the equivalent XPath:
node = test_doc.at('level2')
puts [*node.ancestors.reverse, node][1..-1].map{ |n| "<#{ n.name }>" }
# >> <root>
# >> <level1>
# >> <level2>
Answer 3:
First, note that your document is not, I think, what you intended. I suspect that you didn't want <level1> to be self-closing, but to contain the <level2> elements as children.
Secondly, I prefer and advocate Nokogiri instead of REXML. It's nice that REXML comes with Ruby, but Nokogiri is faster and more convenient, IMHO. So:
require 'nokogiri'
test_doc = Nokogiri::XML <<EOF
<root>
<level1 key="1" value="B">
<level2 key="12" value="B" />
<level2 key="13" value="B" />
</level1>
</root>
EOF
def get_path(xml_doc, key)
  xml_doc.at_xpath(key).ancestors.reverse
end
path = get_path( test_doc, "//*[@key='12']" )
p path.map{ |node| node.name }.join( '/' )
#=> "document/root/level1"
# (ancestors only: the owning document is included, the matched node itself is not)
Source: https://stackoverflow.com/questions/4559118/how-to-get-the-absolute-node-path-in-xml-using-xpath-and-ruby