How to get the absolute node path in XML using XPath and Ruby?

Posted by 谁说胖子不能爱 on 2019-12-12 23:33:08

Question


Basically I want to extract the absolute path from a node to root and report it to the console or a file. Below is the current solution:

require "rexml/document"

include REXML

def get_path(xml_doc, key)
  XPath.each(xml_doc, key) do |node|
    puts "\"#{node}\""
    XPath.each(node, '(ancestor::#node)') do |el|
      #  puts  el
    end
  end
end

test_doc = Document.new <<EOF
  <root>
   <level1 key="1" value="B">
     <level2 key="12" value="B" />
     <level2 key="13" value="B" />
   </level1>
  </root>
EOF

get_path test_doc, "//*/[@key='12']"

The issue is that it gives me "<level2 value='B' key='12'/>" as output. Desired output is <root><level1><level2 value='B' key='12'/> (format could be different, the main goal is to have a full path). I have only basic knowledge of XPath and would appreciate any help/guidance where to look and how to achieve this.


Answer 1:


If you're set on REXML, here's a REXML solution:

require 'rexml/document'

test_doc = REXML::Document.new <<EOF
  <root>
    <level1 key="1" value="B">
      <level2 key="12" value="B" />
      <level2 key="13" value="B" />
    </level1>
  </root>
EOF

def get_path(xml_doc, key)
  node = REXML::XPath.first( xml_doc, key )
  path = []
  while node.parent
    path << node
    node = node.parent
  end
  path.reverse
end

path = get_path( test_doc, "//*[@key='12']" )
p path.map{ |el| el.name }.join("/")
#=> "root/level1/level2"
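
The helper above looks only at the first match and raises a NoMethodError when the XPath matches nothing. As a minimal sketch (element_paths is a hypothetical name, not part of REXML), the same parent walk can be applied to every match, giving one path string per element and an empty array when there is no match:

```ruby
require 'rexml/document'

# Hypothetical helper: one "/"-joined path string per matching element.
def element_paths(xml_doc, xpath)
  REXML::XPath.match(xml_doc, xpath).map do |node|
    names = []
    # Walk upward and stop at the Document, whose parent is nil.
    while node.parent
      names.unshift(node.name)
      node = node.parent
    end
    "/" + names.join("/")
  end
end

doc = REXML::Document.new(<<~XML)
  <root>
    <level1 key="1" value="B">
      <level2 key="12" value="B" />
      <level2 key="13" value="B" />
    </level1>
  </root>
XML

puts element_paths(doc, "//*[@key='12']")
# prints /root/level1/level2
```

Returning strings rather than elements is just a choice for this sketch; keep the elements if you also need their attributes in the output.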

If you would rather use the same get_path implementation as the Nokogiri answer, which calls ancestors on the matched node, you can monkeypatch REXML to add an ancestors method:

class REXML::Child
  def ancestors
    ancestors = []

    # Presumably you don't want the node included in its list of ancestors
    # If you do, change the following line to    node = self
    node = self.parent

    # Presumably you want to stop at the root node, and not its owning document
    # If you want the document included in the ancestors, change the following
    # line to just    while node
    # The nil check avoids a NoMethodError when called on the document itself
    while node && node.parent
      ancestors << node
      node = node.parent
    end

    ancestors.reverse
  end
end
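
If monkeypatching feels heavy, it may be worth checking whether the built-in REXML::Element#xpath already does what you need. As far as I know it walks the parents in much the same way and returns the path as a string, adding a positional index such as [1] when siblings share a name. A small sketch, assuming the same test document:

```ruby
require 'rexml/document'

doc = REXML::Document.new(<<~XML)
  <root>
    <level1 key="1" value="B">
      <level2 key="12" value="B" />
      <level2 key="13" value="B" />
    </level1>
  </root>
XML

node = REXML::XPath.first(doc, "//*[@key='12']")

# Element#xpath returns the element's path from the document down as a string.
# The nil check covers an XPath that matches nothing.
puts node.xpath if node
```

The string it returns is itself an XPath expression, so it can be passed back to REXML::XPath.first to locate the same element again.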



Answer 2:


This should get you started:

require 'nokogiri'

test_doc = Nokogiri::XML <<EOF
  <root>
   <level1 key="1" value="B">
     <level2 key="12" value="B" />
     <level2 key="13" value="B" />
   </level1>
  </root>
EOF

node = test_doc.at('//level2')
puts [*node.ancestors.reverse, node][1..-1].map{ |n| "<#{ n.name }>" }
# >> <root>
# >> <level1>
# >> <level2>

Nokogiri is really nice because it lets you use CSS selectors instead of XPath, if you choose. CSS is more intuitive to some people, and can be cleaner than an equivalent XPath:

node = test_doc.at('level2')
puts [*node.ancestors.reverse, node][1..-1].map{ |n| "<#{ n.name }>" }
# >> <root>
# >> <level1>
# >> <level2>



Answer 3:


First, note that your document is not, I think, what you intended. I suspect that you didn't want <level1> to be self-closing, but to contain the <level2> elements as children.

Secondly, I prefer and advocate Nokogiri instead of REXML. It's nice that REXML comes with Ruby, but Nokogiri is faster and more convenient, IMHO. So:

require 'nokogiri'

test_doc = Nokogiri::XML <<EOF
  <root>
    <level1 key="1" value="B">
      <level2 key="12" value="B" />
      <level2 key="13" value="B" />
    </level1>
  </root>
EOF

def get_path(xml_doc, key)
  xml_doc.at_xpath(key).ancestors.reverse
end

path = get_path( test_doc, "//*[@key='12']" )
p path.map{ |node| node.name }.join( '/' )
#=> "document/root/level1"


Source: https://stackoverflow.com/questions/4559118/how-to-get-the-absolute-node-path-in-xml-using-xpath-and-ruby
