Question
I'm parsing a document.xml file extracted from a .docx file using Nokogiri, and I need to get the values of attributes with namespaced names, like "w:val".
This is a sample of the source XML:
<w:document>
<w:body>
<w:p w:rsidR="004D5F21" w:rsidRPr="00820E0B" w:rsidRDefault="00301D39" pcut:cut="true">
<w:pPr>
<w:jc w:val="center"/>
</w:pPr>
</w:body>
</w:document>
This is a sample of the code:
require 'nokogiri'
doc = Nokogiri::XML(File.open(path))
doc.search('//w:jc').each do |n|
  puts n['//w:val']
end
There is nothing in the console, only empty lines. How can I get the values of the attributes?
Answer 1:
require 'nokogiri'
doc = Nokogiri::XML(File.open(path))
doc.xpath('//jc').each do |n|
  puts n.attr('val')
end
This should work. Don't forget to look at the docs: http://nokogiri.org/tutorials/searching_a_xml_html_document.html#fn:1
Answer 2:
The document is missing its namespace declaration, and Nokogiri isn't happy with it. If you check the errors method for your doc, you'll see something like:
puts doc.errors
Namespace prefix w on document is not defined
Namespace prefix w on body is not defined
Namespace prefix w for rsidR on p is not defined
Namespace prefix w for rsidRPr on p is not defined
Namespace prefix w for rsidRDefault on p is not defined
Namespace prefix pcut for cut on p is not defined
Namespace prefix w on p is not defined
Namespace prefix w on pPr is not defined
Namespace prefix w for val on jc is not defined
Namespace prefix w on jc is not defined
Opening and ending tag mismatch: p line 3 and body
Opening and ending tag mismatch: body line 2 and document
Premature end of data in tag document line 1
By using Nokogiri's CSS accessors, rather than XPath, you can step around namespace issues:
puts doc.at('jc')['val']
will output:
center
If you need to iterate over multiple jc nodes, use search or one of its aliases or act-alike methods, similar to what you did before.
Answer 3:
Try this:
require 'nokogiri'
doc = Nokogiri::XML(File.open(path))
doc.search('jc').each do |n|
  puts n['val']
end
Also, yes, read this: http://nokogiri.org/tutorials/searching_a_xml_html_document.html#fn:1
Source: https://stackoverflow.com/questions/8535509/get-the-values-of-attributes-with-namespace-using-nokogiri