Trying to get the content inside CDATA tags in an XML file using Nokogiri

猫巷女王i 2020-12-20 11:50

I have seen several posts on this, but nothing has worked so far. I am parsing an XML document from a URL using Nokogiri on Rails 3 and Ruby 1.9.2.

A snippet of the xm

2 answers
  • 2020-12-20 12:35

    Ah I see. What @mu said is correct. But to get at the cdata directly, maybe:

    require 'nokogiri'

    xml = <<EOF
    <NewsLineText>
      <![CDATA[
      Anna Kendrick is ''obsessed'' with 'Game of Thrones' and loves to cook, particularly     creme brulee.
      ]]>
    </NewsLineText>
    EOF
    node = Nokogiri::XML(xml)
    # Pick the CDATA node out of the element's children (nil if there is none).
    cdata = node.search('NewsLineText').children.find { |e| e.cdata? }
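
    Once you have the CDATA node, you will probably want its text without the surrounding whitespace. A minimal sketch, assuming a shortened sample document (the headline text and the `body` variable are just illustrative names, not from the question):

    ```ruby
    require 'nokogiri'

    # Shortened sample document; the real feed would come from the URL.
    xml = <<~XML
      <NewsLineText>
        <![CDATA[
        Anna Kendrick loves to cook.
        ]]>
      </NewsLineText>
    XML

    doc = Nokogiri::XML(xml)

    # The element has whitespace text nodes around the CDATA node,
    # so select the child for which cdata? is true.
    cdata = doc.at_xpath('//NewsLineText').children.find(&:cdata?)

    # Guard against a missing CDATA section, then trim the newlines
    # and indentation that sit inside the CDATA markers.
    body = cdata ? cdata.text.strip : nil
    ```

    Here `body` ends up holding just the sentence, with no leading or trailing newlines.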
    
  • 2020-12-20 12:40

    You're trying to parse XML using Nokogiri's HTML parser. If node had come from the XML parser, r would be nil, since XML is case sensitive; your r is not nil, so you're using the HTML parser, which is case insensitive.

    Use Nokogiri's XML parser and you will get things like this:

    >> r = doc.at_xpath('.//NewsLineText')
    => #<Nokogiri::XML::Element:0x8066ad34 name="NewsLineText" children=[#<Nokogiri::XML::Text:0x8066aac8 "\n  ">, #<Nokogiri::XML::CDATA:0x8066a9c4 "\n  Anna Kendrick is ''obsessed'' with 'Game of Thrones' and loves to cook, particularly     creme brulee.\n  ">, #<Nokogiri::XML::Text:0x8066a8d4 "\n">]>
    >> r.text
    => "\n  \n  Anna Kendrick is ''obsessed'' with 'Game of Thrones' and loves to cook, particularly     creme brulee.\n  \n"
    

    and you'll be able to get at the CDATA through r.text or r.children.
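
    To make the case-sensitivity point concrete, here is a small sketch using the XML parser on a shortened sample document (the sample text and the variable names `exact`, `lower`, and `headline` are assumptions for illustration only):

    ```ruby
    require 'nokogiri'

    # Shortened sample document standing in for the real feed.
    xml = <<~XML
      <NewsLineText>
        <![CDATA[
        Anna Kendrick loves to cook.
        ]]>
      </NewsLineText>
    XML

    doc = Nokogiri::XML(xml)

    # With the XML parser, element names must match exactly.
    exact = doc.at_xpath('//NewsLineText')   # finds the element
    lower = doc.at_xpath('//newslinetext')   # no match, so nil

    # text joins the whitespace text nodes and the CDATA content,
    # so strip it to get just the headline.
    headline = exact.text.strip
    ```

    If a lowercase query unexpectedly matches in your own code, that is a hint the document was parsed with the HTML parser rather than the XML one.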
