How to add non-escaped ampersands to HTML with Nokogiri::XML::Builder

长情又很酷 2021-01-20 06:52

I would like to add things like bullet points ("•") to HTML using the XML Builder in Nokogiri, but everything is being escaped. How do I prevent it from being escaped?

2 Answers
  • 2021-01-20 07:50

    When you're setting the text of an element, you really are setting text, not HTML source. < and & don't have any special meaning in plain text.

    So just type a bullet: '•'. Of course, your source code and your XML file have to use the same encoding for that to come out right. If your XML file is UTF-8 but your source code isn't, you'd probably have to write "\xe2\x80\xa2", which is the UTF-8 byte sequence for the bullet character as a string literal. Note the double quotes: Ruby does not process \x escapes inside single-quoted strings.
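The different spellings of the bullet mentioned above can be compared in plain Ruby. This is only a minimal sketch, assuming Ruby 1.9 or later and a source file saved as UTF-8; the variable names are illustrative.

```ruby
# encoding: utf-8

literal  = "•"              # typed directly; relies on the file being UTF-8
by_code  = "\u2022"         # Unicode escape (Ruby 1.9+)
# Double quotes are required for \x escapes; force_encoding makes the
# intended encoding explicit regardless of the source encoding.
by_bytes = "\xE2\x80\xA2".force_encoding('UTF-8')
# Single quotes leave the backslashes in place: this is 12 ASCII characters.
single_quoted = '\xE2\x80\xA2'

puts literal == by_code        # same character
puts by_bytes == by_code       # same UTF-8 byte sequence
puts by_code.bytes.inspect     # the three bytes of U+2022 in UTF-8
puts single_quoted.length      # not a bullet at all
```

If the comparison of `literal` fails in your setup, the source file is probably not being read as UTF-8.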

    (In general non-ASCII characters in Ruby 1.8 are tricky. The byte-based interfaces don't mesh too well with XML's world of all-text-is-Unicode.)

  • 2021-01-20 07:54

    If you define

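      require 'nokogiri'

      class Nokogiri::XML::Builder
        # Inserts the character for the given numeric code point (e.g. 8226)
        # by parsing a tiny document that contains the reference "&#<code>;".
        def entity(code)
          # "&#" followed by the interpolated #{code} gives e.g. "&#8226;".
          doc = Nokogiri::XML("<?xml version='1.0'?><root>&##{code};</root>")
          insert(doc.root.children.first)
        end
      end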
    

    then this

      builder = Nokogiri::XML::Builder.new do |xml|
        xml.span {
          xml.text "I can has "
          xml.entity 8226
          xml.text " entity?"
        }
      end
      puts builder.to_xml
    

    yields

    <?xml version="1.0"?>
    <span>I can has &#x2022; entity?</span>
    

     

    P.S. This is only a workaround. For a clean solution, please refer to the libxml2 documentation (Nokogiri is built on libxml2). However, even the libxml2 folks admit that handling entities can be quite, err, cumbersome sometimes.
