Best practice for handling vertical tabs and other invalid XML characters

日久生厌 2021-02-18 23:09

I have an application which (like many others) takes in user input, stores it in a database and then later processes it using (amongst other things) XML tools. The application t

2 answers
  • 2021-02-18 23:41

    Yes, unfortunately some characters are illegal in XML and have no entity equivalent. For one example, see:

    http://www.jdom.org/docs/apidocs.1.1/org/jdom/Element.html#setText(java.lang.String)
    

    which is a String setter... that can throw an exception! Vertical tab is exactly one of those characters for which there is no XML entity, nor a way to "escape" it with XML alone.
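The point about vertical tab can be checked with a short sketch. It assumes Python's standard `xml.etree.ElementTree` parser (an XML 1.0 parser built on expat) rather than JDOM, and the helper names `find_illegal` and `parses` are made up for illustration. The character class follows the XML 1.0 `Char` production.

```python
import re
import xml.etree.ElementTree as ET

# Characters outside the XML 1.0 "Char" production: C0 controls other than
# tab, LF and CR, the surrogate range, and U+FFFE / U+FFFF.
ILLEGAL_XML10 = re.compile(r"[\x00-\x08\x0B\x0C\x0E-\x1F\uD800-\uDFFF\uFFFE\uFFFF]")


def find_illegal(text):
    """Return the code points in `text` that XML 1.0 cannot carry."""
    return ["U+%04X" % ord(ch) for ch in ILLEGAL_XML10.findall(text)]


def parses(document):
    """Return True if the XML 1.0 parser accepts `document`."""
    try:
        ET.fromstring(document)
        return True
    except ET.ParseError:
        return False


raw_vertical_tab = parses("<a>x\x0by</a>")        # literal U+000B in content
escaped_vertical_tab = parses("<a>x&#x0B;y</a>")  # same character as a reference
escaped_tab = parses("<a>x&#x09;y</a>")           # tab is a legal character
```

Under these assumptions the parser should reject both the literal vertical tab and its character reference, while the tab reference is accepted, so validating input with something like `find_illegal` before building the document avoids a late failure.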

    I'm working around this myself by using base64 encoding to sanitize strings that might harbor those characters. It's a bit silly, since I have to base64-encode and decode all the time, but I don't think there's a good alternative.
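A minimal sketch of the base64 workaround, assuming Python's standard `base64` and `xml.etree.ElementTree` modules. The element name `comment`, the `encoding` attribute, and the helpers `to_element` and `from_element` are hypothetical and not part of any library.

```python
import base64
import xml.etree.ElementTree as ET


def to_element(tag, text):
    """Store arbitrary text in an element as base64, which is plain ASCII."""
    elem = ET.Element(tag)
    elem.set("encoding", "base64")  # hypothetical marker attribute for readers
    elem.text = base64.b64encode(text.encode("utf-8")).decode("ascii")
    return elem


def from_element(elem):
    """Recover the original text; an empty element parses back with text None."""
    return base64.b64decode(elem.text or "").decode("utf-8")


original = "line one\x0bline two"  # contains a vertical tab
xml_bytes = ET.tostring(to_element("comment", original))
restored = from_element(ET.fromstring(xml_bytes))
```

The cost is the one the answer mentions: the stored text is no longer readable or searchable with XML tools until it is decoded.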

  • 2021-02-18 23:48

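    You should escape them as numeric character references (&#x01; through &#x1F;), then decode/restore them at the end. Note that a conforming XML 1.0 parser rejects these references for control characters other than tab, line feed and carriage return; only XML 1.1 accepts them, and neither version allows U+0000.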

    See XmlTextWriter incorrectly writing control characters
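Because an XML 1.0 parser will not accept character references for these control characters, one way to keep the escape-then-restore idea is an application-level escape that the XML layer sees as ordinary text. The sketch below is an assumption, not the answerer's method: the `\uXXXX` notation and the helpers `escape_illegal` and `unescape_illegal` are invented for illustration, and backslashes are doubled so the round trip is unambiguous.

```python
import re
import xml.etree.ElementTree as ET

# A backslash (the escape character itself) or any character that the
# XML 1.0 "Char" production excludes.
_NEEDS_ESCAPE = re.compile(r"[\\\x00-\x08\x0B\x0C\x0E-\x1F\uD800-\uDFFF\uFFFE\uFFFF]")
# Either a doubled backslash or \u followed by four uppercase hex digits.
_ESCAPED = re.compile(r"\\(\\|u([0-9A-F]{4}))")


def escape_illegal(text):
    """Replace XML-illegal characters with \\uXXXX and double backslashes."""
    def repl(match):
        ch = match.group(0)
        return "\\\\" if ch == "\\" else "\\u%04X" % ord(ch)
    return _NEEDS_ESCAPE.sub(repl, text)


def unescape_illegal(text):
    """Invert escape_illegal after the text has been read back from XML."""
    def repl(match):
        if match.group(2) is None:
            return "\\"  # doubled backslash becomes a single backslash
        return chr(int(match.group(2), 16))
    return _ESCAPED.sub(repl, text)


original = "a\x0bb \\u000B c\\"  # a real vertical tab plus look-alike text
elem = ET.Element("comment")
elem.text = escape_illegal(original)
round_tripped = unescape_illegal(ET.fromstring(ET.tostring(elem)).text)
```

Every consumer of the XML has to know about the convention, which is the main trade-off against base64: the text stays mostly readable, but the escaping is not something standard XML tools will undo.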
