use XSLT to remove/replace invalid characters

-上瘾入骨i 2021-01-25 22:56

I am looking for a way to test a given element in the source XML and remove characters that are not valid. Basically I have a list of allowed characters and need a way to repl

2 Answers
  • 2021-01-25 23:15

    You could use translate() or replace() (the latter is an XPath 2.0 function, so it needs an XSLT 2.0 processor). But if the characters are invalid in the sense that the XML is no longer well-formed, then you can't use XSLT at all, as it requires a well-formed XML document as input.

    With translate(), removing every character except those in a given list goes something like this:

    translate($string, translate($string,'0123456789',''),'')
    

    The above will remove everything not in the set 0123456789.
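
    For context, here is a minimal sketch of how that expression might sit in a complete XSLT 1.0 stylesheet. The element name "code" and the variable name "allowed" are made up for illustration; the question does not say which element needs cleaning, so adjust the match pattern to your own document.

    ```xml
    <xsl:stylesheet version="1.0"
        xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

      <!-- Assumed whitelist: replace with your own list of allowed characters. -->
      <xsl:variable name="allowed" select="'0123456789'"/>

      <!-- Identity template: copy everything else through unchanged. -->
      <xsl:template match="@*|node()">
        <xsl:copy>
          <xsl:apply-templates select="@*|node()"/>
        </xsl:copy>
      </xsl:template>

      <!-- Hypothetical element "code": only its text nodes are cleaned. -->
      <xsl:template match="code/text()">
        <!-- The inner translate() deletes the allowed characters, leaving
             only the disallowed ones; the outer translate() then deletes
             exactly those characters from the original string. -->
        <xsl:value-of select="translate(., translate(., $allowed, ''), '')"/>
      </xsl:template>

    </xsl:stylesheet>
    ```

    With this sketch, an input of <code>AB-12 34x</code> comes out as <code>1234</code>. The same inner expression can also serve as a test: string-length(translate(., $allowed, '')) is greater than 0 exactly when the text contains a character outside the list.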

    The other answer shows a way of doing it using replace() and a regular expression.

    If you have control over whatever generates the XML, I would look there for a solution, BTW.

  • 2021-01-25 23:21

    You can use replace() as hinted above. Using your regular expression for valid characters, you could try this:

    replace($string,"[^0-9a-zA-Z/\-\?:\(\)\.,'\+ \r\n]+","")
    

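    You can see that your regular expression is almost as it was, except that ^ has been added at the start of the character class to turn the set of valid characters into its complement, so every run of characters outside the set is replaced with the empty string.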
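
    As a rough sketch, this is one way the replace() call might be wired into an XSLT 2.0 stylesheet. The element name "code" and the variable name "invalid" are assumptions for illustration only. The pattern is stored as the text content of a variable so that the apostrophe inside it does not have to be escaped within a select attribute.

    ```xml
    <xsl:stylesheet version="2.0"
        xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
        xmlns:xs="http://www.w3.org/2001/XMLSchema"
        exclude-result-prefixes="xs">

      <!-- Complement of the allowed set, taken from the expression above. -->
      <xsl:variable name="invalid" as="xs:string">[^0-9a-zA-Z/\-\?:\(\)\.,'\+ \r\n]+</xsl:variable>

      <!-- Identity template: copy everything else through unchanged. -->
      <xsl:template match="@*|node()">
        <xsl:copy>
          <xsl:apply-templates select="@*|node()"/>
        </xsl:copy>
      </xsl:template>

      <!-- Hypothetical element "code": strip every run of invalid characters. -->
      <xsl:template match="code/text()">
        <xsl:value-of select="replace(., $invalid, '')"/>
      </xsl:template>

    </xsl:stylesheet>
    ```

    With this sketch, <code>ABC#123@x</code> becomes <code>ABC123x</code>. Passing a non-empty third argument to replace() would substitute a placeholder character instead of deleting, if that is what you need.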
