XML Exception: Invalid Character(s)

I am working on a small project that receives XML data in string form from a long-running application. I am trying to load this string data into an XDocument.

7 Answers

情深已故

2020-12-10 04:11

XML can handle just about any character, but there are some ranges, such as most control codes, that it won't accept.

Your best bet, if you can't get them to fix their output, is to sanitize the raw data you're receiving. You need to replace the illegal characters with the character reference format you noted.

(You can't even resort to CDATA, as there is no way to escape these characters there.)
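
Before sanitizing, it can help to know exactly which characters are the problem. The following is a minimal sketch, not part of the original answer, that encodes the `Char` production from the XML 1.0 specification and reports where illegal characters occur in a string. The names `XmlCharCheck`, `IsLegal` and `FindIllegal` are hypothetical helpers introduced for illustration.

```csharp
using System.Collections.Generic;

public static class XmlCharCheck
{
    // XML 1.0 Char production:
    // #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
    public static bool IsLegal(int codePoint) =>
        codePoint == 0x9 || codePoint == 0xA || codePoint == 0xD ||
        (codePoint >= 0x20 && codePoint <= 0xD7FF) ||
        (codePoint >= 0xE000 && codePoint <= 0xFFFD) ||
        (codePoint >= 0x10000 && codePoint <= 0x10FFFF);

    // Returns the UTF-16 index of every illegal character in the text.
    public static List<int> FindIllegal(string text)
    {
        var positions = new List<int>();
        for (int i = 0; i < text.Length; i++)
        {
            int codePoint = text[i];
            // Combine a well-formed surrogate pair into a single code point.
            bool isPair = char.IsHighSurrogate(text[i])
                          && i + 1 < text.Length
                          && char.IsLowSurrogate(text[i + 1]);
            if (isPair) codePoint = char.ConvertToUtf32(text[i], text[i + 1]);

            // A lone surrogate falls in 0xD800-0xDFFF and is reported as illegal.
            if (!IsLegal(codePoint)) positions.Add(i);
            if (isPair) i++; // skip the low surrogate
        }
        return positions;
    }
}
```

Calling `XmlCharCheck.FindIllegal(rawXml)` on the incoming string gives a list of offsets you can log or hand back to whoever produces the data.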

悲&欢浪女

2020-12-10 04:11
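You can use an XmlReader and set the XmlReaderSettings.CheckCharacters property to false. This lets you read the XML despite invalid character references such as `&#x1F;` (according to the .NET documentation, the setting turns off checking for character entity references; names and raw text content are still validated). From there you can pass the reader to an XmlDocument or XDocument object.

You can read a little more about it in my blog.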

To load the data to a System.Xml.Linq.XDocument it will look a little something like this:
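```
using System.Xml;
using System.Xml.Linq;

XDocument xDocument = null;
// Turn off character checking so invalid character references don't abort the load.
XmlReaderSettings xmlReaderSettings = new XmlReaderSettings { CheckCharacters = false };
// filename is the path or URI of the XML to read.
using (XmlReader xmlReader = XmlReader.Create(filename, xmlReaderSettings))
{
    xmlReader.MoveToContent(); // skip the prolog and position on the root element
    xDocument = XDocument.Load(xmlReader);
}
```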
More information can be found here.
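
Since the question mentions that the XML arrives as a string rather than a file, the same idea can be sketched with a StringReader. This is an illustrative variant, not the answer author's code; the class name `LenientXml` and its `Parse` method are hypothetical.

```csharp
using System.IO;
using System.Xml;
using System.Xml.Linq;

public static class LenientXml
{
    // Loads XML held in a string, with character checking turned off so that
    // references such as &#x7; are accepted instead of throwing.
    public static XDocument Parse(string xml)
    {
        var settings = new XmlReaderSettings { CheckCharacters = false };
        using (var stringReader = new StringReader(xml))
        using (var xmlReader = XmlReader.Create(stringReader, settings))
        {
            return XDocument.Load(xmlReader);
        }
    }
}
```

For example, `LenientXml.Parse("<msg>bell: &#x7;</msg>")` should load where `XDocument.Parse` would throw. Keep in mind that the resulting document still contains the control character, so writing it back out may need an XmlWriter whose settings also have CheckCharacters set to false.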
滥情空心

2020-12-10 04:16

If you really can't fix the source XML data, consider taking an approach like the one I described in this answer. Basically, you create a TextReader subclass (e.g., StripTextReader) that wraps an existing TextReader (tr) and discards invalid characters.
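
The linked answer is not reproduced here, so the following is only a rough sketch of what such a wrapper might look like, not the author's actual code. It assumes the wrapped reader supports `Peek` (a StringReader does), and it lets surrogates through without checking that they are paired.

```csharp
using System.IO;
using System.Xml.Linq;

public sealed class StripTextReader : TextReader
{
    private readonly TextReader inner;

    public StripTextReader(TextReader inner) { this.inner = inner; }

    // Tab, LF, CR and everything from 0x20 up except U+FFFE/U+FFFF passes.
    private static bool IsAllowed(char c) =>
        c == '\t' || c == '\n' || c == '\r' ||
        (c >= 0x20 && c != '\uFFFE' && c != '\uFFFF');

    // Consumes and discards characters until the next one is allowed (or EOF).
    private void SkipInvalid()
    {
        int next;
        while ((next = inner.Peek()) != -1 && !IsAllowed((char)next))
        {
            inner.Read();
        }
    }

    public override int Peek()
    {
        SkipInvalid();
        return inner.Peek();
    }

    public override int Read()
    {
        SkipInvalid();
        return inner.Read();
    }

    protected override void Dispose(bool disposing)
    {
        if (disposing) inner.Dispose();
        base.Dispose(disposing);
    }
}
```

Usage would be along the lines of `XDocument.Load(new StripTextReader(new StringReader(rawXml)))`. The buffered `Read(char[], int, int)` overload is left to the TextReader base class, which falls back to calling `Read()` one character at a time; that is simple but not fast, so a production version would probably override it.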

走了就别回头了

2020-12-10 04:18

If your input is not XML, you should use something like Tidy or Tagsoup to clean the mess up.

They would take any input and try, hopefully, to make a useful DOM from it.

I don't know what the equivalent libraries are called on the "dark side" (.NET).

醉梦人生

2020-12-10 04:21

IMHO the best solution would be to modify the code/program/whatever that produced the invalid XML being fed to your program. Unfortunately this is not always possible. In that case you need to escape all characters < 0x20 (other than tab 0x09, line feed 0x0A and carriage return 0x0D, which are legal) before trying to load the document.
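
As a rough illustration of that escaping step, here is a sketch that rewrites control characters below 0x20 as numeric character references. `ControlCharEscaper` is a hypothetical name, and dropping NUL (0x00) instead of escaping it is my own assumption rather than something the answer specifies.

```csharp
using System.Text.RegularExpressions;

public static class ControlCharEscaper
{
    // Control characters below 0x20, excluding the legal tab (0x09),
    // line feed (0x0A) and carriage return (0x0D).
    private static readonly Regex Illegal =
        new Regex(@"[\x00-\x08\x0B\x0C\x0E-\x1F]");

    public static string Escape(string raw)
    {
        return Illegal.Replace(raw, m =>
        {
            char c = m.Value[0];
            // Assumption: NUL is removed outright rather than escaped.
            if (c == '\0') return string.Empty;
            // Everything else becomes a hexadecimal reference, e.g. &#x1F;
            return "&#x" + ((int)c).ToString("X") + ";";
        });
    }
}
```

Note that a reference such as `&#x1F;` is still not legal in XML 1.0, so a strict parser will reject the escaped text as well. In .NET the escaped string would have to be read through an XmlReader created with CheckCharacters set to false.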

别那么骄傲

2020-12-10 04:27

Would something as described in this blog post be helpful?

Basically, he creates a sanitizing XML stream.
