XElement & UTF-8 Issue

Submitted by 百般思念 on 2020-01-04 17:49:47

Question


I have a .NET Web Service (.asmx, not .svc) that accepts a string via HTTP POST. The strings it accepts are XML infosets, which I then parse via XElement.Parse. Once a string is parsed into an XElement instance, I add a node to one of the elements within the instance.

The problem I'm having is that if a string representing an XML infoset comes through with then for some reason, adding a node to the XElement throws an exception such as "' ', hexadecimal value 0x06, is an invalid character. Line 1, position 40.". I get a wide array of 0x(*) errors thrown. If I don't attempt to add nodes to the XElement, everything's fine. Here's how I'm adding the element:

var prospect = doc.Element("prospect");
var provider = prospect.Element("provider");

provider.Add(new XElement("id",
    new XAttribute("reservation-code",
    reservationCode)
));

Is there some sort of string conversion I ought to be doing somewhere?


Answer 1:


XML does not allow some Unicode characters. See the XML 1.0 Specification. Unfortunately, there is no standard way to escape those characters in XML either. For example, you cannot escape such a character in valid XML using a character reference such as &#x6; because of the well-formedness constraint Legal Character (see character references).
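
To see which strings fall foul of this rule, here is a minimal sketch that relies on XmlConvert.VerifyXmlChars (available since .NET Framework 4.0), which throws an XmlException when the text contains a character outside the XML 1.0 Char production. The class name XmlCharCheck and the method name IsLegalXmlText are made up for this illustration and are not part of the original answer.

```csharp
using System.Xml;

// Hypothetical helper, for illustration only.
public static class XmlCharCheck
{
    // Returns true when every character in the text is allowed by XML 1.0.
    public static bool IsLegalXmlText(string text)
    {
        try
        {
            // VerifyXmlChars returns its argument when the text is legal
            // and throws XmlException at the first illegal character.
            XmlConvert.VerifyXmlChars(text);
            return true;
        }
        catch (XmlException)
        {
            return false;
        }
    }
}
```

With this helper, a string such as "a\u0006b" is reported as illegal, while "a\tb" passes because tab (0x09) is one of the few control characters XML 1.0 permits.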

XElement.ToString() has the check for those characters turned on. However, .NET does provide a way to turn character checking off. It is off by default in System.Xml.XmlTextWriter instances that are constructed directly. Therefore the following code will work:

    /// <summary>
    /// Returns the XML string of the <paramref name="xElement"/> WITHOUT CHARACTER CHECKING.
    /// </summary>
    /// <param name="xElement">The element to serialize.</param>
    /// <returns>The serialized XML; it may contain characters that XML 1.0 does not allow.</returns>
    public static string ToStringWithoutCharacterChecking(this XElement xElement)
    {
        using (System.IO.StringWriter stringWriter = new System.IO.StringWriter())
        {
            // An XmlTextWriter constructed directly does not reject invalid characters.
            using (System.Xml.XmlTextWriter xmlTextWriter = new System.Xml.XmlTextWriter(stringWriter))
            {
                xElement.WriteTo(xmlTextWriter);
            }
            return stringWriter.ToString();
        }
    }

Notice, however, that if you create a System.Xml.XmlWriter instance using System.Xml.XmlWriterSettings, character checking defaults to true. Therefore, if you use System.Xml.XmlWriterSettings and want to turn off character checking, use:

XmlWriterSettings s = new XmlWriterSettings();
s.CheckCharacters = false;
using(XmlWriter w = XmlWriter.Create(..., s))
{
    //etc.
}
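
For a complete picture of the settings-based approach, here is a self-contained sketch that writes to an in-memory StringWriter. The class name UncheckedXmlSerializer, the method name Serialize, and the choice of OmitXmlDeclaration are assumptions made for this illustration; only CheckCharacters = false comes from the answer above.

```csharp
using System.IO;
using System.Xml;
using System.Xml.Linq;

// Hypothetical helper, for illustration only.
public static class UncheckedXmlSerializer
{
    // Serializes the element with character checking switched off.
    public static string Serialize(XElement element)
    {
        var settings = new XmlWriterSettings
        {
            CheckCharacters = false,   // do not throw on characters XML 1.0 forbids
            OmitXmlDeclaration = true  // assumption: return only the element markup
        };

        var buffer = new StringWriter();
        using (XmlWriter writer = XmlWriter.Create(buffer, settings))
        {
            element.WriteTo(writer);
        }
        // Disposing the writer flushes everything into the buffer.
        return buffer.ToString();
    }
}
```

Calling UncheckedXmlSerializer.Serialize(new XElement("id", "a\u0006b")) returns a string instead of throwing. Keep in mind that the result is not well-formed XML 1.0, so a reader with default settings, including XElement.Parse, will reject it again.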



Answer 2:


Thanks a lot, this solved my problem when using LINQ to XSD. Here is my code. Instead of calling container.Save(new StreamWriter(toStream, new UTF8Encoding(false))); I use the following:

using (XmlWriter w = XmlWriter.Create(
    new StreamWriter(toStream, new UTF8Encoding(false)),
    new XmlWriterSettings
    {
        // http://stackoverflow.com/questions/5709831/xelement-utf-8-issue
        // http://stackoverflow.com/questions/10057171/xdocument-prevent-invalid-charachters
        Indent = true,
        CheckCharacters = false
    }))
{
    XTypedServices.Save(w, container.Untyped);
}

toStream.Flush();


Source: https://stackoverflow.com/questions/5709831/xelement-utf-8-issue
