Question
There is a public web service that I want to use in a small C# application: http://ws.parlament.ch/
The XML returned by this web service starts with a byte order mark (BOM), which causes RestSharp to fail when deserializing the XML, with the following error message:
Error retrieving response. Check inner details for more info. ---> System.Xml.XmlException: Data at the root level is invalid. Line 1, position 1.
   at System.Xml.XmlTextReaderImpl.Throw(Exception e)
   at System.Xml.XmlTextReaderImpl.Throw(String res, String arg)
   at System.Xml.XmlTextReaderImpl.ParseRootLevelWhitespace()
   at System.Xml.XmlTextReaderImpl.ParseDocumentContent()
   at System.Xml.XmlTextReaderImpl.Read()
   at System.Xml.Linq.XDocument.Load(XmlReader reader, LoadOptions options)
   at System.Xml.Linq.XDocument.Parse(String text, LoadOptions options)
   at System.Xml.Linq.XDocument.Parse(String text)
   at RestSharp.Deserializers.XmlDeserializer.Deserialize[T](IRestResponse response)
   at RestSharp.RestClient.Deserialize[T](IRestRequest request, IRestResponse raw)
   --- End of inner exception stack trace ---
Here is a simple example that uses http://ws.parlament.ch/sessions?format=xml to get a list of sessions:
using System;
using System.Collections.Generic;
using RestSharp;

public class Session
{
    public int Id { get; set; }
    public DateTime? Updated { get; set; }
    public int? Code { get; set; }
    public DateTime? From { get; set; }
    public string Name { get; set; }
    public DateTime? To { get; set; }
}

class Program
{
    static void Main(string[] args)
    {
        // Build a request for /sessions?format=xml
        var request = new RestRequest();
        request.RequestFormat = DataFormat.Xml;
        request.Resource = "sessions";
        request.AddParameter("format", "xml");

        var client = new RestClient("http://ws.parlament.ch/");
        var response = client.Execute<List<Session>>(request);

        // Deserialization errors are reported through ErrorException
        if (response.ErrorException != null)
        {
            const string message = "Error retrieving response. Check inner details for more info.";
            var ex = new ApplicationException(message, response.ErrorException);
            Console.WriteLine(ex);
        }

        List<Session> test = response.Data;
        Console.Read();
    }
}
When I first modify the returned XML with Fiddler to remove the first 3 bytes (the BOM), the above code works. Can someone please help me handle this directly in RestSharp? What am I doing wrong? Thank you in advance!
Answer 1:
I found the solution. Thank you @arootbeer for the hints!
Instead of wrapping the XmlDeserializer, you can also use the RestRequest.OnBeforeDeserialization callback from RestSharp. You just need to insert something like this after the new RestRequest() (see my initial code example), and then it works:
request.OnBeforeDeserialization = resp =>
{
    // Remove a leading byte order mark.
    // see: http://stackoverflow.com/questions/19663100/restsharp-has-problems-deserializing-xml-including-byte-order-mark
    string byteOrderMarkUtf8 = Encoding.UTF8.GetString(Encoding.UTF8.GetPreamble());
    // Compare ordinally: a culture-sensitive StartsWith can treat U+FEFF as ignorable and match any string.
    if (resp.Content.StartsWith(byteOrderMarkUtf8, StringComparison.Ordinal))
        resp.Content = resp.Content.Remove(0, byteOrderMarkUtf8.Length);
};
Answer 2:
I had this same problem, but not specifically with RestSharp. Use this:
var responseXml = new UTF8Encoding(false).GetString(bytes);
Original discussion: XmlReader breaks on UTF-8 BOM
Pertinent quote from the answer:
The xml string must not (!) contain the BOM, the BOM is only allowed in byte data (e.g. streams) which is encoded with UTF-8. This is because the string representation is not encoded, but already a sequence of unicode characters.
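One caveat worth checking before relying on the one-liner above: as far as I know, the false argument to UTF8Encoding only controls whether the encoding emits a BOM when writing, and GetString still decodes the bytes EF BB BF into the character U+FEFF. A minimal sketch of the difference, assuming a made-up payload and a hypothetical helper named DecodeWithoutBom (neither is from the original answer), which decodes through a StreamReader so that the BOM is skipped:

```csharp
using System;
using System.IO;
using System.Text;

public static class BomDemo
{
    // Hypothetical helper: decodes UTF-8 bytes through a StreamReader,
    // which detects and skips a leading byte order mark.
    public static string DecodeWithoutBom(byte[] bytes)
    {
        using (var stream = new MemoryStream(bytes))
        using (var reader = new StreamReader(stream, Encoding.UTF8, true))
        {
            return reader.ReadToEnd();
        }
    }

    public static void Main()
    {
        // Made-up payload: UTF-8 BOM (EF BB BF) followed by "<a/>".
        byte[] bytes = { 0xEF, 0xBB, 0xBF, 0x3C, 0x61, 0x2F, 0x3E };

        // GetString keeps the BOM as the first character of the string.
        string raw = new UTF8Encoding(false).GetString(bytes);
        Console.WriteLine(raw[0] == '\uFEFF'); // True
        Console.WriteLine(raw.Length);         // 5

        // Decoding through StreamReader yields the XML text only.
        string clean = DecodeWithoutBom(bytes);
        Console.WriteLine(clean == "<a/>");    // True
    }
}
```

The resulting string can then be passed to XDocument.Parse or a StringReader, in line with the quote above: the BOM belongs to the byte data, not to the decoded string.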
Edit:
Looking through their docs, it looks like the most straightforward way to handle this (aside from filing a GitHub issue) is to call the non-generic Execute() method and deserialize the response from that string. You could also create an IDeserializer that wraps the default XML deserializer.
Answer 3:
The solution that @dataCore posted doesn't quite work, but this one should.
request.OnBeforeDeserialization = resp =>
{
    if (resp.RawBytes.Length >= 3 && resp.RawBytes[0] == 0xEF && resp.RawBytes[1] == 0xBB && resp.RawBytes[2] == 0xBF)
    {
        // Copy the data but with the UTF-8 BOM removed.
        var newData = new byte[resp.RawBytes.Length - 3];
        Buffer.BlockCopy(resp.RawBytes, 3, newData, 0, newData.Length);
        resp.RawBytes = newData;
        // Force re-conversion to string on next access
        resp.Content = null;
    }
};
Setting resp.Content to null is there as a safety guard, as RawBytes is only converted to a string if Content isn't already set to a value.
Answer 4:
To make it work with RestSharp you can parse response content manually and remove all the "funny" characters coming before the '<'.
// Drop any 'funny' characters (such as a BOM) that come before the first '<'.
// IndexOf returns -1 when there is no '<'; in that case the content is left unchanged.
int start = responseContent.IndexOf('<');
if (start > 0)
{
    responseContent = responseContent.Substring(start);
}
var reader = XmlReader.Create(new StringReader(responseContent));
Source: https://stackoverflow.com/questions/19663100/restsharp-has-problems-deserializing-xml-including-byte-order-mark