问题
I have following code (C#), it takes too long and it throws exception:
new XmlDocument().
LoadXml("<?xml version='1.0' ?><!DOCTYPE note SYSTEM 'http://someserver/dtd'><note></note>");
I understand why it does that. My question is how do I make it stop? I don't care about DTD validation. I suppose I could just regex-replace it, but I am looking for more elegant solution.
Background:
The actual XML is received from a web site I do not own. When site is undergoing maintenance it returns XML with DOCTYPE that points to the DTD that's not available during maintenance. So my service gets unnecessary slow because it tries to get DTD for each XML I need to parse.
Here is exception stack:
Unhandled Exception: System.Net.WebException: The remote name could not be resolved: 'someserver'
at System.Net.HttpWebRequest.GetResponse()
at System.Xml.XmlDownloadManager.GetNonFileStream(Uri uri, ICredentials credentials)
at System.Xml.XmlDownloadManager.GetStream(Uri uri, ICredentials credentials)
at System.Xml.XmlUrlResolver.GetEntity(Uri absoluteUri, String role, Type ofObjectToReturn)
at System.Xml.XmlTextReaderImpl.OpenStream(Uri uri)
at System.Xml.XmlTextReaderImpl.DtdParserProxy_PushExternalSubset(String systemId, String publicId)
at System.Xml.XmlTextReaderImpl.DtdParserProxy.System.Xml.IDtdParserAdapter.PushExternalSubset(String systemId, String publicId)
at System.Xml.DtdParser.ParseExternalSubset()
at System.Xml.DtdParser.ParseInDocumentDtd(Boolean saveInternalSubset)
at System.Xml.DtdParser.Parse(Boolean saveInternalSubset)
at System.Xml.XmlTextReaderImpl.DtdParserProxy.Parse(Boolean saveInternalSubset)
at System.Xml.XmlTextReaderImpl.ParseDoctypeDecl()
at System.Xml.XmlTextReaderImpl.ParseDocumentContent()
at System.Xml.XmlTextReaderImpl.Read()
at System.Xml.XmlLoader.LoadDocSequence(XmlDocument parentDoc)
at System.Xml.XmlLoader.Load(XmlDocument doc, XmlReader reader, Boolean preserveWhitespace)
at System.Xml.XmlDocument.Load(XmlReader reader)
at System.Xml.XmlDocument.LoadXml(String xml)
at ConsoleApplication36.Program.Main(String[] args) in c:\Projects\temp\ConsoleApplication36\Program.cs:line 11
回答1:
Well, in .NET 4.0 XmlTextReader has a property called DtdProcessing. When set to DtdProcessing.Ignore it should disable DTD processing.
回答2:
In .net 4.5.1 I had no luck setting doc.XmlResolver to null.
The easiest fix for me was to use a string replacement to change "xmlns=" to "ignore=" before calling LoadXml(), e.g.
var responseText = await response.Content.ReadAsStringAsync();
responseText = responseText.Replace("xmlns=", "ignore=");
try
{
var doc = new XmlDocument();
doc.LoadXml(responseText);
...
}
来源:https://stackoverflow.com/questions/4445348/net-prevent-xmldocument-loadxml-from-retrieving-dtd