Is Scala/Java not respecting w3 “excess dtd traffic” specs?

青春惊慌失措 2021-02-01 07:40

I'm new to Scala, so I may be off base on this, but I want to know whether the problem is in my code. Given the Scala file httpparse, simplified to:

object Http {
   import         


        
9 answers
  • 2021-02-01 08:40

    It works now. After some detective work, here are the details as best I can figure them:

    Trying to parse a developmental RESTful interface, I build the parser and get the above (rather, a similar) error. I try various parameters to change the XML output, but get the same error. I try to connect to an XML document I quickly whip up (cribbed stupidly from the interface itself) and get the same error. Then I try to connect to anything, just for kicks, and get the same (again, likely only similar) error.

    I started questioning whether it was an error with the sources or the program, so I started searching around, and it looks like an ongoing issue, with many Google and SO hits on the same topic. This, unfortunately, made me focus on the upstream (language) aspects of the error rather than troubleshoot further downstream at the sources themselves.

    Fast forward, and the parser suddenly works on the original XML output. I confirmed that some additional work had been done server side (just a crazy coincidence?). I don't have either of the earlier XML documents, but I suspect the change is related to the document identifiers being changed.

    Now the parser works fine on the RESTful interface, as well as on any well-formed XML I can throw at it. It still fails on every document I've tried that references the XHTML DTDs (e.g. www.w3.org). This is contrary to what @SeanReilly expects, but seems to jibe with what the W3 states.

    I'm still new to Scala, so I can't determine whether mine is a special or a typical case. Nor can I be sure that this problem won't recur for me in another form down the line. It does seem that pulling XHTML will continue to cause this error unless one uses a solution similar to those suggested by @GClaramunt and @J-16 SDiZ. I'm not really qualified to know whether this is a problem with the language or with my implementation of a solution (likely the latter).

    For the immediate timeframe, I suspect the best approach would have been to first make sure that the XML source could be parsed at all, rather than see that others have had the same error and assume there was a functional problem with the language.

    Hope this helps others.

  • 2021-02-01 08:43

    Setting Xerces switches only works if you are using Xerces. An entity resolver works for any JAXP parser.

    There are more generalized entity resolvers out there, but this implementation does the trick when all I'm trying to do is parse valid XHTML.

    http://code.google.com/p/java-xhtml-cache-dtds-entityresolver/

    Shows how trivial it is to cache the DTDs and forgo the network traffic.

    In any case, this is how I fix it. I always forget. I always get the error. I always go fetch this entity resolver. Then I'm back in business.
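A minimal sketch of the entity-resolver approach described in this answer, assuming only the JDK's built-in JAXP parser. The names `OfflineDtdResolver` and `ResolverDemo` are hypothetical, and the one-entity "DTD" is a stand-in for a real cached copy of the XHTML DTD files. This is not the code from the linked project.

```scala
import java.io.StringReader
import javax.xml.parsers.DocumentBuilderFactory
import org.xml.sax.{EntityResolver, InputSource}

// Hypothetical resolver (not the linked project's code): answers every
// external-entity request from memory so the parser never contacts www.w3.org.
object OfflineDtdResolver extends EntityResolver {
  // Assumption: a one-entity stand-in for the real XHTML 1.0 Strict DTD.
  private val cachedDtds: Map[String, String] = Map(
    "-//W3C//DTD XHTML 1.0 Strict//EN" -> "<!ENTITY nbsp \"&#160;\">"
  )

  override def resolveEntity(publicId: String, systemId: String): InputSource = {
    // publicId may be null (SYSTEM-only DOCTYPE); unknown ids get an empty DTD.
    val dtdText = Option(publicId).flatMap(cachedDtds.get).getOrElse("")
    val source = new InputSource(new StringReader(dtdText))
    source.setPublicId(publicId)
    source.setSystemId(systemId)
    source
  }
}

object ResolverDemo {
  // Sample XHTML document whose DOCTYPE points at the W3C-hosted DTD.
  val xhtml: String =
    """<?xml version="1.0"?>
      |<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
      |  "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
      |<html xmlns="http://www.w3.org/1999/xhtml"><body><p>a&nbsp;b</p></body></html>
      |""".stripMargin

  // Parses the sample document and returns the text of its first <p> element.
  def paragraphText(): String = {
    val builder = DocumentBuilderFactory.newInstance().newDocumentBuilder()
    builder.setEntityResolver(OfflineDtdResolver)
    val doc = builder.parse(new InputSource(new StringReader(xhtml)))
    doc.getElementsByTagName("p").item(0).getTextContent
  }

  def main(args: Array[String]): Unit =
    println(paragraphText().length)
}
```

Because the resolver hands the parser an in-memory source, the `&nbsp;` entity should be expanded from the stand-in DTD without any request to www.w3.org. Returning an empty DTD for unknown identifiers keeps parsing offline, but any named entities those DTDs would have defined will then be reported as undeclared.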

  • 2021-02-01 08:44

    For Scala 2.7.7, I managed to do this with scala.xml.parsing.XhtmlParser.
