Question
In my C# project, I have been given the task of parsing an SGML file. I tried, very naively, to use XmlReader, which led to some interesting revelations (e.g., the difference between SGML and well-formed XML).
So I am thinking that I just need a good SGML parser that converts the file to XML, and I can go from there. In my search, I have found two SGML parsers that can integrate with my C# project:
- MSDN's SgmlReader, and
- James Clark's SP SGML parser.
Any other recommendations?
Answer 1:
Apparently an updated version of SgmlReader is available here:
https://github.com/MindTouch/SGMLReader
Answer 2:
HTML is an application of SGML. If you want to parse HTML properly, you will need an SGML parser. SgmlReader appears to fit those needs well, and I plan to use it myself. As an alternative, I would suggest HTML Tidy. It is a native application, but .NET bindings for it do exist. If you need entirely managed code, then SgmlReader is the way to go.
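For illustration, here is a minimal sketch of the managed-code route, assuming the SgmlReader library from the repository linked in Answer 1 is referenced by the project. The input string and the class name are made up for the example; the input is HTML-flavoured SGML with unclosed tags and an unquoted attribute, which XmlReader alone would reject.

```csharp
using System;
using System.IO;
using System.Xml;
using Sgml; // assumption: the SgmlReader assembly is referenced

class SgmlToXmlSketch // hypothetical class name
{
    static void Main()
    {
        // Hypothetical input: <p> and <br> are never closed and the
        // attribute value is unquoted, so this is not well-formed XML.
        string input = "<html><body><p class=intro>Hello<br>world</body></html>";

        var sgml = new SgmlReader();
        sgml.DocType = "HTML";                            // use the built-in HTML DTD
        sgml.CaseFolding = CaseFolding.ToLower;           // normalise tag names to lower case
        sgml.WhitespaceHandling = WhitespaceHandling.All; // keep whitespace nodes
        sgml.InputStream = new StringReader(input);

        // SgmlReader derives from XmlReader, so XmlDocument can load from it.
        var doc = new XmlDocument();
        doc.Load(sgml);

        // The result is well-formed XML that the usual XML APIs can process.
        Console.WriteLine(doc.OuterXml);
    }
}
```

This covers only the HTML case. For another SGML vocabulary you would most likely have to supply that vocabulary's own DTD to the reader instead of relying on the built-in HTML one.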
Source: https://stackoverflow.com/questions/1148083/sgml-parser-net-recommendations