What libraries will parse a DTD using PHP

Posted by 房东的猫 on 2019-12-04 15:55:24

None of the standard XML parsers for PHP give access to general entities*, and few give access to comments. PHP's built-in XML Parser uses Expat, but does not expose the full Expat API; in particular, a handler for general entity declarations cannot be set. There is a PHP bug filed to add this.

AFAICT, the only way to handle comments and general entities in a DTD parser is to write your own parser, either by hand or using one of the lexers and parser generators available for PHP (e.g. PHP_LexerGenerator and PHP_ParserGenerator, among others).

* PHP's Expat wrapper (XML Parser) does give access to notation declarations, which are similar to, but not the same as, general entities.
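To make the footnote concrete, here is a minimal sketch of collecting notation declarations through the XML Parser extension. The sample document and the `$notations` array are made up for illustration, and I'm assuming the handler arguments arrive in the order given in the PHP manual (parser, notation name, base, system ID, public ID). Note that it only reports `<!NOTATION ...>` declarations; there is no equivalent hook for `<!ENTITY name "value">`.

```php
<?php
// Hypothetical sample document with an internal DTD subset (illustration only).
$xml = <<<'XML'
<?xml version="1.0"?>
<!DOCTYPE doc [
  <!NOTATION gif SYSTEM "image/gif">
  <!ELEMENT doc EMPTY>
]>
<doc/>
XML;

$notations = [];

$parser = xml_parser_create();

// Called once per <!NOTATION ...> declaration seen in the DTD.
xml_set_notation_decl_handler(
    $parser,
    function ($parser, $name, $base, $systemId, $publicId) use (&$notations) {
        $notations[$name] = ['system' => $systemId, 'public' => $publicId];
    }
);

// Third argument true: this is the final (and only) chunk of data.
$ok = xml_parse($parser, $xml, true);

if (!$ok) {
    echo xml_error_string(xml_get_error_code($parser)), "\n";
}

print_r($notations);
```

General entity declarations in the same DTD would pass through silently, which is the gap described above.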

I don't know how useful this will be...

If I understand what you're looking for, you're looking for a means to extract the comment and entity "nodes" from a DTD in order to act on them. Very interesting. Here's where my brain went:

  • Use DOMDocument class directly. Looks as if there's no distinct way of getting at the DTD data if you treat the DTD as the source.
  • Use the SimpleXML in the same way. Ditto.
  • Use the XML parser in, again, the same way, but use some of the entity declaration handler functions to get information out. I think this requires more foresight and is probably not what you need. (Although I could be wrong.)
  • Use preg_match_all, or the like, to grab your values based on the patterns. Not too dissimilar from what others have suggested.
  • Use XSLT to nix everything but what you need. The .xsl to remove all non-comments would be pretty easy to manage. It's quite possible you could just output them in a format that's easier to parse (say, in a better XML structure). Entities may require handling via PHP's XSL processor. I'm a little rusty on entities.
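For the preg_match_all option, a rough sketch might look like the following. The DTD text and variable names are invented for illustration, and the patterns are deliberately simple: they only pick up comments and internal general entities with a quoted value. Parameter entities (`<!ENTITY % ...>`) and external entities (`SYSTEM`/`PUBLIC`) are skipped, and an entity declaration that happens to sit inside a comment would still be matched, so treat this as a starting point rather than a real DTD parser.

```php
<?php
// Hypothetical sample DTD text (illustration only).
$dtd = <<<'DTD'
<!-- Root element of the document -->
<!ELEMENT note (to, body)>
<!ENTITY company "Example Corp">
<!-- A second comment -->
<!ENTITY copy "&#169;">
DTD;

// Comments: everything between <!-- and -->, non-greedy, across newlines.
preg_match_all('/<!--(.*?)-->/s', $dtd, $commentMatches);
$comments = array_map('trim', $commentMatches[1]);

// Internal general entities: <!ENTITY name "value"> or <!ENTITY name 'value'>.
// Group 1 = name, group 2 = opening quote, group 3 = value up to the same quote.
preg_match_all(
    '/<!ENTITY\s+([^\s%]+)\s+(["\'])(.*?)\2\s*>/s',
    $dtd,
    $entityMatches,
    PREG_SET_ORDER
);

$entities = [];
foreach ($entityMatches as $m) {
    $entities[$m[1]] = $m[3];
}

print_r($comments);
print_r($entities);
```

Entity values come back unexpanded (character references such as `&#169;` are left as written), so any resolution is left to the caller.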

Regardless, I hope some of this helps.
