Error parsing a DTD using lxml

Posted by て烟熏妆下的殇ゞ on 2019-12-03 21:11:21

I took a look at nitf-3-4.dtd and found that it references an external module, xhtml-ruby-1.mod, which can be downloaded from the same location as the DTD (see the commands below). The module must be present in the current directory so that the DTD parser can load it.
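To find out up front which external modules a DTD expects, something like the following could work. It is only a rough sketch: the helper names `external_modules` and `missing_modules` are made up for illustration, and the regular expression is a simplification that only looks for external parameter entity declarations and does not skip declarations inside comments.

```python
import re
from pathlib import Path

# Matches external parameter entities such as
#   <!ENTITY % name SYSTEM "file.mod">
#   <!ENTITY % name PUBLIC "public-id" "file.mod">
# (simplified; declarations inside DTD comments are matched too)
EXTERNAL_ENTITY = re.compile(
    r'<!ENTITY\s+%\s+\S+\s+'
    r'(?:SYSTEM|PUBLIC\s+(?:"[^"]*"|\'[^\']*\'))\s+'
    r'(?:"([^"]*)"|\'([^\']*)\')'
)


def external_modules(dtd_text):
    """Return the system identifiers of external parameter entities."""
    found = []
    for match in EXTERNAL_ENTITY.finditer(dtd_text):
        double_quoted, single_quoted = match.groups()
        found.append(double_quoted if double_quoted is not None else single_quoted)
    return found


def missing_modules(dtd_path):
    """Return referenced local modules that do not exist next to the DTD."""
    dtd_path = Path(dtd_path)
    text = dtd_path.read_text(encoding='utf-8', errors='replace')
    return [
        name for name in external_modules(text)
        # Skip absolute URLs; only check files expected beside the DTD.
        if '://' not in name and not (dtd_path.parent / name).exists()
    ]
```

Calling `missing_modules('nitf-3-4.dtd')` before the module has been downloaded should then list xhtml-ruby-1.mod, assuming the DTD declares it as an external parameter entity.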

Full working example (assuming you have a valid NITF document handy):

% wget http://www.iptc.org/std/NITF/3.4/specification/dtd/nitf-3-4.dtd
% wget http://www.iptc.org/std/NITF/3.4/specification/dtd/xhtml-ruby-1.mod

Python code (saved as nitf_test.py):

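from lxml import etree, objectify

# Load the DTD (xhtml-ruby-1.mod must be in the current directory).
with open('nitf-3-4.dtd', 'rb') as dtd_file:
    dtd = etree.DTD(dtd_file)
# Parse the NITF document and validate it against the DTD.
tree = objectify.parse('nitf_test.xml')
print(dtd.validate(tree))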

Output:

% python nitf_test.py
True
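If validation prints False instead, the DTD object's error log usually says why. Here is a minimal sketch of how the messages could be collected; the tiny inline DTD and the helper name `validation_errors` are stand-ins for illustration, not part of the NITF example.

```python
import io
from lxml import etree


def validation_errors(dtd, doc):
    """Return a list of error messages; empty if doc is valid."""
    if dtd.validate(doc):
        return []
    # error_log holds the entries recorded by the last validation run.
    return [
        "line %d: %s" % (entry.line, entry.message)
        for entry in dtd.error_log.filter_from_errors()
    ]


# A one-element DTD used only for demonstration.
dtd = etree.DTD(io.StringIO("<!ELEMENT b EMPTY>"))
good = etree.XML("<b/>")
bad = etree.XML("<b><a/></b>")

for message in validation_errors(dtd, bad):
    print(message)
```

The same helper can be given the `dtd` and `tree` objects from the NITF example above.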