Xerces DOM parser incredibly slow?

暗喜 2021-01-14 17:16

I am currently trying to clean up an HTML file with JTidy, convert it to XHTML, and pass the result to a DOM parser. The following code is the result of these efforts:

2 answers
  •  慢半拍i (original poster)
    2021-01-14 17:47

    The HTML DTDs are huge and pull in further files through includes, so loading them takes forever. Use an XML catalog instead: it lets you store the DTDs locally and map them by their system ID.

    If you use a build tool such as Maven, you will find plenty of pointers on how to set this up.

    The advantage over intercepting entity resolution, as the accepted answer suggests, is that you still receive the correct characters, because named entities are still expanded from the real DTD declarations.
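
A minimal sketch of the catalog approach, assuming Java 11 or later and the JDK's built-in `javax.xml.catalog` API (available since Java 9). The class name `CatalogDemo`, the temporary files, and the one-line stand-in DTD are made up for illustration; in a real setup the catalog would point at locally stored copies of the actual XHTML DTDs and their entity files. `CatalogResolver` implements the standard `EntityResolver` interface, so it should plug into any JAXP `DocumentBuilder` via `setEntityResolver`.

```java
import java.io.StringReader;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.xml.catalog.CatalogFeatures;
import javax.xml.catalog.CatalogManager;
import javax.xml.catalog.CatalogResolver;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

final class CatalogDemo {
    // System ID as it appears in the DOCTYPE of an XHTML 1.0 Transitional document.
    static final String SYSTEM_ID =
            "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd";

    static String parseTitle() {
        try {
            Path dir = Files.createTempDirectory("catalog-demo");

            // Stand-in for a locally stored DTD (assumption: the real DTD
            // and its entity files would be saved here instead).
            Path dtd = dir.resolve("local-xhtml.dtd");
            Files.writeString(dtd, "<!ENTITY eacute \"&#233;\">\n");

            // OASIS XML catalog that maps the remote system ID to the local file.
            Path catalog = dir.resolve("catalog.xml");
            Files.writeString(catalog,
                    "<?xml version=\"1.0\"?>\n"
                  + "<catalog xmlns=\"urn:oasis:names:tc:entity:xmlns:xml:catalog\">\n"
                  + "  <system systemId=\"" + SYSTEM_ID + "\" uri=\"" + dtd.toUri() + "\"/>\n"
                  + "</catalog>\n");

            CatalogResolver resolver = CatalogManager.catalogResolver(
                    CatalogFeatures.defaults(), catalog.toUri());

            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // The parser now asks the catalog instead of fetching the DTD remotely.
            builder.setEntityResolver(resolver);

            String xhtml =
                    "<!DOCTYPE html PUBLIC \"-//W3C//DTD XHTML 1.0 Transitional//EN\" \""
                  + SYSTEM_ID + "\">"
                  + "<html><head><title>caf&eacute;</title></head><body/></html>";

            Document doc = builder.parse(new InputSource(new StringReader(xhtml)));
            // The entity is expanded from the local DTD, so the character survives.
            return doc.getElementsByTagName("title").item(0).getTextContent();
        } catch (Exception e) {
            throw new IllegalStateException("catalog demo failed", e);
        }
    }
}
```

Calling `CatalogDemo.parseTitle()` parses the document without contacting the remote host for the DTD, and the `&eacute;` reference is expanded from the local declaration rather than being dropped or replaced.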
