Proper usage of JTidy to purify HTML

自闭症网瘾萝莉.ら 提交于 2019-12-01 06:05:57

Have a look at how JTidy is configured:

StringWriter writer = new StringWriter();
tidy.getConfiguration().printConfigOptions(writer, true);
System.out.println(writer.toString());

Maybe it then get clear what causes the problem.

What is weird? Little example, of actual output and expected... maybe ?

Well, this seems to be a bug in Jtidy. For the exact file which causes problems, refer here:

http://sourceforge.net/tracker/?func=detail&aid=2985849&group_id=13153&atid=113153

Thanks for all the help folks!

Here is how we are calling JTidy from Ant. You may infer the API call from it:

<tidy destdir="${build.dir.result}">
  <fileset dir="${src}" includes="**/*.htm"/>
  <parameter name="tidy-mark" value="false"/>
  <parameter name="output-xml" value="no"/>
  <parameter name="numeric-entities" value="yes"/>
  <parameter name="indent-spaces" value="2"/>
  <parameter name="indent-attributes" value="no"/>
  <parameter name="markup" value="yes"/>
  <parameter name="wrap" value="2000"/>
  <parameter name="uppercase-tags" value="no"/>
  <parameter name="uppercase-attributes" value="no"/>
  <parameter name="quiet" value="no"/>
  <parameter name="clean" value="yes"/>
  <parameter name="show-warnings" value="yes"/>
  <parameter name="break-before-br" value="yes"/>
  <parameter name="hide-comments" value="yes"/>
  <parameter name="char-encoding" value="latin1"/>
  <parameter name="output-html" value="yes"/>
</tidy>
易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!