tidy

Is there an alternative to HTML Tidy?

故事扮演 posted on 2019-12-03 11:10:53
Question: I have embedded HTML Tidy in my application to clean incoming HTML. But Tidy has a huge number of bugs, and fixing them directly in the source is my worst nightmare. Tidy's source code is an unreadable abomination: thousand-plus-line functions, poor variable naming, spaghetti code, etc. It's truly horrible. Worse yet, official development seems to have ceased. In the last 12 months, there have been three write transactions to the official CVS repo. But it's been dead and buried for much longer than

Komodo Edit - HTML Reformatting / Tidy

最后都变了- posted on 2019-12-03 04:06:59
Is there a simple way to reformat my HTML from within Komodo Edit, or to automate the process against Tidy? Something like Ctrl+K, Ctrl+D in Visual Studio would be brilliant. I am presently running Ubuntu with Tidy installed. If you want a solution that just straight up works, do the following: Pop open the toolbox panel on the right. Click on the gear and select New Macro; name it what you like. Get the macro code here: komodo edit macro. It includes the code from http://jsbeautifier.org/ and works like a charm... Next is to set up a keystroke: Select your new macro in the toolbox. Now go to

Is there an alternative to HTML Tidy?

こ雲淡風輕ζ posted on 2019-12-03 01:37:22
I have embedded HTML Tidy in my application to clean incoming HTML. But Tidy has a huge number of bugs, and fixing them directly in the source is my worst nightmare. Tidy's source code is an unreadable abomination: thousand-plus-line functions, poor variable naming, spaghetti code, etc. It's truly horrible. Worse yet, official development seems to have ceased. In the last 12 months, there have been three write transactions to the official CVS repo. But it's been dead and buried for much longer than that... So I'm looking for an OSS C or C++ application/library that can do what Tidy can (when it

Are there JavaScript or Ruby versions of “HTML tidy”? [closed]

无人久伴 posted on 2019-12-02 03:06:16
Does there exist a library similar to HTML Tidy (http://tidy.sourceforge.net/) that is not OS-specific (that is, one that does not need to be compiled on each host)? Basically I just want to validate/clean the HTML sent to me by the user. <p>hello</p></p><br> should become <p>hello</p> <br/>. Something in JavaScript or Ruby would work for me. Thanks! In Ruby you can parse the HTML in Nokogiri, which will let you check for errors, then have it output the HTML, which will clean up missing closing tags and such. Notice in the following HTML that the title and p tags are not closed correctly, but Nokogiri adds the ending
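
For readers who just want to see the cleanup idea from this question without installing anything, here is a minimal sketch in Python using only the standard library's html.parser. It is not Tidy or Nokogiri and makes no attempt to match their output; the names TagBalancer and tidy_fragment are made up for this illustration. It assumes the input is a small fragment, drops stray closing tags, closes elements left open, writes void elements such as <br> in self-closed form, and silently discards comments and doctypes.

```python
from html import escape
from html.parser import HTMLParser

# Elements that never take a closing tag in HTML.
VOID_ELEMENTS = {"area", "base", "br", "col", "embed", "hr", "img",
                 "input", "link", "meta", "source", "track", "wbr"}
# Elements whose text content must not be entity-escaped.
RAW_TEXT_ELEMENTS = {"script", "style"}


class TagBalancer(HTMLParser):
    """Hypothetical helper: re-emits a fragment with balanced tags."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []        # pieces of the cleaned output
        self.open_tags = []  # stack of elements still waiting for a close

    @staticmethod
    def _render_attrs(attrs):
        # The parser hands attribute values over unescaped, so escape them again.
        return "".join(
            f" {name}" if value is None else f' {name}="{escape(value, quote=True)}"'
            for name, value in attrs
        )

    def handle_starttag(self, tag, attrs):
        rendered = self._render_attrs(attrs)
        if tag in VOID_ELEMENTS:
            self.out.append(f"<{tag}{rendered}/>")
        else:
            self.out.append(f"<{tag}{rendered}>")
            self.open_tags.append(tag)

    def handle_endtag(self, tag):
        if tag not in self.open_tags:
            return  # stray closing tag with no matching open: drop it
        # Close any inner elements that were left open, then the tag itself.
        while self.open_tags:
            current = self.open_tags.pop()
            self.out.append(f"</{current}>")
            if current == tag:
                break

    def handle_data(self, data):
        in_raw_text = bool(self.open_tags) and self.open_tags[-1] in RAW_TEXT_ELEMENTS
        self.out.append(data if in_raw_text else escape(data, quote=False))


def tidy_fragment(html):
    parser = TagBalancer()
    parser.feed(html)
    parser.close()
    # Close whatever is still open at the end of the input.
    while parser.open_tags:
        parser.out.append(f"</{parser.open_tags.pop()}>")
    return "".join(parser.out)


print(tidy_fragment("<p>hello</p></p><br>"))  # prints <p>hello</p><br/>
```

This only balances tags; it does not validate the document or apply HTML's implied-end-tag rules, so a real parser is still the better choice for untrusted input.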

Proper usage of JTidy to purify HTML

自闭症网瘾萝莉.ら posted on 2019-12-01 06:05:57
I am trying to use JTidy (jtidy-r938.jar) to sanitize an input HTML string, but I seem to have problems getting the default settings right. Often strings such as "hello world" end up as "helloworld" after tidying. I wanted to show what I'm doing here, and any pointers would be really appreciated: Assume that rawHtml is the String containing the input (real world) HTML. This is what I'm doing: Tidy tidy = new Tidy(); tidy.setPrintBodyOnly(true); ByteArrayOutputStream baos = new ByteArrayOutputStream(); PrintStream ps = new PrintStream(baos); tidy.parse(new StringReader(rawHtml), ps); return

Proper usage of JTidy to purify HTML

隐身守侯 posted on 2019-12-01 03:45:04
Question: I am trying to use JTidy (jtidy-r938.jar) to sanitize an input HTML string, but I seem to have problems getting the default settings right. Often strings such as "hello world" end up as "helloworld" after tidying. I wanted to show what I'm doing here, and any pointers would be really appreciated: Assume that rawHtml is the String containing the input (real world) HTML. This is what I'm doing: Tidy tidy = new Tidy(); tidy.setPrintBodyOnly(true); ByteArrayOutputStream baos = new

Prevent tidy from adding html tags

房东的猫 posted on 2019-11-30 05:02:16
I have a class that generates some HTML (form elements and table elements), but this class returns all the HTML in one line. So I am trying to use tidy to beautify the code (indent it, add line breaks, etc.). The only problem is that tidy also generates tags I don't want. Here is the code: tidy_parse_string( $table->getHtml(), array( 'DocType' => 'omit', 'indent' => true, 'indent-spaces' => 4, 'wrap' => 0 ) ); The only way I have found to remove the extra HTML tags is by adding a str_replace, something like this: str_replace(array('<html>','</html>','<body>','</body>','<head

Php Tidy : remove link and style tags inside body

China☆狼群 posted on 2019-11-29 16:46:16
I must clean up some HTML code to remove <style> and <link> tags inside the <body> tag. I'm already using PHP Tidy to do some cleanup, but I have not found how to remove those tags with PHP Tidy. Do you have a solution? Or maybe another markup cleaner PHP class... Don't know how to do that with Tidy, but you can use DOM: $dom = new DOMDocument; // init new DOMDocument $dom->loadHTML($html); // load HTML into it $xpath = new DOMXPath($dom); // create a new XPath $nodes = $xpath->query('//body/style'); // Find all style elements in body tag foreach($nodes as $node) { // Iterate over found elements
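
The same filtering idea can be sketched without PHP, using only Python's standard library. This is not the DOM/XPath approach from the answer above, and the names BodyStyleStripper and strip_body_styles are hypothetical. It streams through the markup and leaves out any <style> element (with its contents) and any <link> tag found after <body> opens, while passing everything else through largely as written. Unlike the //body/style query, which matches only direct children of <body>, this sketch removes them at any depth inside the body.

```python
from html.parser import HTMLParser

# Tags to strip once we are inside <body>.
REMOVED_IN_BODY = {"style", "link"}


class BodyStyleStripper(HTMLParser):
    """Hypothetical helper: copies markup through, minus style/link in body."""

    def __init__(self):
        # convert_charrefs=False so entities are reported and re-emitted as written.
        super().__init__(convert_charrefs=False)
        self.out = []
        self.in_body = False
        self.in_removed_style = False  # True while inside a stripped <style>

    def _emit(self, text):
        if not self.in_removed_style:
            self.out.append(text)

    def handle_starttag(self, tag, attrs):
        if self.in_removed_style:
            return
        if tag == "body":
            self.in_body = True
        elif self.in_body and tag in REMOVED_IN_BODY:
            # <link> has no content; <style> content must be skipped too.
            self.in_removed_style = (tag == "style")
            return
        self.out.append(self.get_starttag_text())

    def handle_startendtag(self, tag, attrs):
        # Self-closed form such as <link ... />.
        if self.in_body and tag in REMOVED_IN_BODY:
            return
        self._emit(self.get_starttag_text())

    def handle_endtag(self, tag):
        if self.in_body and tag in REMOVED_IN_BODY:
            self.in_removed_style = False
            return
        if tag == "body":
            self.in_body = False
        self.out.append(f"</{tag}>")

    def handle_data(self, data):
        self._emit(data)

    def handle_entityref(self, name):
        self._emit(f"&{name};")

    def handle_charref(self, name):
        self._emit(f"&#{name};")

    def handle_comment(self, data):
        self._emit(f"<!--{data}-->")

    def handle_decl(self, decl):
        self._emit(f"<!{decl}>")


def strip_body_styles(html):
    parser = BodyStyleStripper()
    parser.feed(html)
    parser.close()
    return "".join(parser.out)


page = ('<html><head><style>a{}</style></head><body><style>p{color:red}</style>'
        '<link rel="stylesheet" href="x.css"><p>Hi &amp; bye</p></body></html>')
print(strip_body_styles(page))
```

The <style> block in <head> is left alone because removal only starts once the <body> start tag has been seen; input with no explicit <body> tag would pass through unchanged.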

How do I get HTML Tidy to not put newline before closing tags?

大憨熊 posted on 2019-11-29 03:13:40
HTML Tidy has this infuriating habit of putting a newline before the closing tag. For example: <p>Some text</p> becomes <p>Some text </p> How do I tell Tidy to keep the closing tag on the same line as the end of the content? Btw, I am running Tidy through Notepad++, if that makes any difference. Make sure vertical-space is set to no. After much frustration I learned the only thing that switch does is screw up your already somewhat-nicely formatted HTML by adding newlines where you don't want them. This is what I use for minimally-invasive tidying (no adding doctypes/head tags, etc.): tidy

Prevent tidy from adding html tags

倾然丶 夕夏残阳落幕 posted on 2019-11-29 02:34:03
Question: I have a class that generates some HTML (form elements and table elements), but this class returns all the HTML in one line. So I am trying to use tidy to beautify the code (indent it, add line breaks, etc.). The only problem is that tidy also generates tags I don't want. Here is the code: tidy_parse_string( $table->getHtml(), array( 'DocType' => 'omit', 'indent' => true, 'indent-spaces' => 4, 'wrap' => 0 ) ); The only way I have found to remove the extra HTML tags is by