Is it possible to use HTML Tidy to just indent HTML code?
Sample Code
<form action="?" method="get" accept-charset="utf-8">
<ul>
<li>
<label class="screenReader" for="q">Keywords</label><input type="text" name="q" value="" id="q" />
</li>
<li><input class="submit" type="submit" value="Search" /></li>
</ul>
</form>
Desired Result
<form action="?" method="get" accept-charset="utf-8">
    <ul>
        <li>
            <label class="screenReader" for="q">Keywords</label><input type="text" name="q" value="" id="q"/>
        </li>
        <li><input class="submit" type="submit" value="Search"/></li>
    </ul>
</form>
If I run it with the standard command, tidy -f errs.txt -m index.html, then I get this:
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01//EN">
<html>
<head>
<meta name="generator" content=
"HTML Tidy for Mac OS X (vers 31 October 2006 - Apple Inc. build 15.3.6), see www.w3.org">
<title></title>
</head>
<body>
<form action="?" method="get" accept-charset="utf-8">
<ul>
<li><label class="screenReader" for=
"q">Keywords</label><input type="text" name="q" value="" id=
"q"></li>
<li><input class="submit" type="submit" value="Search"></li>
</ul>
</form>
</body>
</html>
How can I omit all the extra stuff and actually get it to indent the code?
Forgive me if that's not a feature it's supposed to support. If so, what library or tool am I looking for?
Use a config file with just the indent, tidy-mark, and quiet options:
indent: auto
indent-spaces: 2
quiet: yes
tidy-mark: no
Name it tidy_config.txt and save it in the same directory as the .html file. Run it like this:
tidy -config tidy_config.txt index.html
For more customization, use the tidy man page to find other relevant options such as markup: no or force-output: yes.
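The steps above can be sketched as a small shell script. This is only a sketch, assuming a POSIX shell; the scratch directory and the sample index.html are placeholders added for illustration, and the tidy run is skipped if no tidy binary is on the PATH.

```shell
#!/bin/sh
# Sketch: create the config file from the answer and run tidy with it.
# Assumption: the scratch directory and sample index.html are placeholders.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Sample input, only so the script has something to format.
cat > index.html <<'EOF'
<form action="?" method="get" accept-charset="utf-8">
<ul>
<li><input class="submit" type="submit" value="Search" /></li>
</ul>
</form>
EOF

# The four options from the answer.
cat > tidy_config.txt <<'EOF'
indent: auto
indent-spaces: 2
quiet: yes
tidy-mark: no
EOF

if command -v tidy >/dev/null 2>&1; then
    # tidy exits with 1 when it only issued warnings, so that is not treated as fatal.
    tidy -config tidy_config.txt index.html || true
else
    echo "tidy not found on PATH; skipping the run" >&2
fi
```

Without -m the result goes to standard output, so the original file is left untouched while you experiment with options.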
I didn't find a way to "only re-indent, without any other changes". The following config file makes Tidy "repair" as little as possible and (mostly) only re-indent the HTML. Tidy still corrects some erroneous constructs, such as duplicated (repeated) attributes.
#based on http://tidy.sourceforge.net/docs/quickref.html
#HTML, XHTML, XML Options Reference
anchor-as-name: no #?
doctype: omit
drop-empty-paras: no
fix-backslash: no
fix-bad-comments: no
fix-uri: no
hide-endtags: yes #?
#input-xml: yes #?
join-styles: no
literal-attributes: yes
lower-literals: no
merge-divs: no
merge-spans: no
output-html: yes
preserve-entities: yes
quote-ampersand: no
quote-nbsp: no
show-body-only: auto
#Diagnostics Options Reference
show-errors: 0
show-warnings: 0
#Pretty Print Options Reference
break-before-br: yes
indent: yes
indent-attributes: no #default
indent-spaces: 4
tab-size: 4
wrap: 132
wrap-asp: no
wrap-jste: no
wrap-php: no
wrap-sections: no
#Character Encoding Options Reference
char-encoding: utf8
#Miscellaneous Options Reference
force-output: yes
quiet: yes
tidy-mark: no
For example, the following HTML fragment
<div>
<div>
<p>
not closed para
<h1>
h1 head
</h1>
<ul>
<li>not closed li
<li>closed li</li>
</ul>
some text
</div>
</div>
will be changed to
<div>
<div>
<p>
not closed para
<h1>
h1 head
</h1>
<ul>
<li>not closed li
<li>closed li
</ul>some text
</div>
</div>
As you can see, hide-endtags: yes drops the closing </li> from the second bullet in the input. Setting hide-endtags: no gives the following:
<div>
<div>
<p>
not closed para
</p>
<h1>
h1 head
</h1>
<ul>
<li>not closed li
</li>
<li>closed li
</li>
</ul>some text
</div>
</div>
So tidy adds the closing </p> and the closing </li> for the first bullet.
I didn't find a way to preserve everything in the input and only re-indent the file.
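One way to check how close a given configuration comes to "re-indent only" is to compare the input and the output with all whitespace removed. A minimal sketch, assuming a POSIX shell; same_ignoring_whitespace is a hypothetical helper name, not a tidy feature. Because it strips all whitespace, it also ignores whitespace changes inside text and attribute values.

```shell
#!/bin/sh
# Sketch: report whether two files differ in anything other than whitespace.
# same_ignoring_whitespace is a hypothetical helper name, not part of tidy.
same_ignoring_whitespace() {
    a=$(tr -d '[:space:]' < "$1")
    b=$(tr -d '[:space:]' < "$2")
    [ "$a" = "$b" ]
}

# Usage (file names are placeholders):
#   tidy -config tidy_config.txt -o tidied.html original.html
#   same_ignoring_whitespace original.html tidied.html && echo "whitespace-only changes"
```

With hide-endtags: yes the check would fail on the fragment above, since the removed </li> is a change beyond whitespace.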
You need the show-body-only option:
tidy --show-body-only yes -i --indent-spaces 4 -w 80 -m file.html
http://tidy.sourceforge.net/docs/quickref.html#show-body-only
-i - indents element content (equivalent to indent: auto); it takes no argument, so the width is set separately
--indent-spaces 4 - indents by 4 spaces (tidy indents with spaces by default)
-w 80 - wraps at column 80 (default on my system: 68, very narrow)
-m - modifies the file in place
(you may want to leave out the last option and examine the output first)
Showing only the body naturally leaves out the tidy-mark (the generator meta tag).
Another useful option is:
--quiet yes - doesn't print the W3C advertisements and other unnecessary output (errors are still reported)
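The advice to examine the output before using -m can be sketched as follows. It assumes a POSIX shell with diff available; preview_tidy is a hypothetical helper name. The sketch passes the indent width through --indent-spaces, because the tidy man page lists -i as a flag without an argument.

```shell
#!/bin/sh
# Sketch: show what tidy would change without touching the original file.
# preview_tidy is a hypothetical helper name.
preview_tidy() {
    src=$1
    if ! command -v tidy >/dev/null 2>&1; then
        echo "tidy not found on PATH" >&2
        return 0
    fi
    tmp=$(mktemp)
    # -o writes the result to a separate file instead of modifying $src (no -m).
    # tidy exits with 1 on warnings, so its exit status is deliberately ignored here.
    tidy --show-body-only yes -i --indent-spaces 4 -w 80 -q -o "$tmp" "$src" || true
    # diff exits with 1 when the files differ; that is expected.
    diff -u "$src" "$tmp" || true
    rm -f "$tmp"
}

# Usage: preview_tidy file.html
```

Once the diff looks right, the same options can be rerun with -m to modify the file in place.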
I am very late to the party :)
In your tidy config file, set
tidy-mark: no
By default this is set to yes.
Once that is done, tidy will not add the meta generator tag to your HTML.
To answer the poster's original question, using Tidy to just indent HTML code, here's what I use:
tidy --indent auto --quiet yes --show-body-only auto --show-errors 0 --wrap 0 input.html
input.html
<form action="?" method="get" accept-charset="utf-8">
<ul>
<li>
<label class="screenReader" for="q">Keywords</label><input type="text" name="q" value="" id="q" />
</li>
<li><input class="submit" type="submit" value="Search" /></li>
</ul>
</form>
Output:
<form action="?" method="get" accept-charset="utf-8">
  <ul>
    <li><label class="screenReader" for="q">Keywords</label><input type="text" name="q" value="" id="q"></li>
    <li><input class="submit" type="submit" value="Search"></li>
  </ul>
</form>
No extra HTML code added. Errors are suppressed. To find out what each option does, it's best to refer to the official reference.
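If this command is used in a script, keep in mind that tidy's documented exit status is 0 for success, 1 when there were warnings, and 2 when there were errors, so a script that stops on any non-zero status would stop on mere warnings. A sketch of a batch wrapper, assuming a POSIX shell; indent_html, tidy_status_ok, and the output directory argument are hypothetical names.

```shell
#!/bin/sh
# Sketch: indent several HTML files into an output directory.
# indent_html and tidy_status_ok are hypothetical helper names.

# Treat exit status 0 (clean) and 1 (warnings only) as success; 2 means errors.
tidy_status_ok() {
    [ "$1" -eq 0 ] || [ "$1" -eq 1 ]
}

indent_html() {
    outdir=$1
    shift
    mkdir -p "$outdir"
    for f in "$@"; do
        status=0
        # Same options as the one-liner, plus -o to write to a separate file.
        tidy --indent auto --quiet yes --show-body-only auto \
             --show-errors 0 --wrap 0 -o "$outdir/$(basename "$f")" "$f" || status=$?
        tidy_status_ok "$status" || echo "tidy reported errors for $f" >&2
    done
}

# Usage: indent_html out input.html other.html
```

Writing to a separate directory keeps the originals intact, so the results can be compared before anything is overwritten.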
Source: https://stackoverflow.com/questions/7151180/use-html-tidy-to-just-indent-html-code