Question
Is it possible to use HTML Tidy to just indent HTML code?
Sample Code
<form action="?" method="get" accept-charset="utf-8">
<ul>
<li>
<label class="screenReader" for="q">Keywords</label><input type="text" name="q" value="" id="q" />
</li>
<li><input class="submit" type="submit" value="Search" /></li>
</ul>
</form>
Desired Result
<form action="?" method="get" accept-charset="utf-8">
<ul>
<li>
<label class="screenReader" for="q">Keywords</label><input type="text" name="q" value="" id="q"/>
</li>
<li><input class="submit" type="submit" value="Search"/></li>
</ul>
</form>
If I run it with the standard command, tidy -f errs.txt -m index.html, then I get this:
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01//EN">
<html>
<head>
<meta name="generator" content=
"HTML Tidy for Mac OS X (vers 31 October 2006 - Apple Inc. build 15.3.6), see www.w3.org">
<title></title>
</head>
<body>
<form action="?" method="get" accept-charset="utf-8">
<ul>
<li><label class="screenReader" for=
"q">Keywords</label><input type="text" name="q" value="" id=
"q"></li>
<li><input class="submit" type="submit" value="Search"></li>
</ul>
</form>
</body>
</html>
How can I omit all the extra stuff and actually get it to indent the code?
Forgive me if that's not a feature it's supposed to support. In that case, what library or tool am I looking for?
Answer 1:
Use a config file with just the indent, tidy-mark, and quiet options:
indent: auto
indent-spaces: 2
quiet: yes
tidy-mark: no
Name it tidy_config.txt and save it in the same directory as the .html file. Run it like this:
tidy -config tidy_config.txt index.html
For more customization, use the tidy man page to find other relevant options such as markup: no or force-output: yes.
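The steps above can be sketched end to end as a small script. This is only an illustration: the scratch directory and the sample markup are assumptions made up for the example, and the run is skipped when tidy is not installed.

```shell
#!/bin/sh
# Sketch: create the config file and a sample page in a scratch directory,
# then run tidy against them. Paths and sample markup are illustrative.
workdir=$(mktemp -d)

cat > "$workdir/tidy_config.txt" <<'EOF'
indent: auto
indent-spaces: 2
quiet: yes
tidy-mark: no
EOF

printf '<ul>\n<li>one</li>\n<li>two</li>\n</ul>\n' > "$workdir/index.html"

if command -v tidy >/dev/null 2>&1; then
    # tidy exits with 1 on warnings and 2 on errors; a bare fragment like
    # this one triggers warnings, so the status is not treated as fatal.
    tidy -config "$workdir/tidy_config.txt" "$workdir/index.html" || true
else
    echo "tidy not found on PATH; skipping the run" >&2
fi
```

Without -m the result goes to standard output, so the sample file itself is left untouched.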
Answer 2:
I didn't find a way to "only re-indent, without any other changes". The following config file "repairs" as little as possible and (mostly) only re-indents the HTML. Tidy still corrects some error conditions, such as duplicated (repeated) attributes.
#based on http://tidy.sourceforge.net/docs/quickref.html
#HTML, XHTML, XML Options Reference
anchor-as-name: no #?
doctype: omit
drop-empty-paras: no
fix-backslash: no
fix-bad-comments: no
fix-uri: no
hide-endtags: yes #?
#input-xml: yes #?
join-styles: no
literal-attributes: yes
lower-literals: no
merge-divs: no
merge-spans: no
output-html: yes
preserve-entities: yes
quote-ampersand: no
quote-nbsp: no
show-body-only: auto
#Diagnostics Options Reference
show-errors: 0
show-warnings: 0
#Pretty Print Options Reference
break-before-br: yes
indent: yes
indent-attributes: no #default
indent-spaces: 4
tab-size: 4
wrap: 132
wrap-asp: no
wrap-jste: no
wrap-php: no
wrap-sections: no
#Character Encoding Options Reference
char-encoding: utf8
#Miscellaneous Options Reference
force-output: yes
quiet: yes
tidy-mark: no
For example, the following HTML fragment
<div>
<div>
<p>
not closed para
<h1>
h1 head
</h1>
<ul>
<li>not closed li
<li>closed li</li>
</ul>
some text
</div>
</div>
will be changed to
<div>
<div>
<p>
not closed para
<h1>
h1 head
</h1>
<ul>
<li>not closed li
<li>closed li
</ul>some text
</div>
</div>
As you can see, hide-endtags: yes hides the closing </li> of the second bullet in the input. Setting hide-endtags: no gives the following:
<div>
<div>
<p>
not closed para
</p>
<h1>
h1 head
</h1>
<ul>
<li>not closed li
</li>
<li>closed li
</li>
</ul>some text
</div>
</div>
So tidy adds a closing </p> and a closing </li> to the first bullet.
I didn't find a way to preserve everything in the input and only re-indent the file.
Answer 3:
You need the following option:
tidy --show-body-only yes -i --indent-spaces 4 -w 80 -m file.html
http://tidy.sourceforge.net/docs/quickref.html#show-body-only
-i - indents element content (tidy never uses tabs); the flag takes no argument, so the width is set separately with --indent-spaces 4
-w 80 - wraps at column 80 (default on my system: 68, very narrow)
-m - modifies the file in place
(you may want to leave out the last option and examine the output first)
Showing only the body naturally leaves out the tidy-mark (generator meta).
Another useful option is:
--quiet yes - doesn't print W3C advertisements and other unnecessary output (errors are still reported)
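One way to examine the output before committing to -m is to write it to a separate file with -o and diff it against the original. The sketch below spells the options out in their long --name value form; the file names and sample markup are assumptions for illustration, and the run is skipped when tidy is not installed.

```shell
#!/bin/sh
# Sketch: preview tidy's result in a separate file instead of using -m.
# File names and sample markup are illustrative.
workdir=$(mktemp -d)
printf '<div>\n<p>hello</p>\n</div>\n' > "$workdir/file.html"

if command -v tidy >/dev/null 2>&1; then
    # -o sends the result to preview.html; file.html is not modified.
    tidy --show-body-only yes --indent auto --indent-spaces 4 --wrap 80 \
         --quiet yes -o "$workdir/preview.html" "$workdir/file.html" || true
    # diff exits non-zero when the files differ, which is expected here.
    diff -u "$workdir/file.html" "$workdir/preview.html" || true
else
    echo "tidy not found on PATH; skipping the preview" >&2
fi
```

If the diff looks right, the same options can be rerun with -m in place of -o.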
Answer 4:
I am very late to the party :)
But in your tidy config file set
tidy-mark: no
By default this is set to yes.
Once done, tidy will not add the meta generator tag to your HTML.
Answer 5:
To answer the poster's original question, using Tidy to just indent HTML code, here's what I use:
tidy --indent auto --quiet yes --show-body-only auto --show-errors 0 --wrap 0 input.html
input.html
<form action="?" method="get" accept-charset="utf-8">
<ul>
<li>
<label class="screenReader" for="q">Keywords</label><input type="text" name="q" value="" id="q" />
</li>
<li><input class="submit" type="submit" value="Search" /></li>
</ul>
</form>
Output:
<form action="?" method="get" accept-charset="utf-8">
<ul>
<li><label class="screenReader" for="q">Keywords</label><input type="text" name="q" value="" id="q"></li>
<li><input class="submit" type="submit" value="Search"></li>
</ul>
</form>
No extra HTML code added. Errors are suppressed. To find out what each option does, it's best to refer to the official reference.
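Because tidy reads standard input when no file is given, the same command can be wrapped as a filter, for example to call from an editor. A minimal sketch: indent_html is a hypothetical name chosen for this example, and the call is skipped when tidy is not installed.

```shell
#!/bin/sh
# Sketch: read HTML on stdin, write the indented result to stdout.
# indent_html is a hypothetical helper name, not part of tidy.
indent_html() {
    # With no file argument tidy reads standard input.
    tidy --indent auto --quiet yes --show-body-only auto \
         --show-errors 0 --wrap 0
    status=$?
    # 0 = clean, 1 = warnings only; treat both as success for a filter.
    [ "$status" -le 1 ]
}

if command -v tidy >/dev/null 2>&1; then
    printf '<ul><li>one</li><li>two</li></ul>\n' | indent_html \
        || echo "tidy reported errors" >&2
else
    echo "tidy not found on PATH; skipping" >&2
fi
```

The status check matters in scripts: tidy returns 1 for warnings, which would otherwise look like a failure.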
Source: https://stackoverflow.com/questions/7151180/use-html-tidy-to-just-indent-html-code