Question
I've run into an awkward situation. I want users to be able to use Textile, but they shouldn't be able to mess with the valid HTML surrounding their entry, so I have to escape the HTML somehow.
html_escape(textilize("</body>Foo"))
would break Textile, while
textilize(html_escape("</body>Foo"))
would work, but breaks various Textile features like links (written like "Linkname":http://www.wheretogo.com/), since the quotes would be transformed into &quot; and thus no longer be detected by Textile.
sanitize doesn't do a better job.
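To make the escaping problem concrete, here is a minimal sketch using only the Ruby standard library. It assumes ERB::Util.html_escape as the stdlib counterpart of the html_escape helper, and TEXTILE_LINK is a hypothetical, heavily simplified regex standing in for Textile's link syntax (RedCloth's real grammar is more involved). The constant names are for illustration only.

```ruby
require 'erb'

# Hypothetical, simplified stand-in for Textile's "text":url link syntax.
# It only exists to show what happens to the double quotes.
TEXTILE_LINK = /"([^"]+)":(\S+)/

INPUT = '"Linkname":http://www.wheretogo.com/ </body>Foo'

# Escape first, as in textilize(html_escape(...)).
ESCAPED = ERB::Util.html_escape(INPUT)

puts ESCAPED
puts TEXTILE_LINK.match?(INPUT)    # the raw input still contains link syntax
puts TEXTILE_LINK.match?(ESCAPED)  # after escaping, the quotes are &quot; entities
```

The rogue </body> tag is neutralized by the escape, but so are the quotes that the link syntax depends on, which is why escaping before textilizing loses links.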
Any suggestions on that one? I would prefer not to use Tidy for this problem. Thanks in advance.
Answer 1:
For those who run into the same problem: If you are using the RedCloth gem you can just define your own method (in one of your helpers).
def safe_textilize( s )
  if s && s.respond_to?(:to_s)
    doc = RedCloth.new( s.to_s )
    doc.filter_html = true
    doc.to_html
  end
end
Excerpt from the Documentation:
Accessors for setting security restrictions.
This is a nice thing if you're using RedCloth for formatting in public places (e.g. Wikis) where you don't want users to abuse HTML for bad things.
If filter_html is set, HTML which wasn't created by the Textile processor will be escaped. Alternatively, if sanitize_html is set, HTML can pass through the Textile processor but unauthorized tags and attributes will be removed.
Answer 2:
This works for me and guards against every XSS attack I've tried including onmouse... handlers in pre and code blocks:
<%= RedCloth.new( sanitize( @comment.body ), [:filter_html, :filter_styles, :filter_classes, :filter_ids] ).to_html -%>
The initial sanitize removes a lot of potential XSS exploits including mouseovers.
As far as I can tell, :filter_html escapes most HTML tags apart from code and pre. The other filters are there because I don't want users applying any classes, ids or styles.
I just tested my comments page with your example "</body>Foo" and it completely removed the rogue body tag.
I am using RedCloth version 4.2.3 and Rails version 2.3.5.
Answer 3:
Looks like textile simply doesn't support what you want.
You really want to only allow a carefully controlled subset of HTML, but textile is designed to allow arbitrary HTML. I don't think you can use textile at all in this situation (unless it supports that kind of restriction).
What you need is probably a special "restricted" version of textile, that only allows "safe" markup (defining that however might already be tricky). I do not know if that exists, however.
You might have a look at BBCode, which lets you restrict the allowed markup.
Source: https://stackoverflow.com/questions/501737/how-do-i-textile-and-sanitize-html