Best practice for allowing Markdown in Python, while preventing XSS attacks?

I need to let users enter Markdown content into my web app, which has a Python back end. I don’t want to needlessly restrict their entries (e.g. by not allowing any HTML,

2 answers

忘掉有多难

2021-01-30 23:00
I was unable to determine “best practice,” but generally you have three choices when accepting Markdown input:
1. Allow HTML within Markdown content (this is how Markdown originally/officially works, but if treated naïvely, this can invite XSS attacks).
2. Just treat any HTML as plain text, essentially letting your Markdown processor escape the user’s input. Thus <small>…</small> in input will not create small text but rather the literal text “<small>…</small>”.
3. Throw out all HTML tags within Markdown. This is pretty user-hostile and may choke on text like <3 depending on implementation. This is the approach taken here on Stack Overflow.
My question regards case #1, specifically.
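
To make option 2 concrete, here is a minimal sketch using only the standard library's html.escape. It is not what any particular Markdown processor does internally; the sample input string is made up for illustration.

```python
from html import escape

# Hypothetical user input containing an HTML tag and a stray "<".
user_input = "<small>fine print</small> and I <3 Markdown"

# quote=False leaves quotation marks alone; only &, < and > are replaced.
escaped = escape(user_input, quote=False)

print(escaped)
# &lt;small&gt;fine print&lt;/small&gt; and I &lt;3 Markdown
```

A browser shows the escaped string as the literal characters the user typed, so no tag is ever interpreted.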

Given that, what worked well for me is sending user input through
1. Markdown for Python, which optionally supports Extra syntax, and then through
2. html5lib’s sanitizer.
I threw a bunch of XSS attack attempts at this combination, and all failed (hurray!); benign HTML tags, however, still worked flawlessly.

This way, you are in effect going with option #1 (as desired) except for potentially dangerous or malformed HTML snippets, which are treated as in option #2.
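
The behavior described above (safe tags pass through, everything else is escaped) can be sketched roughly as follows. This is a toy allowlist filter built on the standard library's html.parser, not html5lib's sanitizer, and it should not be treated as production-safe: it does not balance tags, and the allowlists are hypothetical. The render_user_markdown helper is also hypothetical and assumes the third-party Python-Markdown package is installed.

```python
from html import escape
from html.parser import HTMLParser

# Hypothetical allowlists for illustration; a real one would be longer
# (Markdown output also contains headings, images, tables, and so on).
ALLOWED_TAGS = {"p", "em", "strong", "small", "code", "pre",
                "ul", "ol", "li", "blockquote", "a"}
ALLOWED_ATTRS = {"a": {"href"}}
SAFE_URL_PREFIXES = ("http://", "https://", "mailto:")


class AllowlistSanitizer(HTMLParser):
    """Keep allowlisted tags; escape every other tag as plain text."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag not in ALLOWED_TAGS:
            # Disallowed tag: emit its raw source text, escaped (option 2).
            self.parts.append(escape(self.get_starttag_text(), quote=False))
            return
        kept = ""
        for name, value in attrs:
            if name not in ALLOWED_ATTRS.get(tag, ()) or value is None:
                continue  # drops onclick=, style=, and similar attributes
            if name == "href" and not value.strip().lower().startswith(SAFE_URL_PREFIXES):
                continue  # drops javascript: and any other unlisted scheme
            kept += f' {name}="{escape(value)}"'
        self.parts.append(f"<{tag}{kept}>")

    def handle_startendtag(self, tag, attrs):
        # Self-closing form such as <em/>.
        self.handle_starttag(tag, attrs)
        if tag in ALLOWED_TAGS:
            self.handle_endtag(tag)

    def handle_endtag(self, tag):
        if tag in ALLOWED_TAGS:
            self.parts.append(f"</{tag}>")
        else:
            self.parts.append(escape(f"</{tag}>", quote=False))

    def handle_data(self, data):
        self.parts.append(escape(data, quote=False))

    # Comments, doctypes and processing instructions are silently dropped,
    # because the corresponding HTMLParser handlers do nothing by default.


def sanitize_html(html):
    parser = AllowlistSanitizer()
    parser.feed(html)
    parser.close()
    return "".join(parser.parts)


def render_user_markdown(text):
    """Markdown first, sanitizer second (requires `pip install markdown`)."""
    import markdown  # third-party Python-Markdown, imported lazily
    return sanitize_html(markdown.markdown(text, extensions=["extra"]))


print(sanitize_html("<strong>bold</strong> <script>alert(1)</script>"))
```

Because href values are only kept when they start with one of the listed prefixes, relative links are dropped too; that is a deliberate simplification of this sketch. For real use, a maintained sanitizer library is the safer choice.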

(Thanks to Y.H Wong for pointing me in the direction of that Markdown library!)
陌清茗

2021-01-30 23:15

Markdown in Python is probably what you are looking for. It seems to cover a lot of your requested extensions too.

To prevent XSS attacks, the preferred approach is exactly the same as in other languages: you escape user-supplied content when rendering it back. I just took a peek at the documentation and the source code. Markdown seems to be able to do this right out of the box with some trivial config tweaks.
