Secure XSS cleaning function (updated regularly)

醉梦人生 2020-12-24 08:31

I've been hunting around the net for a few days now trying to figure this out, but I keep getting conflicting answers.

Is there a library, class or function for PHP

2 answers
  • 2020-12-24 09:13

    To answer the bold question: Yes, there is. It's called htmlspecialchars.

    > It needs to be updated regularly to counter new attacks.

    The right way to prevent XSS attacks is not to counter specific attacks or to filter/sanitize data, but to encode properly, everywhere.

    htmlspecialchars (or htmlentities), combined with a sensible choice of character encoding (e.g. UTF-8) and an explicit specification of that encoding, is sufficient to protect against XSS when data is output as HTML text or inside quoted attribute values. Fortunately, calling htmlspecialchars without an explicit encoding happens to work out for UTF-8, too: before PHP 5.4 it assumed ISO-8859-1, and later versions default to UTF-8. If you want to make that explicit, create a helper function:

    // Don't forget to specify UTF-8 as the document's encoding
    function htmlEncode($s) {
        return htmlspecialchars($s, ENT_QUOTES, 'UTF-8');
    }
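
    A quick usage sketch for the helper above. The `$comment` value is a hypothetical stand-in for untrusted input; the point is only that encoding happens at output time, both in text and in a quoted attribute.

```php
<?php
// Same helper as in the answer, repeated so the sketch runs on its own.
function htmlEncode($s) {
    return htmlspecialchars($s, ENT_QUOTES, 'UTF-8');
}

// Hypothetical untrusted input, for illustration only.
$comment = '<script>alert("x")</script> & \'quotes\'';

// Encode once, at the point where the value is written into HTML.
$encoded = htmlEncode($comment);
echo '<p>' . $encoded . '</p>' . "\n";
echo '<input type="text" value="' . $encoded . '">' . "\n";
```

    With ENT_QUOTES, both double and single quotes are encoded, so the value cannot break out of the quoted attribute.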
    

    Oh, and to address the form worries: Don't try to detect encodings; it's bound to fail. Instead, serve the form in UTF-8. Every browser will then send user input in UTF-8.

    Addressing specific concerns:

    > (...) you're supposed to use htmlentities because htmlspecialchars is vulnerable to UTF-7 XSS exploit.

    The UTF-7 XSS exploit can only be applied if the browser thinks a document is encoded in UTF-7. Specifying the document encoding as UTF-8 (in the HTTP header or in a meta tag right after <head>) prevents this.
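
    A minimal sketch of declaring the encoding in both places. The function name `renderPage` and the markup are hypothetical, chosen for illustration; only `header()` and `htmlspecialchars()` are standard PHP.

```php
<?php
// Hypothetical page renderer; name and markup are illustrative only.
function renderPage($title, $body) {
    // Declare the encoding in the HTTP header (skipped if output already started).
    if (!headers_sent()) {
        header('Content-Type: text/html; charset=UTF-8');
    }
    $t = htmlspecialchars($title, ENT_QUOTES, 'UTF-8');
    $b = htmlspecialchars($body, ENT_QUOTES, 'UTF-8');
    // Repeat the declaration in a meta tag directly after <head>.
    return "<!DOCTYPE html>\n<html>\n<head>\n<meta charset=\"UTF-8\">\n"
         . "<title>$t</title>\n</head>\n<body>\n<p>$b</p>\n</body>\n</html>\n";
}

// A UTF-7 style payload passes through unchanged, but a browser that is told
// the page is UTF-8 displays it as literal text.
echo renderPage('Guestbook', '+ADw-script+AD4-alert(1)+ADw-/script+AD4-');
```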

    > Also if I don't detect the encoding, what's to stop an attacker downloading the html file, then altering it to UTF-7 or some other encoding, then submitting the POST request back to my server from the altered html page?

    This attack scenario is unnecessarily complex. The attacker could just craft a UTF-7 string, no need to download anything.

    If you accept the attacker's POST (i.e. you're accepting anonymous public user input), your server will just interpret the UTF-7 string as a weird UTF-8 one. That is not a problem; the attacker's post will just show up garbled. The attacker could achieve the same effect (sending strange text) by submitting "grfnlk" a hundred times.

    > If my method only works for UTF-8 then the XSS attack will get through, no?

    No, it won't. Encodings are not magic. An encoding is just a way to interpret a binary string. For example, the string "ö" is encoded as (hexadecimal) 2B 41 50 59 in UTF-7 (and C3 B6 in UTF-8). Decoding 2B 41 50 59 as UTF-8 yields "+APY": harmless, seemingly random characters.
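
    The byte-level argument can be checked with a short sketch. The bytes are written as hex literals so the result does not depend on how the source file itself is saved; the variable names are illustrative.

```php
<?php
// "ö" (U+00F6) as raw bytes in two encodings, taken from the text above.
$utf8Bytes = hex2bin('c3b6');      // UTF-8 encoding of "ö"
$utf7Bytes = hex2bin('2b415059');  // UTF-7 encoding of "ö"

echo strlen($utf8Bytes), ' bytes in UTF-8, ', strlen($utf7Bytes), " bytes in UTF-7\n";

// Read as UTF-8 (which is ASCII-compatible), the UTF-7 bytes are four ASCII characters.
echo $utf7Bytes, "\n";             // +APY

// A UTF-7 encoded "<script>" contains no HTML-special characters, so
// htmlspecialchars leaves it untouched; a UTF-8 document shows it as text.
$utf7Payload = '+ADw-script+AD4-';
echo htmlspecialchars($utf7Payload, ENT_QUOTES, 'UTF-8'), "\n";
```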

    > Also how does htmlentities protect against HEX or other XSS attacks?

    Hexadecimal data will be output as just that. An attacker sending "3C" will post a message "3C". "3C" can only become < if you actively interpret hexadecimal input, for example by mapping it to Unicode code points and then outputting those. That just means that if you're accepting data in something other than plain UTF-8 (for example base32-encoded UTF-8), you'll first have to unpack your encoding, and then use htmlspecialchars before including the result in HTML code.
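
    A small sketch of that order of operations, assuming a hypothetical application that chooses to accept hex-encoded text (hex is used here instead of base32 because PHP ships `hex2bin` out of the box):

```php
<?php
// Hypothetical scenario: the application itself accepts hex-encoded text.
$raw = '3c7363726970743e';  // hex for "<script>"

// Output as-is, the hex digits are plain text and stay harmless.
echo htmlspecialchars($raw, ENT_QUOTES, 'UTF-8'), "\n";      // 3c7363726970743e

// If the application unpacks the transport encoding, it must HTML-encode afterwards.
$decoded = hex2bin($raw);   // "<script>"
echo htmlspecialchars($decoded, ENT_QUOTES, 'UTF-8'), "\n";  // &lt;script&gt;
```

    Encoding before decoding would do nothing here, since the hex string contains no special characters; the HTML encoding step has to be the last one before output.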

  • 2020-12-24 09:15

    Many security engineers recommend this library for this specific problem:

    https://www.owasp.org/index.php/Category:OWASP_Enterprise_Security_API
