Using HTML Purifier on a site with only plain text input

栀梦 2021-01-17 02:35

I would appreciate an answer to settle a disagreement between me and some co-workers.

We have a typical PHP / LAMP web application.

The only input we want fr

1 answer
  •  一整个雨季
    2021-01-17 03:01

    As a general rule, escaping should be done for context and for use-case.

    If what you want to do is output plain text in an HTML context (and you do), then you need to use escaping functionality that will ensure that you will always output plain text in an HTML context. Given basic PHP, that would indeed be htmlspecialchars($yourString, ENT_QUOTES, 'yourEncoding');.
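    A minimal sketch of that output step, assuming UTF-8 as the encoding. The helper name `e` and the sample input are hypothetical; `ENT_SUBSTITUTE` is added here so that invalid byte sequences are replaced rather than causing an empty result.

```php
<?php
declare(strict_types=1);

/**
 * Escape plain text for output inside HTML element content or a
 * quoted attribute value. The name `e` is a hypothetical helper.
 */
function e(string $text, string $encoding = 'UTF-8'): string
{
    // ENT_QUOTES converts both double and single quotes.
    // ENT_SUBSTITUTE replaces invalid byte sequences instead of
    // making the function return an empty string.
    return htmlspecialchars($text, ENT_QUOTES | ENT_SUBSTITUTE, $encoding);
}

// Hypothetical user input containing HTML metacharacters.
$userInput = '<script>alert("hi")</script> & \'quotes\'';

// Escape at the point of output, not before storage.
echo '<p>' . e($userInput) . '</p>', PHP_EOL;
```

    The escaping happens at the moment the string enters the HTML context, so the stored value stays untouched.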

    If what you want to do is output HTML in an HTML context (you don't), then you would want to sanitise the HTML when you output it to prevent it from doing damage - here you would call $purifier->purify($yourString); on output.

    If you want to store plain text user input in a database (again, you do) by executing SQL statements, then you should either use prepared statements to prevent SQL injection, or an escaping function specific to your DB, such as mysqli_real_escape_string($link, $yourString) (the older mysql_real_escape_string() belongs to the mysql extension, which was removed in PHP 7).
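    A sketch of the prepared-statement route using PDO. As an assumption for the sake of a runnable example, an in-memory SQLite database stands in for MySQL (this needs the pdo_sqlite extension); the `comments` table and the sample input are hypothetical.

```php
<?php
declare(strict_types=1);

// Assumption: in-memory SQLite stands in for the MySQL database so the
// sketch runs without a server; for MySQL the DSN would be a mysql: one.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE comments (id INTEGER PRIMARY KEY, body TEXT NOT NULL)');

// Hypothetical user input containing SQL and HTML metacharacters.
$userInput = "Robert'); DROP TABLE comments;-- <b>hi</b>";

// The value is bound as a parameter and never concatenated into the SQL.
$insert = $pdo->prepare('INSERT INTO comments (body) VALUES (:body)');
$insert->execute([':body' => $userInput]);

// Read the row back: the stored text is exactly what the user typed,
// with no HTML escaping or sanitising applied on the way in.
$stored = $pdo->query('SELECT body FROM comments')->fetchColumn();
$rowCount = (int) $pdo->query('SELECT COUNT(*) FROM comments')->fetchColumn();
```

    Nothing HTML-specific happens at this stage; the HTML escaping is applied later, when `$stored` is written into a page.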

    You should not:

    • escape for HTML when you are putting data into the database
    • sanitise as HTML when you are putting data into the database
    • sanitise as HTML when you are outputting data as plain text

    All of those are outright harmful, albeit to different degrees. Note that the following assumes the database is your only or canonical storage medium for the data (it also assumes you have SQL injection taken care of in some other way - if you don't, that will be your primary issue):

    • if you escape for HTML when you put the data into the database, you rely on the guarantee that you will always be outputting the data into an HTML context; suddenly if you want to just put it into a plaintext file for printing as-is, you need to decode the data before you output it.
    • if you sanitise as HTML when you put the data into the database, you are destroying information that your user put there. Is it a messaging system and your user wanted to tell someone else about