Why is filter_input() incomplete?

半阙折子戏 2021-01-02 01:18

I am working a lot on a PHP-based CMS at the moment, and while I'm at it I would like to move all the handling and sanitation of user input to one central place. (At the mo

3 Answers
  • 2021-01-02 01:41

    "input filtering" or "sanitation" is an absurd idea. Stay away from it.

    Explanations and further discussion

    What's the best method for sanitizing user input with PHP?

    What else should I be doing to sanitize user input?

  • 2021-01-02 01:41

    In programming, you must be as restrictive on your input as possible. That goes for data sources as well. $_REQUEST contains everything in $_GET, $_POST and $_COOKIE, which may lead to problems.

    Think for example what happens if a plugin of your CMS introduces a new special key in one of them, which happens to exist as a meaningful key in another plugin?

    So DON'T ever use $_REQUEST. Use $_GET, $_POST or $_COOKIE, whichever fits your scenario. It's a good practice to be as strict as possible, and that has nothing to do with PHP, but with programming in general.
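As a minimal sketch of reading from one explicitly named source, the hypothetical helper `intFrom()` below (not part of PHP or of any CMS) takes the source array as an argument, so the caller has to say whether a value comes from GET, POST or a cookie. The arrays `$get` and `$cookie` are simulated stand-ins for `$_GET` and `$_COOKIE`.

```php
<?php
// Hypothetical helper: read an integer from one explicitly named source array.
function intFrom(array $source, string $key): ?int
{
    if (!isset($source[$key]) || !is_scalar($source[$key])) {
        return null;
    }
    // filter_var() returns false when the value is not a valid integer.
    $value = filter_var($source[$key], FILTER_VALIDATE_INT);
    return $value === false ? null : $value;
}

// Simulated request data; in a real request these would be $_GET and $_COOKIE.
$get    = ['page' => '3'];
$cookie = ['page' => 'dark-theme-v2'];

var_dump(intFrom($get, 'page'));    // int(3)
var_dump(intFrom($cookie, 'page')); // NULL
```

Here two sources use the same key `page` for unrelated things, and the caller still gets the value it asked for. In a web request, `filter_input(INPUT_GET, 'page', FILTER_VALIDATE_INT)` performs a comparable check against the GET data only.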

  • 2021-01-02 02:04

    I would like to move all the handling and sanitation of user input to one central place

    Yes, how lovely that would be. It can't be done. That's not how text processing works.

    If you're inserting text from one context into another you need to use the right escapes. (mysql_real_escape_string for MySQL string literals, htmlspecialchars for HTML content, urlencode for URL parameters, others for specific contexts). At the start of your script when you're filtering, you don't know where your input is going to end up, so you don't know how to escape it.
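To illustrate, here is a small sketch that escapes the same made-up input for two of the contexts named above. The SQL case is left out because `mysql_real_escape_string` needs a live connection (and was removed in PHP 7); the explicit `ENT_QUOTES` and `'UTF-8'` arguments are a choice made for this example.

```php
<?php
// Illustrative input only; any text with markup-significant characters works.
$input = "Tom & Jerry's";

// HTML content context.
$forHtml = htmlspecialchars($input, ENT_QUOTES, 'UTF-8');

// URL query-parameter context.
$forUrl = urlencode($input);

echo $forHtml, "\n"; // Tom &amp; Jerry&#039;s
echo $forUrl, "\n";  // Tom+%26+Jerry%27s
```

The two results have nothing in common, and neither can be picked until you know where the string is going.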

    Maybe one input string is going both into the database (needs to be SQL-escaped) and directly onto the page (needs to be HTML-escaped). There's no one escape that covers both those cases. You can use both escapes one after the other, but then the value in the HTML will have weird backslashes appearing in it and the copy in the database will be full of ampersands. A few rounds of this misencoding and you get that situation where every time you edit something, long strings of \\\\\\\\\\\\\\\\\\\\ and &amp;amp;amp;amp;amp; come out.
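The pile-up is easy to reproduce. In this rough sketch, the hypothetical function `escapeEverything()` stands in for an "escape it all at the start" filter, with `addslashes()` used as a connection-free substitute for a real SQL escape; `rounds()` simulates a value being saved, shown and edited several times.

```php
<?php
// Hypothetical "central filter": SQL-style escaping followed by HTML escaping.
// addslashes() is only a stand-in here, not a recommended SQL escape.
function escapeEverything(string $s): string
{
    return htmlspecialchars(addslashes($s), ENT_QUOTES, 'UTF-8');
}

// Apply the filter $n times, as happens when a stored value is re-edited.
function rounds(string $s, int $n): string
{
    for ($i = 0; $i < $n; $i++) {
        $s = escapeEverything($s);
    }
    return $s;
}

foreach ([1, 2, 3] as $n) {
    echo $n, ': ', rounds("Tom & Jerry's", $n), "\n";
}
// 1: Tom &amp; Jerry\&#039;s
// 2: Tom &amp;amp; Jerry\\&amp;#039;s
// 3: Tom &amp;amp;amp; Jerry\\\\&amp;amp;#039;s
```

Each round doubles the backslashes and adds another `amp;` to every ampersand.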

    The only way you can safely filter in one go at start time is by completely removing all characters that need to be escaped in any of the contexts you're going to be using them in. But that means no apostrophes or backslashes in your HTML, no ampersands or less-thans in your database, and probably a whole load of other URL-unfriendly punctuation has to go too. For a simple site that doesn't take arbitrary text you could maybe get away with that. But usually not.

    So you can only escape on the fly when one type of text goes into another. The best strategy to avoid the problem is to avoid concatenating text into other contexts as much as you possibly can, for example by using parameterised queries instead of SQL string building, and either defining an echo(htmlspecialchars()) function with a nice short name to make it less work to type, or using an alternative templating system that HTML-escapes by default.
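A minimal sketch of that strategy, under some assumptions: the helper name `h()` and the `comments` table are made up for illustration, and an in-memory SQLite database via PDO (requires the `pdo_sqlite` extension) stands in for whatever database the CMS really uses.

```php
<?php
// Hypothetical short helper: HTML-escape at output time only.
function h(string $text): string
{
    return htmlspecialchars($text, ENT_QUOTES, 'UTF-8');
}

// Assumption: in-memory SQLite stands in for the real database.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE comments (author TEXT, body TEXT)');

// Raw, unescaped input; the driver binds it as data, not as SQL text.
$author = "O'Reilly";
$body   = '<b>Tom & Jerry</b>';
$insert = $pdo->prepare('INSERT INTO comments (author, body) VALUES (?, ?)');
$insert->execute([$author, $body]);

// The stored value is the original string; escape only when printing HTML.
$select = $pdo->prepare('SELECT body FROM comments WHERE author = ?');
$select->execute([$author]);
$stored = $select->fetchColumn();

echo '<p>', h($stored), "</p>\n"; // <p>&lt;b&gt;Tom &amp; Jerry&lt;/b&gt;</p>
```

Nothing is escaped on the way in, so the database holds clean text that can later be sent to HTML, a URL or anywhere else with the escape that context needs.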
