Replace all but certain HTML tags with htmlspecialchars() in PHP?

清歌不尽 2021-01-15 12:41

I would like to process my user input so that only certain HTML tags are allowed, while all other tags are replaced by their HTML entities and special characters outside of tags are escaped as well. For example,

1 answer
  • 2021-01-15 13:06

    Apply htmlspecialchars() to the whole string first, then, for each tag in a given array of allowed tags, convert its escaped form (&lt;tag&gt;) back into a real tag:

    function allow_only($str, $allowed){
        $str = htmlspecialchars($str);
        foreach( $allowed as $a ){
            $str = str_replace("&lt;".$a."&gt;", "<".$a.">", $str);
            $str = str_replace("&lt;/".$a."&gt;", "</".$a.">", $str);
        }
        return $str;
    }
    echo allow_only("This is <b>bold</b> and this is <i>italic</i>.", array("b"));
    

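    That works for simple tags: the call above returns "This is <b>bold</b> and this is &lt;i&gt;italic&lt;/i&gt;.", so a browser renders "bold" in bold and shows <i>italic</i> as literal text.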

    As was pointed out, that does not work for tags with attributes, but the following version does:

    function fix_attributes($match){
        return "<".$match[1].str_replace('&quot;','"',$match[2]).">";
    }
    function allow_only($str, $allowed){
        $str = htmlspecialchars($str);
        foreach( $allowed as $a ){
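            // The callback name must be passed as a string; a bare fix_attributes is an undefined constant in PHP 8.
            $str = preg_replace_callback("/&lt;(".$a.")([\s\/\.\w=&;:#]*?)&gt;/", 'fix_attributes', $str);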
            $str = str_replace("&lt;/".$a."&gt;", "</".$a.">", $str);
        }
        return $str;
    }
    echo allow_only('This is <b>bold</b> and <a href="http://www.#links">this</a> is <i>italic</i>.', array("b","a"));
    

    This handles more complex tags with attributes. Only the characters listed between the square brackets in the pattern may appear in the attribute part. Unfortunately, &quot; has to be allowed there or quoted attribute values would not match, and because & and ; are in the character class, every other entity is let through as well; however, only &quot; is decoded back to a double quote inside attributes.
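    One further limitation of the pattern above is worth noting: the tag name is followed directly by the attribute character class, which includes \w, so allowing "b" also lets "<br>" or "<blink>" through. A minimal sketch of a stricter variant follows, assuming the attribute part must start with whitespace or "/". The name allow_only_strict is hypothetical and not part of the original answer.

```php
<?php
// Hypothetical stricter variant of allow_only(); the name is illustrative only.
function allow_only_strict(string $str, array $allowed): string
{
    // Escape everything first, exactly as in the original function.
    $str = htmlspecialchars($str);
    foreach ($allowed as $tag) {
        $name = preg_quote($tag, '/');
        // The tag name must be followed by whitespace, "/" or the closing
        // "&gt;", so allowing "b" no longer matches "&lt;br&gt;".
        $pattern = '/&lt;(' . $name . ')((?:[\s\/][\s\/\.\w=&;:#]*?)?)&gt;/';
        $str = preg_replace_callback($pattern, function (array $m): string {
            // Decode only &quot; inside the attribute part.
            return '<' . $m[1] . str_replace('&quot;', '"', $m[2]) . '>';
        }, $str);
        // Restore the matching closing tag.
        $str = str_replace('&lt;/' . $tag . '&gt;', '</' . $tag . '>', $str);
    }
    return $str;
}

echo allow_only_strict('A <b>bold</b> word,<br>a <a href="http://example.com/">link</a>.', array('b', 'a'));
// A <b>bold</b> word,&lt;br&gt;a <a href="http://example.com/">link</a>.
```

    This still accepts any attribute name built from the allowed characters, so it narrows the tag matching only and is not a full sanitizer.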

    As was suggested, a much better (safer, cleaner) way to solve problems like this is to use a library such as HTML Purifier (http://htmlpurifier.org/demo.php).
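    For comparison, a rough sketch of the library approach, assuming HTML Purifier is installed (for instance through the ezyang/htmlpurifier Composer package) and already autoloaded. The helper name purify_allow_only and its default whitelist are hypothetical. HTML.Allowed and Core.EscapeInvalidTags are HTML Purifier configuration directives; the second asks the library to escape disallowed tags rather than drop them, which is closest to what the question describes.

```php
<?php
// Assumption: HTML Purifier's classes are already autoloaded by the caller,
// e.g. via require 'vendor/autoload.php' in a Composer project.

// Hypothetical helper name, for illustration only.
function purify_allow_only(string $dirty, string $allowed = 'b,a[href]'): string
{
    $config = HTMLPurifier_Config::createDefault();
    // Whitelist of elements and their permitted attributes.
    $config->set('HTML.Allowed', $allowed);
    // Escape disallowed tags as text instead of removing them.
    $config->set('Core.EscapeInvalidTags', true);

    $purifier = new HTMLPurifier($config);
    return $purifier->purify($dirty);
}
```

    Unlike the regex version, the library parses the markup and validates attribute values such as URL schemes, which is why it is the safer choice for untrusted input.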
