Replace all but certain HTML tags with htmlspecialchars() in PHP?

清歌不尽 2021-01-15 12:41

I would like to process my user input so that only certain HTML tags are allowed. All other tags should be replaced by their HTML entities, and special characters outside of tags should be escaped as well. For example,

1 answer
  •  时光说笑
    2021-01-15 13:06

    Apply htmlspecialchars() to the whole string first, then turn the encoded entities back into real tags for each tag in a given array of allowed tags:

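    function allow_only($str, $allowed){
        // Escape everything first, so nothing is trusted by default
        $str = htmlspecialchars($str);
        foreach( $allowed as $a ){
            // Turn the escaped opening and closing tags back into real tags
            $str = str_replace("&lt;".$a."&gt;", "<".$a.">", $str);
            $str = str_replace("&lt;/".$a."&gt;", "</".$a.">", $str);
        }
        return $str;
    }
    echo allow_only("This is <b>bold</b> and this is <i>italic</i>.", array("b"));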
    

    That works for simple tags, returning "This is <b>bold</b> and this is &lt;i&gt;italic&lt;/i&gt;."
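
    To see where the simple version stops working, here is a minimal sketch that runs the same logic on a tag with an attribute. The name allow_only_simple is hypothetical and only used so the function can sit next to the other definitions in this answer.

    ```php
    <?php
    // Same logic as the simple version: escape everything, then restore
    // exact "<tag>" and "</tag>" matches. The function name is hypothetical.
    function allow_only_simple(string $str, array $allowed): string {
        $str = htmlspecialchars($str);
        foreach ($allowed as $a) {
            $str = str_replace("&lt;" . $a . "&gt;", "<" . $a . ">", $str);
            $str = str_replace("&lt;/" . $a . "&gt;", "</" . $a . ">", $str);
        }
        return $str;
    }

    // Exact match: both tags are restored.
    echo allow_only_simple('<b>bold</b>', ['b']), "\n";

    // Attribute present: the opening tag stays escaped,
    // while the closing tag is still restored.
    echo allow_only_simple('<b class="x">bold</b>', ['b']), "\n";
    ```

    The second call leaves the opening tag as text and restores only the closing tag, so the output is unbalanced HTML.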

    As was pointed out, that doesn't work for tags with attributes, but this does:

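    function fix_attributes($match){
        // Rebuild the tag and decode only &quot; inside its attributes
        return "<".$match[1].str_replace('&quot;','"',$match[2]).">";
    }
    function allow_only($str, $allowed){
        $str = htmlspecialchars($str);
        foreach( $allowed as $a ){
            // Attributes must start with whitespace or "/", so that allowing "b"
            // does not also let through tags such as "br" or "body"
            $str = preg_replace_callback("/&lt;(".preg_quote($a, "/").")((?:[\s\/][\s\/\.\w=&;:#]*?)?)&gt;/", 'fix_attributes', $str);
            $str = str_replace("&lt;/".$a."&gt;", "</".$a.">", $str);
        }
        return $str;
    }
    echo allow_only('This is <b>bold</b> and this is <i>italic</i>.', array("b","a"));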
    

    This handles more complex tags with certain attributes: only the characters listed in the [] character class of the pattern are allowed to appear in attributes. Unfortunately, &quot; must be allowed within attributes or it won't work, and with it all other entities are allowed too. However, only &quot; in attributes will be decoded.
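
    The character class above still lets any attribute name through, including event handlers such as onclick. A stricter variant could whitelist attribute names per tag. The following is only a sketch under a few assumptions: the function name allow_tags_with_attrs and the shape of the $allowed map are hypothetical, attribute values must be double-quoted, values containing & are rejected for simplicity, and href is limited to http(s) URLs.

    ```php
    <?php
    // Hypothetical helper: $allowed maps a tag name to its permitted attribute
    // names, e.g. ['a' => ['href'], 'b' => []].
    function allow_tags_with_attrs(string $str, array $allowed): string {
        // Escape everything; ENT_COMPAT encodes double quotes but not single quotes.
        $str = htmlspecialchars($str, ENT_COMPAT, 'UTF-8');

        // Match an escaped tag: optional "/", tag name, then zero or more
        // name=&quot;value&quot; pairs whose values contain no "&".
        $pattern = '~&lt;(/?)([a-z][a-z0-9]*)((?:\s+[a-z-]+=&quot;[^&<>]*&quot;)*)\s*&gt;~i';

        return preg_replace_callback($pattern, function ($m) use ($allowed) {
            $name = strtolower($m[2]);
            if (!array_key_exists($name, $allowed)) {
                return $m[0]; // tag not whitelisted: leave it escaped
            }
            if ($m[1] === '/') {
                // Closing tags must not carry attributes.
                return $m[3] === '' ? "</$name>" : $m[0];
            }
            preg_match_all('~\s+([a-z-]+)=&quot;([^&<>]*)&quot;~i', $m[3], $attrs, PREG_SET_ORDER);
            $out = "<$name";
            foreach ($attrs as $attr) {
                $attrName = strtolower($attr[1]);
                if (!in_array($attrName, $allowed[$name], true)) {
                    return $m[0]; // unknown attribute: keep the whole tag escaped
                }
                if ($attrName === 'href' && !preg_match('~^https?://~i', $attr[2])) {
                    return $m[0]; // reject javascript: and other schemes
                }
                $out .= " $attrName=\"{$attr[2]}\"";
            }
            return $out . '>';
        }, $str) ?? $str; // on a regex error, fall back to the fully escaped string
    }

    echo allow_tags_with_attrs(
        '<a href="http://example.com">link</a> and <i>italic</i>',
        ['a' => ['href'], 'b' => []]
    ), "\n";
    ```

    Opening and closing tags are still restored independently, so a rejected opening tag can leave a stray closing tag behind. A regex-based filter like this does not check that the markup is balanced.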

    As was suggested, a much better (safer, cleaner) way to solve problems like this is to use a library such as HTML Purifier: http://htmlpurifier.org/demo.php
