I wrote a PHP script to fetch email content.
The content is in HTML format.
I'd like to display the content, as below
You should use the strip_tags() function and allow only the tags that you want users to be able to add.
echo strip_tags($text, '<p><a>');
This line allows <p> and <a> tags; every other tag will be removed.
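As a quick illustration of what that call does and does not do, here is a minimal sketch. The input strings are made up for the example. Note two things: strip_tags() removes tags but keeps the text between them, and it does not filter attributes on the tags you allow, so an onclick on an allowed <a> survives.

```php
<?php
// Hypothetical input: markup we want to keep, plus a <script> and a <b> element.
$text = '<p>Hello <a href="https://example.com">link</a></p><script>alert(1)</script><b>bold</b>';

// Keep only <p> and <a>. Other tags are stripped, but their inner text stays.
$result = strip_tags($text, '<p><a>');
echo $result, PHP_EOL;
// <p>Hello <a href="https://example.com">link</a></p>alert(1)bold

// Attributes on allowed tags are NOT filtered, so event handlers remain.
$risky = strip_tags('<a href="#" onclick="alert(1)">click</a><i>x</i>', '<p><a>');
echo $risky, PHP_EOL;
// <a href="#" onclick="alert(1)">click</a>x
```

Because of the second case, strip_tags() with an allow-list is not enough on its own when the HTML comes from an untrusted source such as email.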
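htmlspecialchars() works in a completely different way: it does not remove tags, it converts special characters to HTML entities so that the markup is displayed as plain text.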
From the manual:
The translations performed are:
'&' (ampersand) becomes '&amp;'
'"' (double quote) becomes '&quot;' when ENT_NOQUOTES is not set.
"'" (single quote) becomes '&#039;' (or '&apos;') only when ENT_QUOTES is set.
'<' (less than) becomes '&lt;'
'>' (greater than) becomes '&gt;'
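To make the translations above concrete, here is a small sketch with a made-up input string. It passes the flags and the charset explicitly so the result does not depend on the defaults of the PHP version in use.

```php
<?php
// Hypothetical input containing every character from the list above.
$input = '<a href="x">Tom & Jerry\'s</a>';

// ENT_QUOTES escapes both double and single quotes.
$escaped = htmlspecialchars($input, ENT_QUOTES, 'UTF-8');
echo $escaped, PHP_EOL;
// &lt;a href=&quot;x&quot;&gt;Tom &amp; Jerry&#039;s&lt;/a&gt;

// ENT_NOQUOTES leaves both quote types untouched.
$noQuotes = htmlspecialchars($input, ENT_NOQUOTES, 'UTF-8');
echo $noQuotes, PHP_EOL;
// &lt;a href="x"&gt;Tom &amp; Jerry's&lt;/a&gt;
```

The browser then shows the original markup as literal text instead of rendering it, which is safe but loses the email's formatting.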
There is a very nice article about XSS prevention and CSRF prevention; it is worth reading.
HTMLPurifier can do that:
require_once '/path/to/HTMLPurifier.auto.php';
$config = HTMLPurifier_Config::createDefault();
$purifier = new HTMLPurifier($config);
$clean_html = $purifier->purify($dirty_html);
It takes dirty HTML (i.e. HTML possibly containing JavaScript) and removes any script.
PHP doesn't have anything native or built in that can remove JavaScript the way HTMLPurifier does. You could use DOMDocument, but this would be a lengthy task because JavaScript can execute in some attributes (onerror, onclick) and is not limited to <script></script>.
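To show why the DOMDocument route gets lengthy, here is a rough sketch of only the first few steps. The function name removeScripts() and the wrapper id are made up for this example, and this is not a complete sanitizer: it handles <script> elements, on* attributes, and attribute values that start with javascript:, and nothing else (no CSS, no obfuscated URL schemes, no encoding handling).

```php
<?php
// Hypothetical helper: an incomplete illustration, not a replacement for HTMLPurifier.
function removeScripts(string $html): string
{
    $doc = new DOMDocument();

    // Email HTML is often malformed; collect parser warnings instead of printing them.
    $previous = libxml_use_internal_errors(true);
    // Wrap the fragment so only its contents can be serialized back out.
    // Note: loadHTML() assumes ISO-8859-1 unless told otherwise; encoding is ignored here.
    $doc->loadHTML('<div id="sanitize-wrapper">' . $html . '</div>');
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);

    // Step 1: drop every <script> element.
    foreach (iterator_to_array($xpath->query('//script')) as $node) {
        $node->parentNode->removeChild($node);
    }

    // Step 2: drop event-handler attributes (onclick, onerror, ...) and
    // attribute values that begin with "javascript:".
    foreach (iterator_to_array($xpath->query('//@*')) as $attr) {
        $isHandler = stripos($attr->nodeName, 'on') === 0;
        $isJsUrl   = preg_match('/^\s*javascript:/i', $attr->value) === 1;
        if ($isHandler || $isJsUrl) {
            $attr->ownerElement->removeAttributeNode($attr);
        }
    }

    // Serialize only the children of the wrapper element.
    $wrapper = $xpath->query('//div[@id="sanitize-wrapper"]')->item(0);
    $out = '';
    foreach ($wrapper->childNodes as $child) {
        $out .= $doc->saveHTML($child);
    }
    return $out;
}

// Made-up "dirty" input for the demonstration.
$dirty = '<p onclick="evil()">Hi <a href="javascript:alert(1)">x</a></p>'
       . '<script>alert(1)</script><img src="a.png" onerror="evil()">';
$clean = removeScripts($dirty);
echo $clean, PHP_EOL;
```

Even this much code leaves many attack vectors open, which is the reason to prefer a maintained library such as HTMLPurifier over a hand-rolled filter.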