Apparently, this is harder to find than I thought it would be. And it even is so simple...
Is there a function equivalent to PHP\'s htmlspecialchars built into Javas
Here's a function to escape HTML:
function escapeHtml(str)
{
var map =
{
'&': '&',
'<': '<',
'>': '>',
'"': '"',
"'": '''
};
return str.replace(/[&<>"']/g, function(m) {return map[m];});
}
And to decode:
function decodeHtml(str)
{
var map =
{
'&': '&',
'<': '<',
'>': '>',
'"': '"',
''': "'"
};
return str.replace(/&|<|>|"|'/g, function(m) {return map[m];});
}
Chances are you don't need such a function. Since your code is already in the browser*, you can access the DOM directly instead of generating and encoding HTML that will have to be decoded backwards by the browser to be actually used.
Use innerText
property to insert plain text into the DOM safely and much faster than using any of the presented escape functions. Even faster than assigning a static preencoded string to innerHTML
.
Use classList
to edit classes, dataset
to set data-
attributes and setAttribute
for others.
All of these will handle escaping for you. More precisely, no escaping is needed and no encoding will be performed underneath**, since you are working around HTML, the textual representation of DOM.
// use existing element
var author = 'John "Superman" Doe <john@example.com>';
var el = document.getElementById('first');
el.dataset.author = author;
el.textContent = 'Author: '+author;
// or create a new element
var a = document.createElement('a');
a.classList.add('important');
a.href = '/search?q=term+"exact"&n=50';
a.textContent = 'Search for "exact" term';
document.body.appendChild(a);
// actual HTML code
console.log(el.outerHTML);
console.log(a.outerHTML);
.important { color: red; }
<div id="first"></div>
* This answer is not intended for server-side JavaScript users (Node.js, etc.)
** Unless you explicitly convert it to actual HTML afterwards. E.g. by accessing innerHTML
- this is what happens when you run $('<div/>').text(value).html();
suggested in other answers. So if your final goal is to insert some data into the document, by doing it this way you'll be doing the work twice. Also you can see that in the resulting HTML not everything is encoded, only the minimum that is needed for it to be valid. It is done context-dependently, that's why this jQuery method doesn't encode quotes and therefore should not be used as a general purpose escaper. Quotes escaping is needed when you're constructing HTML as a string with untrusted or quote-containing data at the place of an attribute's value. If you use the DOM API, you don't have to care about escaping at all.
That's HTML Encoding. There's no native javascript function to do that, but you can google and get some nicely done up ones.
E.g. http://sanzon.wordpress.com/2008/05/01/neat-little-html-encoding-trick-in-javascript/
EDIT:
This is what I've tested:
var div = document.createElement('div');
var text = document.createTextNode('<htmltag/>');
div.appendChild(text);
console.log(div.innerHTML);
Output: <htmltag/>
Reversed one:
function decodeHtml(text) {
return text
.replace(/&/g, '&')
.replace(/</ , '<')
.replace(/>/, '>')
.replace(/"/g,'"')
.replace(/'/g,"'");
}
function htmlspecialchars(str) {
if (typeof(str) == "string") {
str = str.replace(/&/g, "&"); /* must do & first */
str = str.replace(/"/g, """);
str = str.replace(/'/g, "'");
str = str.replace(/</g, "<");
str = str.replace(/>/g, ">");
}
return str;
}
Worth a read: http://bigdingus.com/2007/12/29/html-escaping-in-javascript/
escapeHTML: (function() {
var MAP = {
'&': '&',
'<': '<',
'>': '>',
'"': '"',
"'": '''
};
var repl = function(c) { return MAP[c]; };
return function(s) {
return s.replace(/[&<>'"]/g, repl);
};
})()
Note: Only run this once. And don't run it on already encoded strings e.g. &
becomes &amp;