Apparently, this is harder to find than I thought it would be. And it even is so simple...
Is there a function equivalent to PHP's htmlspecialchars built into JavaScript?
function htmlEscape(str) {
    return str.replace(/[&<>'"]/g, x => '&#' + x.charCodeAt(0) + ';');
}
This solution uses the numeric character code of each character; for example, < is replaced by &#60;.
Although its performance is slightly worse than the solution using a map, it has the advantages:
There is a problem with your solution code--it will only escape the first occurrence of each special character. For example:
escapeHtml('Kip\'s <b>evil</b> "test" code\'s here');
Actual: Kip&#039;s &lt;b&gt;evil</b> &quot;test" code's here
Expected: Kip&#039;s &lt;b&gt;evil&lt;/b&gt; &quot;test&quot; code&#039;s here
Here is code that works properly:
function escapeHtml(text) {
    return text
        .replace(/&/g, "&amp;")   // must come first so later entities are not re-escaped
        .replace(/</g, "&lt;")
        .replace(/>/g, "&gt;")
        .replace(/"/g, "&quot;")
        .replace(/'/g, "&#039;");
}
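One detail of the chained version is worth spelling out: the ampersand has to be replaced first, because every later replacement produces new ampersands. A minimal sketch of what goes wrong otherwise, using two hypothetical helper names that are not part of the answer above and only two of the five characters for brevity:

```javascript
// Hypothetical helpers for illustration only.
function escapeAmpFirst(text) {
  return text
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;');
}

function escapeAmpLast(text) {
  return text
    .replace(/</g, '&lt;')
    .replace(/&/g, '&amp;'); // also rewrites the "&" of the "&lt;" just produced
}

console.log(escapeAmpFirst('a < b')); // a &lt; b
console.log(escapeAmpLast('a < b'));  // a &amp;lt; b
```

The second result would show up in the page as the literal text "&lt;" instead of "<".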
The following code will produce identical results to the above, but it performs better, particularly on large blocks of text (thanks jbo5112).
function escapeHtml(text) {
    var map = {
        '&': '&amp;',
        '<': '&lt;',
        '>': '&gt;',
        '"': '&quot;',
        "'": '&#039;'
    };
    // Single pass over the string; each match is looked up in the map.
    return text.replace(/[&<>"']/g, function(m) { return map[m]; });
}
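To check the "identical results" claim on a sample string, both variants can be put side by side. This is only a sketch: the names escapeHtmlChained and escapeHtmlMapped are hypothetical, chosen so the two functions can coexist in one file, and the apostrophe is assumed to map to &#039; as in PHP's htmlspecialchars.

```javascript
// Chained variant: one global replace per character, "&" first.
function escapeHtmlChained(text) {
  return text
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&#039;');
}

// Map variant: a single pass with a lookup table.
function escapeHtmlMapped(text) {
  var map = { '&': '&amp;', '<': '&lt;', '>': '&gt;', '"': '&quot;', "'": '&#039;' };
  return text.replace(/[&<>"']/g, function (m) { return map[m]; });
}

var sample = 'Kip\'s <b>evil</b> "test" code\'s here & more';
console.log(escapeHtmlChained(sample) === escapeHtmlMapped(sample)); // true
console.log(escapeHtmlMapped(sample));
// Kip&#039;s &lt;b&gt;evil&lt;/b&gt; &quot;test&quot; code&#039;s here &amp; more
```

Because the map variant scans the string once, it cannot re-escape its own output, so the ordering concern of the chained version does not arise.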
For Node.JS users (or users utilizing Jade runtime in the browser), you can use Jade's escape function.
require('jade').runtime.escape(...);
No sense in writing it yourself if someone else is maintaining it. :)
Yet another take at this is to forgo all the character mapping altogether and to instead convert all unwanted characters into their respective numeric character references, e.g.:
function escapeHtml(raw) {
    return raw.replace(/[&<>"']/g, function onReplace(match) {
        return '&#' + match.charCodeAt(0) + ';';
    });
}
Note that the specified RegEx only handles the specific characters that the OP wanted to escape. Depending on the context in which the escaped HTML is going to be used, these characters may not be sufficient. Ryan Grove’s article There's more to HTML escaping than &, <, >, and " is a good read on the topic. Depending on your context, the following RegEx may well be needed in order to avoid XSS injection:
var regex = /[&<>"'` !@$%()=+{}[\]]/g
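As a rough sketch, the wider character class can be plugged into the same numeric-reference replacement. The function name escapeHtmlWide is hypothetical, and whether this set of characters is enough still depends on where the escaped text ends up.

```javascript
// Character class copied from the text above.
var regex = /[&<>"'` !@$%()=+{}[\]]/g;

// Hypothetical name; same technique as before, only the character set differs.
function escapeHtmlWide(raw) {
  return raw.replace(regex, function (match) {
    return '&#' + match.charCodeAt(0) + ';';
  });
}

console.log(escapeHtmlWide('a=`b`')); // a&#61;&#96;b&#96;
console.log(escapeHtmlWide('x y'));   // x&#32;y
```

Since the replacement is computed from charCodeAt, widening the set needs no new map entries.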
String.prototype.escapeHTML = function() {
    return this.replace(/&/g, "&amp;")
        .replace(/</g, "&lt;")
        .replace(/>/g, "&gt;")
        .replace(/"/g, "&quot;")
        .replace(/'/g, "&#039;");
};
Sample:
var toto = "test<br>";
alert(toto.escapeHTML());
I am elaborating a bit on o.k.w.'s answer.
You can use the browser's DOM functions for that.
var utils = {
    dummy: document.createElement('div'),
    escapeHTML: function(s) {
        this.dummy.textContent = s;
        return this.dummy.innerHTML;
    }
};
utils.escapeHTML('<escapeThis>&')
This returns &lt;escapeThis&gt;&amp;
It uses the standard function createElement to create an element that is never attached to the page, then sets the textContent property to store any string as its content, and finally reads innerHTML to get that content in its HTML representation.