I searched for a solution but nothing was relevant, so here is my problem:
I want to parse a string which contains HTML text. I want to do it in JavaScript.
The fastest way to parse HTML in Chrome and Firefox is Range#createContextualFragment:
var range = document.createRange();
range.selectNode(document.body); // required in Safari
var fragment = range.createContextualFragment('<h1>html...</h1>');
var firstNode = fragment.firstChild;
I would recommend creating a helper function that uses createContextualFragment if available and falls back to innerHTML otherwise.
Benchmark: http://jsperf.com/domparser-vs-createelement-innerhtml/3
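The helper recommended above could look roughly like the sketch below. The name parseFragment and the optional doc parameter are my own assumptions, not part of any API. The doc parameter is only there so that a different document object can be passed in, and it defaults to the global document.

```javascript
// Hypothetical helper (the name and the `doc` parameter are assumptions).
// Uses Range#createContextualFragment when present, otherwise innerHTML.
function parseFragment(html, doc) {
  doc = doc || document;
  var range = doc.createRange ? doc.createRange() : null;
  if (range && typeof range.createContextualFragment === 'function') {
    range.selectNode(doc.body); // context node, required in Safari
    return range.createContextualFragment(html);
  }
  // Fallback: let a throwaway div parse the markup,
  // then move its children into a DocumentFragment.
  var div = doc.createElement('div');
  div.innerHTML = html;
  var fragment = doc.createDocumentFragment();
  while (div.firstChild) {
    fragment.appendChild(div.firstChild);
  }
  return fragment;
}
```

In a browser you would call it as var firstNode = parseFragment('<h1>html...</h1>').firstChild; and both paths return a DocumentFragment, so the caller does not need to know which one ran.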
let content = "<center><h1>404 Not Found</h1></center>"
let result = $("<div/>").html(content).text()
content: <center><h1>404 Not Found</h1></center>
result: "404 Not Found"
var doc = new DOMParser().parseFromString(html, "text/html");
var links = doc.querySelectorAll("a");
EDIT: The solution below is only for HTML "fragments", since the html, head and body elements are removed. I guess the solution for this question is DOMParser's parseFromString() method.
For HTML fragments, the solutions listed here work for most HTML, but in certain cases they fail.
For example, try parsing <td>Test</td>. It won't work with the div.innerHTML solution, with DOMParser.prototype.parseFromString, or with the range.createContextualFragment solution: the td tag goes missing and only the text remains.
Only jQuery handles that case well.
So the future solution (MS Edge 13+) is to use the template tag:
function parseHTML(html) {
    var t = document.createElement('template');
    t.innerHTML = html;
    return t.content.cloneNode(true);
}
var documentFragment = parseHTML('<td>Test</td>');
For older browsers I have extracted jQuery's parseHTML() method into an independent gist - https://gist.github.com/Munawwar/6e6362dbdf77c7865a99
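If you need both paths in one place, something like the following sketch might do. parseHTMLCompat, legacyParse and doc are hypothetical names I made up. legacyParse stands in for whatever older-browser parser you plug in (for example a function from the gist), and I am not assuming anything about its actual API. The 'content' in t check is the usual way to detect template support.

```javascript
// Hypothetical wrapper (all names here are assumptions for illustration).
// Uses <template> when the browser supports it, else the supplied fallback.
function parseHTMLCompat(html, legacyParse, doc) {
  doc = doc || document;
  var t = doc.createElement('template');
  if ('content' in t) {
    // <template> is supported: its content is a DocumentFragment.
    t.innerHTML = html;
    return t.content.cloneNode(true);
  }
  // No <template> support: defer to the caller-supplied parser.
  return legacyParse(html);
}
```

The fallback is passed in as an argument so the wrapper does not hard-code any particular library.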