Parse an HTML string with JS

逝去的感伤 2020-11-21 07:17

I searched for a solution but nothing was relevant, so here is my problem:

I want to parse a string which contains HTML text. I want to do it in JavaScript.

10 Answers
  • 2020-11-21 07:34

    The fastest way to parse HTML in Chrome and Firefox is Range#createContextualFragment:

    var range = document.createRange();
    range.selectNode(document.body); // required in Safari
    var fragment = range.createContextualFragment('<h1>html...</h1>');
    var firstNode = fragment.firstChild;
    

    I would recommend creating a helper function that uses createContextualFragment if available and falls back to innerHTML otherwise.
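
    A minimal sketch of such a helper, under a few assumptions: the name parseFragment and the optional doc parameter are hypothetical (doc defaults to the global document and exists only so the function can be exercised outside a browser), and the fallback copies the parsed nodes into a DocumentFragment so that both paths return the same kind of object.

```javascript
// Hypothetical helper: parse an HTML string into a DocumentFragment.
// `doc` is an assumed, optional parameter that defaults to the global document.
function parseFragment(html, doc = document) {
    var range = typeof doc.createRange === 'function' ? doc.createRange() : null;

    // Preferred path: Range#createContextualFragment, when the browser has it.
    if (range && typeof range.createContextualFragment === 'function') {
        range.selectNode(doc.body); // required in Safari
        return range.createContextualFragment(html);
    }

    // Fallback path: let innerHTML parse the string inside a detached <div>,
    // then move the resulting nodes into a fragment.
    var container = doc.createElement('div');
    container.innerHTML = html;
    var fragment = doc.createDocumentFragment();
    while (container.firstChild) {
        fragment.appendChild(container.firstChild);
    }
    return fragment;
}
```

    In a browser it would be called as parseFragment('<h1>html...</h1>').firstChild. Both paths parse in a body/div context, so they share the <td> limitation described in a later answer.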

    Benchmark: http://jsperf.com/domparser-vs-createelement-innerhtml/3

  • 2020-11-21 07:34
    // Requires jQuery: load the string into a detached <div>, then read its text content.
    let content = "<center><h1>404 Not Found</h1></center>";
    let result = $("<div/>").html(content).text();
    

    content: <center><h1>404 Not Found</h1></center>,
    result: "404 Not Found"

  • 2020-11-21 07:46
    // `html` is the string to parse; parseFromString returns a full Document.
    var doc = new DOMParser().parseFromString(html, "text/html");
    var links = doc.querySelectorAll("a");
    
  • 2020-11-21 07:47

    EDIT: The solution below is only for HTML "fragments", since the html, head and body tags are removed. For full documents, I believe the right answer to this question is DOMParser's parseFromString() method.


    For HTML fragments, the solutions listed here work for most HTML, but they fail in certain cases.

    For example, try parsing <td>Test</td>. It fails with the div.innerHTML, DOMParser.prototype.parseFromString and range.createContextualFragment solutions alike: a td is only valid inside a table row, so the parser drops the tag and only the text remains.

    Only jQuery handles that case well.

    So the forward-looking solution (supported from MS Edge 13 onward) is to use the template tag:

    function parseHTML(html) {
        var t = document.createElement('template');
        t.innerHTML = html;
        return t.content.cloneNode(true);
    }
    
    var documentFragment = parseHTML('<td>Test</td>');
    

    For older browsers, I have extracted jQuery's parseHTML() method into a standalone gist: https://gist.github.com/Munawwar/6e6362dbdf77c7865a99
