Regular expression to get a class name from html

温柔的废话 2021-01-13 09:39

I know my question might look like a duplicate of this question, but it's not.
I am trying to match a class name inside HTML text that comes from the

5 answers
  • 2021-01-13 10:15

    Regular expressions are not a good fit for parsing HTML. HTML is not regular.

    jQuery can be a very good fit here.

    var html = 'Your HTML here...';
    
    $('<div>' + html + '</div>').find('[class~="b"]').each(function () {
        console.log(this);
    });
    

    The selector [class~="b"] selects any element whose class attribute contains b as a whole, whitespace-separated word. The initial HTML is wrapped inside a div because find only searches descendants, so without the wrapper any top-level element in the HTML would be skipped.
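
    To make the "whole word" behavior of ~= concrete, here is a minimal sketch in plain JavaScript that roughly mimics it for a single attribute value. The helper name classAttrHasWord is hypothetical and not part of jQuery or the DOM; it only illustrates the matching rule.

```javascript
// Hypothetical helper: approximates how [class~="word"] treats a class attribute value.
function classAttrHasWord(classAttr, word) {
    // ~= never matches an empty word or a word that contains whitespace.
    if (word === '' || /\s/.test(word)) {
        return false;
    }
    // Split on runs of whitespace and look for an exact token match.
    return classAttr.split(/\s+/).indexOf(word) !== -1;
}

console.log(classAttrHasWord('a b c d', 'b'));  // true
console.log(classAttrHasWord('abcd', 'b'));     // false
console.log(classAttrHasWord('a-b c', 'b'));    // false
```

    Unlike a regex with \b, this does not treat the hyphen in a-b as a word boundary, which is closer to how class names actually work.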

  • 2021-01-13 10:16

    This may not be a solution for you, but if you aren't set on using a full regex match, you could do the following (assuming your examples are representative of the data you will be parsing):

    function hasTheClass(html_string, classname) {
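        // Take the text after the first "=", then the part between the first pair of quotes,
        // and split it on spaces to get the list of class names.
        // This only works when class is the first attribute of the tag.
        // !!~ turns -1 into false, and anything else into true.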
        return !!~html_string.split("=")[1].split(/[\'\"]/)[1].split(" ").indexOf(classname);
    }
    
    hasTheClass("<div class='a b c d'></div>", 'b'); //returns true
    
  • 2021-01-13 10:25

    Test it here: https://regex101.com/r/vnOFjm/1

    regexp: (?:class|className)=(?:["']\W+\s*(?:\w+)\()?["']([^'"]+)['"]

    const regex = /(?:class|className)=(?:["']\W+\s*(?:\w+)\()?["']([^'"]+)['"]/gmi;
    const str = `<div id="content" class="container">
    
    <div style="overflow:hidden;margin-top:30px">
      <div style="width:300px;height:250px;float:left">
    <ins class="adsbygoogle turbo" style="display:inline-block !important;width:300px;min-height:250px; display: none !important;" data-ad-client="ca-pub-1904398025977193" data-ad-slot="4723729075" data-color-link="2244BB" qgdsrhu="" hidden=""></ins>
    
    
    <img src="http://static.teleman.pl/images/pixel.gif?show,753804,20160812" alt="" width="0" height="0" hidden="" style="display: none !important;">
    </div>`;
    
    let m;
    
    while ((m = regex.exec(str)) !== null) {
        // This is necessary to avoid infinite loops with zero-width matches
        if (m.index === regex.lastIndex) {
            regex.lastIndex++;
        }
        
        // The result can be accessed through the `m`-variable.
        m.forEach((match, groupIndex) => {
            console.log(`Found match, group ${groupIndex}: ${match}`);
        });
    }
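
    As a follow-up sketch, the captured group can be split into individual class names so that a specific class can be checked. The helper name collectClassNames and the shortened sample string are assumptions for illustration; the pattern is the one above, used with only the g and i flags (m has no effect here because the pattern has no ^ or $ anchors).

```javascript
// The pattern from the answer above, with only the "g" and "i" flags.
const classAttrRegex = /(?:class|className)=(?:["']\W+\s*(?:\w+)\()?["']([^'"]+)['"]/gi;

// Hypothetical helper: returns every class name found in the markup.
function collectClassNames(html) {
    const names = [];
    for (const match of html.matchAll(classAttrRegex)) {
        // match[1] is the raw attribute value, e.g. "adsbygoogle turbo".
        for (const name of match[1].trim().split(/\s+/)) {
            if (name) {
                names.push(name);
            }
        }
    }
    return names;
}

const sample = '<div id="content" class="container"><ins class="adsbygoogle turbo"></ins></div>';
const found = collectClassNames(sample);
console.log(found.join(','));          // container,adsbygoogle,turbo
console.log(found.includes('turbo'));  // true
```

    matchAll works on a copy of the regex, so the manual lastIndex handling from the loop above is not needed in this variant.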

  • 2021-01-13 10:28

    Use the browser to your advantage:

    var str = '<div class=\'a b c d\'></div>\
    <!-- or -->\
    <div class="a b c d"></div>\
    <!-- There might be spaces after and before the = (the equal sign) -->';
    
    var wrapper = document.createElement('div');
    wrapper.innerHTML = str;
    
    var elements = wrapper.getElementsByClassName('b');
    
    if (elements.length) {
        // there are elements with class b
    }
    


    By the way, getElementsByClassName() is not supported in Internet Explorer before version 9; check this answer for an alternative.

  • 2021-01-13 10:32

    Using a regex, this pattern should work for you:

    var r = new RegExp("(<\\w+?\\s+?class\\s*=\\s*['\"][^'\"]*?\\b)" + key + "\\b", "i");
    // `key` is the class name to look for.
    // The parentheses form a capturing group spanning everything before the class name;
    // it is available as $1 in the replacement string (see the examples).
    // The "i" flag makes matching case-insensitive; omit it for case-sensitive matching.
    

    E.g.

    var oldClass = "b";
    var newClass = "e";
    var r = new RegExp("..." + oldClass + "...");
    
    "<div class='a b c d'></div>".replace(r, "$1" + newClass);
        // ^-- returns: <div class='a e c d'></div>
    "<div class=\"a b c d\"></div>".replace(r, "$1" + newClass);
        // ^-- returns: <div class="a e c d"></div>    
    "<div class='abcd'></div>".replace(r, "$1" + newClass);
        // ^-- returns: <div class='abcd'></div>     // <-- NO change
    

    NOTE:
    For the above regex to work there must be no ' or " inside the class string.
    I.e. <div class="a 'b' c d"... will NOT match.
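
    For reference, here is a self-contained sketch that spells the pattern out in full and wraps the replacement in a function. The helper names escapeRegExp, buildClassRegExp and replaceClass are hypothetical, and escaping the class name is an added precaution that is not part of the pattern above. A replacer function is used instead of the "$1" string so that a $ in the new class name is not treated specially.

```javascript
// Hypothetical helper: escapes regex metacharacters in a class name.
function escapeRegExp(text) {
    return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Hypothetical helper: builds the pattern from this answer for one class name.
function buildClassRegExp(key) {
    return new RegExp(
        "(<\\w+?\\s+?class\\s*=\\s*['\"][^'\"]*?\\b)" + escapeRegExp(key) + "\\b",
        "i"
    );
}

// Replaces the first occurrence of oldClass in a class attribute with newClass.
function replaceClass(html, oldClass, newClass) {
    // `prefix` is the first capturing group, i.e. what "$1" refers to above.
    return html.replace(buildClassRegExp(oldClass), function (match, prefix) {
        return prefix + newClass;
    });
}

console.log(replaceClass("<div class='a b c d'></div>", "b", "e"));
// <div class='a e c d'></div>
console.log(replaceClass("<div class='abcd'></div>", "b", "e"));
// <div class='abcd'></div>   (no change)
```

    Like the original pattern, this sketch only matches when class is the first attribute of the tag, and \b treats a hyphen as a word boundary, so b would also match inside a class such as b-x.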
