Using regular expressions to find img tags without an alt attribute

伪装坚强ぢ 2021-01-30 11:37

I am going through a large website (1600+ pages) to make it pass Priority 1 W3C WAI. As a result, things like image tags need to have alt attributes.

What would be the

8 answers
  • 2021-01-30 12:14

    This is perfectly possible with the following regex:

    <img([^a]|a[^l]|al[^t]|alt[^=])*?/>
    

    Looking for something that isn't there is rather tricky, but we can work around it by matching a group that doesn't start with 'a', or an 'a' that isn't followed by an 'l', and so on, so that the sequence 'alt=' can never be consumed.
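
As a rough sketch of how this pattern behaves, here it is run through Python's `re` module. The choice of Python, the name `NO_ALT_SELF_CLOSING`, and the sample tags are assumptions for illustration and are not part of the original answer.

```python
import re

# Pattern from the answer: each repetition consumes either one character
# that is not 'a', or a prefix 'a' / 'al' / 'alt' followed by a character
# that breaks the sequence "alt=". The tag must end with "/>".
NO_ALT_SELF_CLOSING = re.compile(r"<img([^a]|a[^l]|al[^t]|alt[^=])*?/>")

# Hypothetical sample tags.
samples = [
    '<img src="foo.jpg" />',         # self-closing, no alt
    '<img src="foo.jpg" alt="" />',  # self-closing, has alt
    '<img src="foo.jpg">',           # not self-closing, no alt
]

results = [bool(NO_ALT_SELF_CLOSING.search(s)) for s in samples]
print(results)
```

Note that the third sample is not reported even though it lacks an alt attribute, because the pattern only matches tags that end in `/>`.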

  • 2021-01-30 12:17

    Here is what I just tried in my own environment on a massive enterprise code base, with good success (it found no false positives and definitely found valid cases):

    <img(?![^>]*\balt=)[^>]*?>
    

    What's going on in this search:

    1. find the opening of the tag
    2. assert, with a negative lookahead, that the following is absent: zero or more characters that are not the closing bracket, followed by …
    3. a word that begins with "alt" ("\b" is there to make sure we don't get a mid-word match on something like a class value) and is followed by "=", then …
    4. look for zero or more characters that are not the closing bracket
    5. find the closing bracket
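
The steps above can be sketched by writing the same pattern in verbose mode, one step per line. This assumes Python's `re` module; the names `IMG_WITHOUT_ALT` and `COMPACT` are illustrative only.

```python
import re

# The pattern from the answer, spelled out with re.VERBOSE so that each
# numbered step sits on its own line. Whitespace and '#' comments inside
# the pattern are ignored by the regex engine in this mode.
IMG_WITHOUT_ALT = re.compile(
    r"""
    <img          # 1. the opening of the tag
    (?!           # 2. negative lookahead: fail if the following can match
        [^>]*     #    zero or more characters that are not the closing bracket
        \balt=    # 3. then a word starting with alt, followed by an equals sign
    )
    [^>]*?        # 4. zero or more characters that are not the closing bracket
    >             # 5. the closing bracket
    """,
    re.VERBOSE,
)

# The compact, single-line form from the answer, for comparison.
COMPACT = re.compile(r"<img(?![^>]*\balt=)[^>]*?>")

tag = '<img src="foo.jpg" class="baltic" />'
print(IMG_WITHOUT_ALT.search(tag) is not None)
print(COMPACT.search(tag) is not None)
```

Because the lookahead consumes no characters, step 4 starts again right after `<img`, so the whole tag is still captured when the check passes.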

    So this will match:

    <img src="foo.jpg" class="baltic" />
    

    But it won't match either of these:

    <img src="foo.jpg" class="baltic" alt="" />
    <img src="foo.jpg" alt="I have a value.">
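
For a site with many pages, it may help to report where each offending tag sits. Below is a minimal sketch, assuming Python's `re` module; `find_imgs_without_alt` is a hypothetical helper, and the fourth sample line is made up for illustration.

```python
import re

IMG_WITHOUT_ALT = re.compile(r"<img(?![^>]*\balt=)[^>]*?>")

def find_imgs_without_alt(html):
    """Return (line_number, tag_text) for each <img> tag lacking alt=."""
    hits = []
    for match in IMG_WITHOUT_ALT.finditer(html):
        # Line number = newlines before the match start, plus one.
        line_number = html.count("\n", 0, match.start()) + 1
        hits.append((line_number, match.group(0)))
    return hits

# The three tags from the answer, plus one hypothetical non-self-closing tag.
page = "\n".join([
    '<img src="foo.jpg" class="baltic" />',
    '<img src="foo.jpg" class="baltic" alt="" />',
    '<img src="foo.jpg" alt="I have a value.">',
    '<img src="bar.jpg">',
])

hits = find_imgs_without_alt(page)
for line_number, tag_text in hits:
    print(line_number, tag_text)
```

This is a text-level check only. It does not parse the HTML, so markup inside comments or scripts would be scanned as well.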
    