Find/Replace regex to remove HTML tags

别那么骄傲 2021-02-01 22:17

Using find and replace, what regex would remove the tags surrounding something like this:

5 Answers
  • 2021-02-01 23:03
    String s = "<option value=\"863\">Viticulture and Enology</option>";
    s.replaceAll ("(<option value=\"[0-9]+\">)([^<]+)</option>", "$2")
    res1: java.lang.String = Viticulture and Enology
    

    (Tested in the Scala REPL, hence the res1: prefix.)

    With sed, the syntax is slightly different:

    echo '<option value="863">Viticulture and Enology</option>'|sed -re 's|(<option value="[0-9]+">)([^<]+)</option>|\2|'
    

    For Notepad++ I don't know the details, but "[0-9]+" means 'at least one digit' and "[^<]+" means 'one or more characters other than an opening less-than sign'. Escaping and backreference syntax may differ. Regexes are problematic for HTML in general: a line-based pattern will miss a tag that spans several lines, and a regex cannot tell that a tag has been commented out.

    However, a lot of HTML is generated in a regex-friendly way, with each element on a single line and nothing commented out. Or you are writing throwaway code and can check your input beforehand.

  • 2021-02-01 23:06

    This works for me in Notepad++ 5.8.6 (UNICODE):

    search : <option value="\d+">(.*?)</option>

    replace : $1

    Be sure to select "Regular expression" and ". matches newline".
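
For anyone doing the same replacement outside Notepad++, here is a minimal Java sketch of the search/replace pair above. The class name `OptionUnwrap` and the method `unwrap` are made up for illustration, and `Pattern.DOTALL` is my assumption of the closest Java equivalent to the ". matches newline" checkbox.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of any library.
class OptionUnwrap {
    // Same pattern as the Notepad++ search; DOTALL lets '.' match line breaks too.
    static final Pattern OPTION =
        Pattern.compile("<option value=\"\\d+\">(.*?)</option>", Pattern.DOTALL);

    // Replace each <option value="N">text</option> with just its text (group 1).
    static String unwrap(String html) {
        return OPTION.matcher(html).replaceAll("$1");
    }

    public static void main(String[] args) {
        // prints: Viticulture and Enology
        System.out.println(unwrap("<option value=\"863\">Viticulture and Enology</option>"));
    }
}
```

The lazy `(.*?)` matters once several options share a line or DOTALL is on, because it stops at the first `</option>` instead of the last one.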

  • 2021-02-01 23:07

    This works perfectly for me:

    • Select "Regular Expression" in "Find" Mode.
    • Enter [<].*?> in "Find What" field and leave the "Replace With" field empty.
    • Note that you need to have version 5.9 of Notepad++ for the ? operator to work.

    as found here: digoCOdigo - strip html tags in notepad++
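
The same "replace every tag with nothing" idea can be sketched in Java, if that is closer to what you need. `TagStripper` and `stripTags` are names I made up for this example; the pattern is the one from the steps above.

```java
// Hypothetical helper, not part of any library.
class TagStripper {
    // Remove everything from a '<' up to the next '>' (lazy match), i.e. each tag.
    static String stripTags(String html) {
        return html.replaceAll("[<].*?>", "");
    }

    public static void main(String[] args) {
        // prints: Hello world
        System.out.println(stripTags("<p>Hello <b>world</b></p>"));
    }
}
```

This is only a sketch for simple, well-behaved markup: by default `.` does not match line breaks in Java, so a tag split across lines is left alone, and a `>` inside an attribute value would end the match early.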

  • 2021-02-01 23:17

    I did it using the following regular expression:

    Find what: <.*?>|</.*?>

    and

    Replace with: \r\n (a new line)

    With this regular expression (<.*?>|</.*?>) we can easily extract the values between the HTML tags, as shown below:

    Given this input:

    <option value="123">1</option><option value="1234">2</option><option value="1235">3</option><option value="1236">4</option><option value="1237">5</option> 
    

    I need to extract the values between the option tags: 1, 2, 3, 4, 5

    and got the output below:

  • 2021-02-01 23:22

    Something like this would work (as long as you know the format of the HTML won't change):

    <option value="(\d+)">(.+)</option>
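
One thing to watch with this pattern is that `(.+)` is greedy. The Java sketch below compares it with a lazy `(.+?)` variant. The class and method names are hypothetical, and the replacement `$2` (keep only the text between the tags) is my assumption, since no replacement string is given above.

```java
// Hypothetical comparison, not part of any library.
class GreedyVsLazy {
    // Pattern as written above: (.+) grabs as much as it can.
    static String greedy(String html) {
        return html.replaceAll("<option value=\"(\\d+)\">(.+)</option>", "$2");
    }

    // Lazy variant: (.+?) stops at the first closing tag.
    static String lazy(String html) {
        return html.replaceAll("<option value=\"(\\d+)\">(.+?)</option>", "$2");
    }

    public static void main(String[] args) {
        String twoOnOneLine = "<option value=\"1\">a</option><option value=\"2\">b</option>";
        // prints: a</option><option value="2">b
        System.out.println(greedy(twoOnOneLine));
        // prints: ab
        System.out.println(lazy(twoOnOneLine));
    }
}
```

With one option per line both versions give the same result, so the greedy form is fine as long as the format really does not change.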
    