How to match content between specific HTML tags with an attribute using grep?

时光说笑 2020-12-31 07:43

Which regular expression should I use with grep to match the text contained between the tag <div class="Message"> and its closing tag </div>?

3 Answers
  • 2020-12-31 07:48

    You can't do it reliably with just grep. You need to parse the HTML with an HTML parser.

    What if the HTML code has something like:

    <!--
    <div class="Message">blah blah</div>
    -->
    

    You'll get a false hit on that commented-out code. Here are some other examples where a regex-only option will fail you.
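    The false hit is easy to reproduce. The sketch below assumes a throwaway sample file (the file and its contents are hypothetical) holding one live message and one commented-out message, and counts the lines that grep reports:

    ```shell
    # Hypothetical sample: one real div and one inside an HTML comment.
    sample=$(mktemp)
    printf '%s\n' \
      '<div class="Message">real message</div>' \
      '<!--' \
      '<div class="Message">blah blah</div>' \
      '-->' > "$sample"

    # grep works line by line and has no notion of HTML comments,
    # so the commented-out div is counted as well.
    hits=$(grep -c '<div class="Message">' "$sample")
    echo "matching lines: $hits"   # prints "matching lines: 2"

    rm -f "$sample"
    ```

    Only one of the two reported lines is live markup, which is the kind of mistake a parser avoids.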

    Consider using xmlgrep from the XML::Grep Perl module, as discussed here: Extract Title of a html file using grep
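    As a rough illustration of the parser-based approach, here is a sketch that swaps in xmllint from libxml2 instead of the Perl tool named above. It assumes xmllint is installed and that the messages are plain text directly inside the div; the sample file is hypothetical.

    ```shell
    # Hypothetical sample: one real div and one inside an HTML comment.
    sample=$(mktemp)
    printf '%s\n' \
      '<div class="Message">real message</div>' \
      '<!--' \
      '<div class="Message">blah blah</div>' \
      '-->' > "$sample"

    if command -v xmllint >/dev/null 2>&1; then
      # --html selects libxml2's HTML parser; --xpath evaluates the expression
      # against the parsed tree. The commented-out div is a comment node, not
      # an element, so the XPath expression does not select it.
      text=$(xmllint --html --xpath '//div[@class="Message"]/text()' "$sample" 2>/dev/null)
      echo "$text"
    else
      echo "xmllint not installed; skipping" >&2
    fi

    rm -f "$sample"
    ```

    The XPath expression matches the class attribute exactly, so a div with several classes would need a different test such as contains().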

  • 2020-12-31 07:54

    You can do that by specifying a regex:

    grep -E "^<div class=\"Message\">.*</div>$" input_files
    

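    Note that this only prints elements that sit entirely on one line. If your tag spans multiple lines, you can join the lines first. The ^ and $ anchors have to go in that case, because they would otherwise need to match the start and end of the whole file; [^<]* keeps each match from running on to a later </div>, at the cost of missing messages that contain nested tags:

    tr '\n' ' ' < input_file | grep -oE "<div class=\"Message\">[^<]*</div>"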
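    To see how much the ^ and $ anchors restrict matching, here is a small sketch on a hypothetical two-line sample. The unanchored pattern with -o and [^<]* is my own variation for comparison and assumes the message contains no nested tags.

    ```shell
    # Hypothetical sample: one div alone on its line, one embedded in other markup.
    sample=$(mktemp)
    printf '%s\n' \
      '<div class="Message">alone</div>' \
      '<p><div class="Message">embedded</div></p>' > "$sample"

    # Anchored: only the line that consists of nothing but the div matches.
    anchored=$(grep -cE '^<div class="Message">.*</div>$' "$sample")

    # Unanchored with -o: each div is printed, wherever it sits on the line.
    unanchored=$(grep -oE '<div class="Message">[^<]*</div>' "$sample" | wc -l | tr -d ' ')

    echo "anchored: $anchored, unanchored: $unanchored"   # prints "anchored: 1, unanchored: 2"

    rm -f "$sample"
    ```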
    
  • 2020-12-31 07:59

    Here's one way using GNU grep:

    grep -oP '(?<=<div class="Message"> ).*?(?= </div>)' file
    

    If your tags span multiple lines, try:

    < file tr -d '\n' | grep -oP '(?<=<div class="Message"> ).*?(?= </div>)'
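    A quick way to check both commands is to feed them inline input. The sketch below assumes GNU grep built with PCRE support (-P) and uses made-up message text; note that the pattern expects a space after the opening tag and before the closing tag.

    ```shell
    # Single-line case: the lookbehind and lookahead keep the tags
    # out of the printed match.
    single=$(printf '%s\n' '<div class="Message"> hello world </div>' \
      | grep -oP '(?<=<div class="Message"> ).*?(?= </div>)')

    # Multi-line case: tr -d '\n' deletes the newlines, so text from
    # adjacent lines is joined with nothing in between.
    joined=$(printf '%s\n' '<div class="Message"> first' 'second </div>' \
      | tr -d '\n' \
      | grep -oP '(?<=<div class="Message"> ).*?(?= </div>)')

    echo "single: $single"   # prints "single: hello world"
    echo "joined: $joined"   # prints "joined: firstsecond"
    ```

    If the fused words are a problem, replacing the newlines with spaces (tr '\n' ' ') instead of deleting them keeps the lines apart.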
    