Which regular expression should I use with the command grep if I wanted to match the text contained within the tag <div class="Message">?
You can't do it reliably with just grep. You need to parse the HTML with an HTML parser.
What if the HTML code has something like:
<!--
<div class="Message">blah blah</div>
-->
You'll get a false hit on that commented-out code. Here are some other examples where a regex-only option will fail you.
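To make the false hit concrete, here is a small sketch that uses Python's re module as a stand-in for grep (an assumption for illustration; the pattern and the sample HTML are made up, modelled on the snippet above). It shows the commented-out div being matched alongside a real one.

```python
import re

# Python's re stands in for grep here; the pattern mirrors a grep-style regex.
pattern = re.compile(r'<div class="Message">(.*?)</div>')

# Hypothetical input: one commented-out div and one live div.
html = """<!--
<div class="Message">blah blah</div>
-->
<div class="Message">real message</div>
"""

# The regex has no notion of HTML comments, so both divs are returned.
matches = pattern.findall(html)
print(matches)
```

The regex sees only characters, not document structure, so it cannot tell that the first div is inside a comment.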
Consider using xmlgrep from the XML::Grep Perl module, as discussed here: Extract Title of a html file using grep
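As a rough illustration of the parser approach in general (not of xmlgrep itself), here is a minimal sketch using Python's standard-library html.parser. The names MessageExtractor and extract_messages are hypothetical, and the sample input is made up; the sketch assumes you want the text of every div whose class list contains Message.

```python
from html.parser import HTMLParser


class MessageExtractor(HTMLParser):
    """Collect the text inside every <div class="Message"> element."""

    def __init__(self):
        super().__init__()
        self.depth = 0      # div nesting depth inside a Message div; 0 means outside
        self.current = []   # text pieces of the div currently being read
        self.messages = []  # finished results

    def handle_starttag(self, tag, attrs):
        if tag != "div":
            return
        if self.depth > 0:
            self.depth += 1  # a nested div inside a Message div
        elif "Message" in (dict(attrs).get("class") or "").split():
            self.depth = 1
            self.current = []

    def handle_endtag(self, tag):
        if tag == "div" and self.depth > 0:
            self.depth -= 1
            if self.depth == 0:
                self.messages.append("".join(self.current).strip())

    def handle_data(self, data):
        if self.depth > 0:
            self.current.append(data)


def extract_messages(html):
    parser = MessageExtractor()
    parser.feed(html)
    parser.close()
    return parser.messages


# Hypothetical input: a commented-out div and a live div spanning two lines.
sample = """
<!--
<div class="Message">commented out</div>
-->
<div class="Message">blah
blah</div>
"""
print(extract_messages(sample))
```

Because the parser treats the comment as a comment, the commented-out div is never reported, and a div that spans several lines needs no special handling.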
You can do that by specifying a regex:
grep -E "^<div class=\"Message\">.*</div>$" input_files
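Note that this only prints matches where the whole element sits on a single line by itself. If your tag spans multiple lines, you can join the lines first. In that case drop the ^ and $ anchors, since the whole file becomes one line, and add -o to print only the matched part. Because .* is greedy, the match runs from the first opening tag to the last </div> if the file contains several:
tr '\n' ' ' < input_file | grep -oE "<div class=\"Message\">.*</div>"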
Here's one way using GNU grep:
grep -oP '(?<=<div class="Message"> ).*?(?= </div>)' file
If your tags span multiple lines, try:
< file tr -d '\n' | grep -oP '(?<=<div class="Message"> ).*?(?= </div>)'