How to match content between HTML specific tags with attribute using grep?

Which regular expression should I use with the command grep if I wanted to match the text contained within the tag `<div class="Message">` and its closing tag `</div>` in an HTML file?

3 Answers

野性不改

2020-12-31 07:48
You can't do it reliably with just grep. You need to parse the HTML with an HTML parser.

What if the HTML code has something like:
```
<!-- <div class="Message">blah blah</div> -->
```
You'll get a false hit on that commented-out code. Here are some other examples where a regex-only option will fail you.

Consider using xmlgrep from the XML::Grep Perl module, as discussed here: Extract Title of a html file using grep
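
For readers without the Perl module at hand, the same parser-based idea can be sketched with Python's standard-library `html.parser`. This is only an illustration, not the tool recommended above: the class name `MessageExtractor` and the helper `extract_messages` are made up for this example, and it assumes the target is any `div` whose `class` attribute contains `Message`.

```python
from html.parser import HTMLParser


class MessageExtractor(HTMLParser):
    """Collect the text inside every <div class="Message"> element (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.depth = 0       # <div> nesting depth inside a matching div; 0 means outside
        self.buffer = []     # text pieces of the div currently being read
        self.messages = []   # finished results, one string per matching div

    def handle_starttag(self, tag, attrs):
        if tag != "div":
            return
        if self.depth:
            self.depth += 1  # a nested <div> inside a Message div
        elif "Message" in (dict(attrs).get("class") or "").split():
            self.depth = 1   # entering a Message div

    def handle_endtag(self, tag):
        if tag == "div" and self.depth:
            self.depth -= 1
            if self.depth == 0:  # the Message div itself just closed
                self.messages.append("".join(self.buffer).strip())
                self.buffer = []

    def handle_data(self, data):
        if self.depth:
            self.buffer.append(data)


def extract_messages(html):
    """Return the text content of each <div class="Message"> in the given HTML string."""
    parser = MessageExtractor()
    parser.feed(html)
    parser.close()
    return parser.messages


sample = """<!-- <div class="Message">old text</div> -->
<div class="Message">
  Hello, <b>world</b>
</div>"""
print(extract_messages(sample))  # ['Hello, world']
```

Because the parser treats the commented-out markup as a comment, it is skipped, and a tag that spans several lines needs no special handling.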
清酒与你

2020-12-31 07:54
You can do that by specifying a regex:
```
grep -E "^<div class=\"Message\">.*</div>$" input_files
```
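Note that this only matches when the opening and closing tags are on the same line (and, because of the `^` and `$` anchors, only when that line starts with the opening tag and ends with the closing one). If your tag spans multiple lines, you can try: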
```
tr '\n' ' ' < input_file | grep -E "^<div class=\"Message\">.*</div>$"
```
离开以前

2020-12-31 07:59
Here's one way using GNU grep:
```
grep -oP '(?<=<div class="Message"> ).*?(?= </div>)' file
```
If your tags span multiple lines, try:
```
< file tr -d '\n' | grep -oP '(?<=<div class="Message"> ).*?(?= </div>)'
```