How to print matched regex pattern using awk?

说谎

Using awk, I need to find a word in a file that matches a regex pattern.

I only want to print the word matched with the pattern.

So if

8 answers

时光说笑

2020-11-29 16:31
gawk can print the matching part of every line with this action:
```
{ if (match($0,/your regexp/,m)) print m[0] }
```
From the gawk manual, for match(string, regexp [, array]):

If array is present, it is cleared, and then the zeroth element of array is set to the entire portion of string matched by regexp. If regexp contains parentheses, the integer-indexed elements of array are set to contain the portion of string matching the corresponding parenthesized subexpression.

http://www.gnu.org/software/gawk/manual/gawk.html#String-Functions
梦如初夏

2020-11-29 16:42
If Perl is an option, you can try this:
```
perl -lne 'print $1 if /(regex)/' file
```
For case-insensitive matching, add the i modifier:
```
perl -lne 'print $1 if /(regex)/i' file
```
To print everything AFTER the match:
```
perl -lne 'if ($found){print} else{if (/regex(.*)/){print $1; $found++}}' textfile
```
To print the match and everything after the match:
```
perl -lne 'if ($found){print} else{if (/(regex.*)/){print $1; $found++}}' textfile
```
滥情空心

2020-11-29 16:43
If you are only interested in the last line of input and expect only one match (for example, part of the summary line of a shell command), you can also try this very compact code, adapted from "How to print regexp matches using `awk`?":
```
$ echo "xxx yyy zzz" | awk '{match($0,"yyy",a)}END{print a[0]}'
yyy
```
Or a more complex version that prints only a captured group:
```
$ echo "xxx=a yyy=b zzz=c" | awk '{match($0,"yyy=([^ ]+)",a)}END{print a[1]}'
b
```
Warning: the awk match() function with three arguments only exists in gawk, not in mawk
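
Where gawk is not available, the same extraction can be approximated with the two-argument match() plus RSTART and RLENGTH, which POSIX awk provides. The following is only a sketch: the function name extract_value is made up for illustration, the offset 4 is simply the length of the literal prefix "yyy=", and unlike the END variant above it prints the value for every matching line rather than only the last one.

```shell
# Portable sketch: extract the value that follows "yyy=" without gawk's
# three-argument match(). extract_value is a hypothetical helper name.
extract_value() {
    awk '{
        if (match($0, /yyy=[^ ]+/)) {
            # RSTART points at the "y" of "yyy="; skip those 4 characters
            print substr($0, RSTART + 4, RLENGTH - 4)
        }
    }'
}

echo "xxx=a yyy=b zzz=c" | extract_value
```

For the sample input this prints b, the same result as the gawk version.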

Here is another solution that uses a lookbehind regex in grep instead of awk. It does not need gawk, although the -P option requires a grep built with PCRE support, such as GNU grep:
```
$ echo "xxx=a yyy=b zzz=c" | grep -Po '(?<=yyy=)[^ ]+'
b
```
忘掉有多难

2020-11-29 16:44
It sounds like you are trying to emulate GNU grep's -o behaviour. This will do that, provided you only want the first match on each line:
```
awk 'match($0, /regex/) {
    print substr($0, RSTART, RLENGTH)
}
' file
```
Here's an example, using GNU's awk implementation (gawk):
```
awk 'match($0, /a.t/) {
    print substr($0, RSTART, RLENGTH)
}
' /usr/share/dict/words | head
act
act
act
act
aft
ant
apt
art
art
art
```
Read about match, substr, RSTART and RLENGTH in the awk manual.

After that you may wish to extend this to deal with multiple matches on the same line.
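
One possible way to handle multiple matches on a line is to keep calling match() on the remainder of the line after each hit. The sketch below assumes that approach. The function name print_all_matches and the variable names re and rest are made up for illustration. Because the regex is passed with -v, backslashes in it go through awk's string escape processing, and a pattern anchored with ^ would be re-tested against each remainder, so treat this as a starting point rather than a full grep -o replacement.

```shell
# Sketch: print every non-overlapping match on each line, one per output line.
# Uses only POSIX awk features (two-argument match, RSTART, RLENGTH, substr).
# print_all_matches is a hypothetical helper; $1 is the regex as a string.
print_all_matches() {
    awk -v re="$1" '{
        rest = $0
        while (match(rest, re)) {
            if (RLENGTH == 0) break            # avoid looping on empty matches
            print substr(rest, RSTART, RLENGTH)
            rest = substr(rest, RSTART + RLENGTH)   # continue after this match
        }
    }'
}

printf 'act apt art\nno hit\n' | print_all_matches 'a.t'
```

For the sample input this prints act, apt and art on separate lines, and nothing for the second line.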
清酒与你

2020-11-29 16:46
This is the most basic form:
```
awk '/pattern/{ print $0 }' file
```
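This asks awk to search for the pattern given between the slashes and then print the line, which awk calls a record and denotes by $0. The awk documentation covers this in detail.

If you only want to print the matched word, loop over the fields. Note that this compares each field with the literal string "yyy" rather than with a regex: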
```
awk '{for(i=1;i<=NF;i++){ if($i=="yyy"){print $i} } }' file
```
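
Since the question asks about a regex rather than a fixed string, the field loop can be adapted by replacing the == comparison with awk's ~ operator. This is a sketch under that assumption; the function name print_matching_fields and the variable re are made up for illustration, and it prints the whole field that contains a match, not just the matched portion.

```shell
# Sketch: print each whitespace-separated field that matches a regex.
# print_matching_fields is a hypothetical helper; $1 is the regex as a string.
print_matching_fields() {
    awk -v re="$1" '{
        for (i = 1; i <= NF; i++)
            if ($i ~ re) print $i      # regex test instead of string equality
    }'
}

printf 'xxx yyy zzz\nfoo yyz bar\n' | print_matching_fields '^yy'
```

For the sample input this prints yyy and yyz.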
深忆病人

2020-11-29 16:47
sed can also be elegant in this situation. This example replaces each line with the matched group "yyy" from that line:
```
$ cat testfile
xxx yyy zzz
yyy xxx zzz
$ cat testfile | sed -r 's#^.*(yyy).*$#\1#g'
yyy
yyy
```
Relevant manual page: https://www.gnu.org/software/sed/manual/sed.html#Back_002dreferences-and-Subexpressions
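
One thing to watch with the command above: lines that do not match are passed through unchanged. A common variant suppresses default output with -n and prints only substituted lines with the p flag. The sketch below assumes that variant and uses basic regex syntax so it does not depend on -r; the function name print_group_only is made up for illustration.

```shell
# Sketch: print only the captured group, and only for lines that match.
# -n turns off automatic printing; the trailing p prints substituted lines.
# print_group_only is a hypothetical helper name.
print_group_only() {
    sed -n 's/^.*\(yyy\).*$/\1/p'
}

printf 'xxx yyy zzz\nno match\nyyy xxx zzz\n' | print_group_only
```

For the sample input this prints yyy twice and skips the middle line.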