How to use sed, awk, or gawk to print only what is matched?


长情又很酷

I see lots of examples and man pages on how to do things like search-and-replace using sed, awk, or gawk.

But in my case, I have a regular expression that I want to run

11 answers

既然无缘

2021-01-30 06:31
I use Perl to make this easier for myself, e.g.:
```
perl -ne 'print "$1\n" if /.*abc([0-9]+)xyz.*/'
```
This runs Perl. The -n option tells Perl to read its input one line at a time (from STDIN by default) and run the code against each line. The -e option specifies the code to run.

The code applies a regexp to each line read and, if it matches, prints the contents of the first set of parentheses ($1).

You can also do this with multiple file names on the end, e.g.:

```
perl -ne 'print "$1\n" if /.*abc([0-9]+)xyz.*/' example1.txt example2.txt
```
灰色年华

2021-01-30 06:32
```
gawk '/.*abc([0-9]+)xyz.*/' file
```
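Note that a bare pattern like the one above prints each matching line in full, not just the digits. A minimal sketch of one way to print only the matched number, which should work in any POSIX awk (gawk included), is to call match(), which sets RSTART and RLENGTH, and then cut the digits out with substr(). The helper name extract_num and the sample input are illustrative assumptions, not part of the original answer.

```shell
# extract_num: hypothetical helper; reads lines on stdin and prints only the
# digits found between "abc" and "xyz" (leftmost occurrence per line).
extract_num() {
    awk 'match($0, /abc[0-9]+xyz/) {
        # RSTART/RLENGTH describe the whole match, so skip the 3-character
        # "abc" prefix and drop the 3-character "xyz" suffix.
        print substr($0, RSTART + 3, RLENGTH - 6)
    }'
}

printf 'a\nabc12345xyz\nb\n' | extract_num
```

The offsets 3 and 6 are tied to the lengths of the literal delimiters, so they would need adjusting for a different pattern.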
盖世英雄少女心

2021-01-30 06:35
If you want to select lines then strip out the bits you don't want:
```
egrep 'abc[0-9]+xyz' inputFile | sed -e 's/^.*abc//' -e 's/xyz.*$//'
```
It basically selects the lines you want with egrep and then uses sed to strip off the bits before and after the number.

You can see this in action here:
```
pax> echo 'a
b
c
abc12345xyz
a
b
c' | egrep 'abc[0-9]+xyz' | sed -e 's/^.*abc//' -e 's/xyz.*$//'
12345
pax> 
```
Update: obviously, if your actual situation is more complex, the REs will need to be modified. For example, if you always had a single number buried within zero or more non-numerics at the start and end:
```
egrep '^[^0-9]*[0-9]+[^0-9]*$' inputFile | sed -e 's/^[^0-9]*//' -e 's/[^0-9]*$//'
```
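The select-then-strip idea for the abc/xyz case can also be folded into a single sed call. This is only a sketch, not part of the original answer: -n suppresses sed's default output, and the p flag prints a line only when the substitution succeeded, so non-matching lines are dropped. The function name digits_between is made up for illustration.

```shell
# digits_between: hypothetical helper; reads lines on stdin and prints only
# the digits that sit between "abc" and "xyz".
digits_between() {
    # \(...\) is a basic-regex capture group; [0-9][0-9]* means "one or more digits".
    sed -n 's/^.*abc\([0-9][0-9]*\)xyz.*$/\1/p'
}

printf 'a\nb\nabc12345xyz\nc\n' | digits_between
```

Because the leading .* is greedy, a line that contains several abc...xyz groups yields only the last one.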

傲寒

2021-01-30 06:37

For awk, I would use the following script:

```
/.*abc([0-9]+)xyz.*/ {
    print $0    # prints the entire matching line
    next
}
{
    # default: do nothing
}
```

清酒与你

2021-01-30 06:38

You can do it with the shell:

```
while IFS= read -r line
do
    case "$line" in
        *abc*[0-9]*xyz* )
            t="${line##*abc}"            # strip everything up to and including "abc"
            echo "num is ${t%%xyz*}";;   # strip "xyz" and everything after it
    esac
done <"file"
```

忘掉有多难

2021-01-30 06:40
The OP's case doesn't specify that there can be multiple matches on a single line, but for the Google traffic, I'll add an example for that too.

Since the OP's need is to extract a group from a pattern, using grep -o will require 2 passes. But, I still find this the most intuitive way to get the job done.
```
$ cat > example.txt <<TXT
a
b
c
abc12345xyz
a
abc23451xyz asdf abc34512xyz
c
TXT

$ cat example.txt | grep -oE 'abc([0-9]+)xyz'
abc12345xyz
abc23451xyz
abc34512xyz

$ cat example.txt | grep -oE 'abc([0-9]+)xyz' | grep -oE '[0-9]+'
12345
23451
34512
```
Since processor time is basically free but human readability is priceless, I tend to refactor my code based on the question, "a year from now, what am I going to think this does?" In fact, for code that I intend to share publicly or with my team, I'll even open man grep to figure out what the long options are and substitute those. Like so: grep --only-matching --extended-regexp
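Spelled out with long options, the two-pass pipeline might look like the sketch below. It assumes a grep that accepts long options (GNU grep and BSD grep do); the function name extract_all is hypothetical and reads from stdin rather than from example.txt.

```shell
# extract_all: hypothetical helper; prints every number found between
# "abc" and "xyz" on stdin, one per line.
extract_all() {
    # Pass 1 keeps only the abc<digits>xyz matches, one per output line;
    # pass 2 keeps only the digits inside each match.
    grep --only-matching --extended-regexp 'abc[0-9]+xyz' |
        grep --only-matching --extended-regexp '[0-9]+'
}

printf 'a\nabc12345xyz\nabc23451xyz asdf abc34512xyz\nc\n' | extract_all
```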