Use awk to find first occurrence only of string after a delimiter

Question

I have a bunch of documents that all have the line, Account number: 123456789 in various locations.

What I need to do is be able to parse through the files, and find the account number itself. So, awk needs to look for Account number: and return the string immediately following.

For example, if it was:

Account number: 1234567

awk should return:

1234567

Once it's found the first occurrence it can stop looking.

But, I'm stumped. What's the right way to do this using awk?

Answer 1:

One way:

awk -F: '$1=="Account number"{print $2;exit;}' file

I assume you want to stop the moment you find the first occurrence in the file. If you want to print the match from every line of the file, just remove the exit.
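
With -F: the field separator is the colon alone, so $2 still begins with the space that follows the colon. The sketch below shows one way around that, by widening the separator to colon-plus-space. The sample file and the variable names are made up for the demonstration.

```shell
# Hypothetical sample file, created only for this demonstration.
tmpfile=$(mktemp)
printf '%s\n' 'Some header' 'Account number: 1234567' 'Account number: 789' > "$tmpfile"

# -F: splits on the colon alone, so $2 keeps the space that follows it.
with_space=$(awk -F: '$1=="Account number"{print $2; exit}' "$tmpfile")

# -F': ' uses colon-plus-space as the separator, so $2 is just the number.
without_space=$(awk -F': ' '$1=="Account number"{print $2; exit}' "$tmpfile")

printf '[%s]\n' "$with_space"      # [ 1234567]
printf '[%s]\n' "$without_space"   # [1234567]

rm -f "$tmpfile"
```

In both commands, exit stops awk at the first matching line, so the second account number (789) is never printed.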

Answer 2:

You can use an if to check if $1 and $2 equal "Account" and "number:". If they do, then print $3:

awk '{if ($1 == "Account" && $2 == "number:") {print $3; exit;}}' input.txt

Answer 3:

For such matchings I prefer using grep with look-behind:

grep -Po '(?<=Account number: )\d+' file

grep -Po 'Account number: \K\d+' file

This says: print any sequence of digits (\d+) that appears after the string "Account number: ".

In the second case, \K discards everything matched up to that point, so only the text after \K is printed.

See it in action, given a file named file:

Account number: 1234567
but then another Account number: 789
and that's all

Let's see what the output looks like:

$ grep -Po '(?<=Account number: )\d+' file
1234567
789
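
As the output shows, grep prints every match, not just the first. One possible way to keep only the first occurrence is to pipe the result through head -n1. This is a sketch that assumes GNU grep built with PCRE support (the -P option); the temporary file and variable name are hypothetical.

```shell
# Recreate the sample file from this answer in a temporary location.
tmpfile=$(mktemp)
printf '%s\n' 'Account number: 1234567' \
              'but then another Account number: 789' \
              "and that's all" > "$tmpfile"

# -o prints only the matched text, \K drops the prefix from the match,
# and head -n1 keeps the first match only.
first_match=$(grep -Po 'Account number: \K\d+' "$tmpfile" | head -n1)

printf '%s\n' "$first_match"   # 1234567

rm -f "$tmpfile"
```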

Answer 4:

You could also use sed -n s///p:

sed -En 's/^Account number: (.+)/\1/p' *.txt | head -n1
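
The pipeline above relies on head to discard everything after the first match. As an alternative sketch, sed itself can quit after the first matching line by using the q command. The sample file and variable name here are assumptions made for illustration.

```shell
# Hypothetical sample file, created only for this demonstration.
tmpfile=$(mktemp)
printf '%s\n' 'Some header' 'Account number: 1234567' 'Account number: 789' > "$tmpfile"

# On the first line that matches the address: strip the prefix,
# print what remains (p), then quit (q) so later lines are never read.
first_account=$(sed -n '/^Account number: /{s/^Account number: //p;q;}' "$tmpfile")

printf '%s\n' "$first_account"   # 1234567

rm -f "$tmpfile"
```

Like the original command, this only matches lines that begin with Account number: because of the ^ anchor.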

Answer 5:

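The accepted answer outputs a space in front of the string, which forced me to use another approach:

awk '/Account number/{print $3; exit}' file

This solution ignores the : separator and relies on the default whitespace splitting instead, so it assumes the matching line starts with Account number:. It works like a charm and is a bit easier to remember IMO.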

Source: https://stackoverflow.com/questions/15331259/use-awk-to-find-first-occurrence-only-of-string-after-a-delimiter

Tags

bash

awk