How to pipe tail -f into awk

Question

I'm trying to set up a script where an alert is generated when a certain string appears in a log file.

The solution already in place greps the whole log file once a minute and counts how often the string appears, using the log line's timestamp to count only occurrences in the previous minute.

I figured it would be much more efficient to do this with a tail, so I tried the following, as a test:

FILENAME="/var/log/file.log"

tail -f $FILENAME | awk -F , -v var="$HOSTNAME" '
                BEGIN {
                        failed_count=0;
                }
                /account failure reason/ {
                        failed_count++;
                }
                END {
                        printf("%saccount failure reason (Errors per Interval)=%d\n", var, failed_count);
                }
'

but this just hangs and doesn't output anything. Somebody suggested this minor change:

FILENAME="/var/log/file.log"

awk -F , -v var="$HOSTNAME" '
                BEGIN {
                        failed_count=0;
                }
                /account failure reason/ {
                        failed_count++;
                }
                END {
                        printf("%saccount failure reason (Errors per Interval)=%d\n", var, failed_count);
                }
' <(tail -f $FILENAME)

but that does the same thing.

The awk I'm using (I've simplified in the code above) works, as it's used in the existing script where the results of grep "^$TIMESTAMP" are piped into it.

My question is, how can I get the tail -f to work with awk?

Answer 1:

Assuming your log looks something like this:

Jul 13 06:43:18 foo account failure reason: unknown
 │   │    
 │   └── $2 in awk
 └────── $1 in awk

you could do something like this:

FILENAME="/var/log/file.log"

tail -F "$FILENAME" | awk -v hostname="$HOSTNAME" '
    NR == 1 {
        last=$1 " " $2;
    }
    $1 " " $2 != last {
        printf("%s account failure reason (Errors on %s)=%d\n", hostname, last, failed);
        last=$1 " " $2;
        failed=0;
    }
    /account failure reason/ {
        failed++;
    }
'

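Note that I've changed this to tail -F (capital F) because it handles log aging: when the log is rotated, tail reopens the file by name instead of continuing to follow the old one. This isn't supported on every operating system, but it should work on modern BSDs and Linuces.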

How does this work?

Awk scripts consist of sets of test { commands; } evaluated against each line of input. (There are two special tests, BEGIN and END, whose commands run when awk starts and when awk ends, respectively. In your question, tail -f never closes the pipe, so awk never reached the end of its input and the END code was never run.)
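To see the END behaviour in isolation, here is a minimal sketch that feeds a cut-down version of the question's awk program a finite input. The helper name count_failures and the sample log lines are made up for illustration; only the regular expression comes from the question.

```shell
# count_failures is a hypothetical helper name; it wraps a cut-down
# version of the awk program from the question.
count_failures() {
    awk '
        /account failure reason/ { failed_count++ }
        END { printf("failed_count=%d\n", failed_count) }
    '
}

# Finite input: printf exits, the pipe closes, awk reaches the end of
# its input and the END block runs.
printf '%s\n' \
    'Jul 13 06:43:18 foo account failure reason: unknown' \
    'Jul 13 06:43:19 foo session opened' \
    'Jul 13 06:43:20 foo account failure reason: locked' |
    count_failures
# prints: failed_count=2

# Endless input: tail -f never closes the pipe, so this would keep
# counting and never print (commented out because it does not return).
# tail -f /var/log/file.log | count_failures
```

The counting itself is fine in both cases; the only difference is whether awk ever sees the end of its input.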

The script above has three test/command sections:

In the first, NR == 1 is a test that evaluates true on only the first line of input. The command it runs creates the initial value for the last variable, used in the next section.
In the second section, we test whether the date on the current line ($1 " " $2) differs from the one stored in the "last" variable. If it does, we've started evaluating a new day's data, so it's time to print a summary (log) of the previous day, reset our variables and move on.
In the third, if the line we're evaluating matches the regular expression /account failure reason/, we increment our counter.
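
One way to check the three sections without a live log is to feed the same awk program a few sample lines. This is only a sketch: the wrapper name summarize_by_day, the host name myhost and the log lines are assumptions for illustration, and the log format is the one assumed earlier in this answer.

```shell
# summarize_by_day is a hypothetical wrapper around the awk program
# from this answer; its first argument is the host name to print.
summarize_by_day() {
    awk -v hostname="$1" '
        # Section 1: remember the date of the very first line.
        NR == 1 { last = $1 " " $2 }
        # Section 2: the date changed, so report the previous day and reset.
        $1 " " $2 != last {
            printf("%s account failure reason (Errors on %s)=%d\n", hostname, last, failed)
            last = $1 " " $2
            failed = 0
        }
        # Section 3: count matching lines for the current day.
        /account failure reason/ { failed++ }
    '
}

# Made-up sample data spanning three days.
printf '%s\n' \
    'Jul 13 06:43:18 foo account failure reason: unknown' \
    'Jul 13 09:10:02 foo account failure reason: locked' \
    'Jul 13 09:11:00 foo session opened' \
    'Jul 14 00:00:05 foo account failure reason: unknown' \
    'Jul 15 00:00:01 foo session opened' |
    summarize_by_day myhost
# prints:
# myhost account failure reason (Errors on Jul 13)=2
# myhost account failure reason (Errors on Jul 14)=1
```

Nothing is printed for Jul 15, because a day's summary only appears once a line from a later day arrives. With tail -F that is the expected behaviour, since the current day is still in progress.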

Clear as mud? :-)
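
Since the question counts errors per minute rather than per day, the same pattern could be keyed on the minute instead. The sketch below assumes the time is in $3 in HH:MM:SS form, as in the sample line earlier in this answer. The function name count_per_minute and the sample data are made up, and the fflush() call is my addition so that summaries are not held back in a buffer when awk's output goes to a pipe or a file.

```shell
# count_per_minute is a hypothetical name; the first argument is the
# host name to print. Assumes $3 is a time in HH:MM:SS form.
count_per_minute() {
    awk -v hostname="$1" '
        # Build the key for this line: month, day and HH:MM.
        { minute = $1 " " $2 " " substr($3, 1, 5) }
        NR == 1 { last = minute }
        minute != last {
            printf("%s account failure reason (Errors in %s)=%d\n", hostname, last, failed)
            fflush()    # push the summary out right away
            last = minute
            failed = 0
        }
        /account failure reason/ { failed++ }
    '
}

# Made-up sample data; on a live log this would be something like:
#   tail -F "$FILENAME" | count_per_minute "$HOSTNAME"
printf '%s\n' \
    'Jul 13 06:43:18 foo account failure reason: unknown' \
    'Jul 13 06:43:40 foo account failure reason: locked' \
    'Jul 13 06:44:05 foo session opened' \
    'Jul 13 06:45:10 foo account failure reason: unknown' \
    'Jul 13 06:46:00 foo session opened' |
    count_per_minute myhost
# prints:
# myhost account failure reason (Errors in Jul 13 06:43)=2
# myhost account failure reason (Errors in Jul 13 06:44)=0
# myhost account failure reason (Errors in Jul 13 06:45)=1
```

A minute's summary is only printed when a line from a later minute arrives, and a minute in which nothing at all is logged produces no summary line.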

Source: https://stackoverflow.com/questions/11469959/how-to-pipe-tail-f-into-awk

Tags

awk

tail