Question
The problem: I have several txt files, each recording a timestamp and an IP address for every malware packet that arrives at a server. I want to create another txt file that shows, for every IP, the first time a malware packet arrived.
In general I want to do something like this:
for every line in file.txt
if (ip is not present in list.txt)
copy timestamp and ip in list.txt
I'm using awk to do it. The main problem is the "if ip is not present in list.txt" check. I'm doing this:
{ a=$( grep -w "$3" list.txt | wc -c );
if ( a == 0 )
{
#copy timestamp and ip in list.txt
}
(I'm using $3 because the IP address is in the third column of the source file.)
I don't know how to make awk evaluate the grep command. I've also tried backticks, but that didn't work. Could someone give me a hint?
I'm testing my script on a test file like this:
10 192.168.1.1
11 192.168.1.2
12 192.165.2.4
13 122.11.22.11
13 192.168.1.1
13 192.168.1.2
13 122.11.22.11
14 122.11.22.11
15 122.11.22.11
15 122.11.22.144
15 122.11.2.11
15 122.11.22.111
What I should obtain is:
10 192.168.1.1
11 192.168.1.2
12 192.165.2.4
13 122.11.22.11
15 122.11.22.144
15 122.11.2.11
15 122.11.22.111
Thanks to your help I've succeeded in creating a script that fits my needs:
awk '
FILENAME == ARGV[1] {
ip[$2] = 1
next
}
! ($2 in ip) {
print $1, $2 >> ARGV[1]
ip[$2] = 1
}
' list.txt file.txt
Answer 1:
What you really want to do is have awk read the list.txt file first, then process the other file with the list.txt data in memory. This lets you avoid calling system() for each line.
I assume the ip is in the 1st column of list.txt.
When you say "copy timestamp and ip in list.txt", I assume you want to append some info from the current line of file.txt to the list.txt file.
awk '
FILENAME == ARGV[1] {
ip[$1] = 1
next
}
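! ($3 in ip) {
# whatever_column_holds_timestamp is a placeholder for the timestamp column number
print $3, $(whatever_column_holds_timestamp) >> ARGV[1]
# remember the IP so a later line with the same IP is not appended again
ip[$3] = 1
}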
' list.txt file.txt
Given the sample file and simplified requirements of your question update:
awk '! seen[$2]++' filename
will produce the results you're after. That awk program prints a line only if its IP has not been seen before.
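To see why the one-liner works: seen[$2]++ evaluates to the count stored before the increment, which is 0 the first time an IP appears, so the negation is true exactly once per IP. Here is a small sketch that runs it against the sample data from the question. The temporary file and the variable names are only there for illustration.

```shell
# Write the sample data from the question to a temporary file
# (a stand-in for the real log file).
sample=$(mktemp)
cat > "$sample" <<'EOF'
10 192.168.1.1
11 192.168.1.2
12 192.165.2.4
13 122.11.22.11
13 192.168.1.1
13 192.168.1.2
13 122.11.22.11
14 122.11.22.11
15 122.11.22.11
15 122.11.22.144
15 122.11.2.11
15 122.11.22.111
EOF

# seen[$2]++ yields the old count: 0 (false) on the first occurrence of an
# IP, so the negated pattern is true and awk prints that line by default.
first_seen=$(awk '! seen[$2]++' "$sample")
printf '%s\n' "$first_seen"
rm -f "$sample"
```

Because input is processed in order, the line kept for each IP is the one with its earliest timestamp, provided the file itself is sorted by time.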
Answer 2:
Interpreting the question as "How can I evaluate the status of a command from within awk?", just use system.
{ if( system( "cmd" ) == 0 ) { # the command succeeded
} }
So, in your case, just do:
{ if( system( "grep -w \"" $3 "\" list.txt > /dev/null " ) == 0 ) { ... } }
You might want to reconsider your approach to the problem, though. Grepping each time is computationally expensive, and there are better ways to approach the problem. (Read list.txt once into an array, for example.)
Also, note that you do not need to use wc. grep exits with a non-zero status if it doesn't match the string. Use the return value rather than parsing the output.
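A minimal sketch of testing the exit status, using made-up files. The -q flag (suppress output) and -F flag (treat the IP as a fixed string, so the dots are not regex wildcards) are additions of this sketch rather than part of the answer above, and the file contents are invented for the demonstration.

```shell
# Hypothetical working files, created only for this demonstration.
dir=$(mktemp -d)
printf '10 192.168.1.1\n' > "$dir/list.txt"
printf 'x y 192.168.1.1\nx y 10.0.0.7\n' > "$dir/file.txt"

result=$(awk -v list="$dir/list.txt" '
{
    # -q: print nothing, -w: match whole words, -F: fixed string, not a regex
    cmd = "grep -qwF \"" $3 "\" \"" list "\""
    # system() returns the exit status of the command: 0 means grep matched
    if (system(cmd) == 0)
        print $3, "known"
    else
        print $3, "new"
}' "$dir/file.txt")
printf '%s\n' "$result"
rm -rf "$dir"
```

This still starts one grep process per input line, so it is best kept for small inputs.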
Answer 3:
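This saves the output of the command in the variable a:
BEGIN { }
{
# build the command by concatenation so that $3 is the awk field, not a literal "$3"
cmd = "grep -w \"" $3 "\" list.txt | wc -c"
cmd | getline a
# close the pipe so the command runs again for the next line
close(cmd)
print a
}
END {}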
Answer 4:
You want to use getline:
BEGIN {
"date" | getline current_time
close("date")
print "Report printed on " current_time
}
That takes the output of date and puts it into the current_time variable. You should be able to do the same with your grep | wc -l.
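Applied to the question, that might look like the following sketch. The files are invented for the demonstration, and the -F flag is an addition so that the dots in the IP are matched literally. The command string is built by concatenation so that awk substitutes the field value, and close() is called on every record so the command runs again for the next line.

```shell
# Hypothetical working files, created only for this demonstration.
dir=$(mktemp -d)
printf '10 192.168.1.1\n13 192.168.1.1\n' > "$dir/list.txt"
printf 'x y 192.168.1.1\nx y 10.0.0.7\n' > "$dir/file.txt"

counts=$(awk -v list="$dir/list.txt" '
{
    # Concatenate so that $3 is the awk field, not a literal string
    # handed to the shell.
    cmd = "grep -wF \"" $3 "\" \"" list "\" | wc -l"
    n = 0
    if ((cmd | getline n) > 0)
        n += 0          # force a number; some wc builds pad with spaces
    close(cmd)          # release the pipe before the next record
    print $3, n
}' "$dir/file.txt")
printf '%s\n' "$counts"
rm -rf "$dir"
```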
Source: https://stackoverflow.com/questions/7741700/evaluating-command-with-awk