Evaluating command with Awk

Posted by 偶尔善良 on 2019-12-24 04:35:16

Question


The problem is this: I have several txt files, each recording a timestamp and an IP address for every malware packet that arrives at a server. I want to create another txt file that shows, for each IP, the first time a malware packet arrived.

In general I want to do something like this:

for every  line in file.txt
 if (ip is not present in list.txt)
 copy timestamp and ip in list.txt

I'm using awk to do it. The main problem is the "if ip is not present in list.txt" check. I'm doing this:

 {    a=$( grep -w "$3" list.txt | wc -c );
    if ( a == 0 )
   {
     #copy timestamp and ip in list.txt
   }

(I'm using $3 because the IP address is in the third column of the source file.)

I don't know how to make awk evaluate the grep command. I've also tried backticks, but that didn't work. Could someone give me a hint?

I'm testing my script on a test file like this:

10  192.168.1.1
11  192.168.1.2
12  192.165.2.4
13  122.11.22.11    
13  192.168.1.1
13  192.168.1.2
13  122.11.22.11
14  122.11.22.11
15  122.11.22.11
15  122.11.22.144
15  122.11.2.11
15  122.11.22.111

What I should obtain is:

10  192.168.1.1
11  192.168.1.2
12  192.165.2.4
13  122.11.22.11    
15  122.11.22.144
15  122.11.2.11
15  122.11.22.111

Thanks to your help I've succeeded in creating a script that fits my needs:

awk '
FILENAME == ARGV[1] {
    ip[$2] = 1
    next
}
! ($2 in ip) {
    print $1, $2 >> ARGV[1]
    ip[$2] = 1
}
' list.txt file.txt 

Answer 1:


What you really want to do is have awk read the list.txt file first, then process the other file with the list.txt data in memory. This avoids calling system() for each line.

I assume the ip is in the 1st column of list.txt.

When you say "copy timestamp and ip in list.txt", I assume you want to append some info from the current line of file.txt to the list.txt file.

awk '
    FILENAME == ARGV[1] {
        ip[$1] = 1
        next
    }
    ! ($3 in ip) {
        print $3, $(whatever_column_holds_timestamp) >> ARGV[1]
        ip[$3] = 1    # remember the IP so later lines in file.txt do not append it again
    }
' list.txt file.txt

Given the sample file and simplified requirements of your question update:

awk '! seen[$2]++' filename

will produce the results you've shown. That awk program prints a line only if its IP has not been seen yet.
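
To see why the one-liner keeps only first occurrences, here is a minimal sketch run on a few lines shaped like the sample file. The variable names input and result are only for illustration, and the data is a shortened, made-up subset of the test file.

```shell
# Sample lines (timestamp, IP) shaped like the test file in the question.
input='10 192.168.1.1
11 192.168.1.2
13 192.168.1.1
15 122.11.22.11'

# seen[$2]++ evaluates to 0 (false) the first time an IP appears and to a
# positive number afterwards, so !seen[$2]++ is true only on first sight.
result=$(printf '%s\n' "$input" | awk '!seen[$2]++')
printf '%s\n' "$result"
```

The third input line is dropped because 192.168.1.1 was already counted when the first line was read.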




Answer 2:


Interpreting the question as "How can I evaluate the status of a command from within awk?", just use system.

{
  if( system( "cmd" ) == 0 ) {
    # the command succeeded
  }
}

So, in your case, just do:

{
  if( system( "grep -w \"" $3 "\" list.txt > /dev/null " ) == 0 ) {
    ...
  }
}

You might want to reconsider your approach to the problem, though. Grepping each time is computationally expensive, and there are better ways to approach the problem. (Read list.txt once into an array, for example.)

Also, note that you do not need to use wc. grep fails if it doesn't match the string. Use the return value rather than parsing the output.
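
As a rough sketch of relying on the exit status alone, the following builds the grep command inside awk and tests what system() returns. The temporary directory, the file contents, and the "is new" message are assumptions made up for the demonstration. The -q and -F options are additions not used in the question: -q suppresses output, and -F treats the IP as a fixed string so its dots are not regex wildcards. Splicing a field into a shell command like this is only reasonable for trusted input.

```shell
# Hypothetical files in a temporary directory keep the sketch self-contained.
dir=$(mktemp -d)
printf '%s\n' '192.168.1.1' > "$dir/list.txt"
printf '%s\n' '10 192.168.1.1' '11 192.168.1.2' > "$dir/file.txt"

# grep exits 0 when it finds a match and 1 when it does not;
# system() hands that exit status back to awk.
result=$(awk -v list="$dir/list.txt" '
    { cmd = "grep -qwF -- \"" $2 "\" \"" list "\"" }
    system(cmd) != 0 { print $2 " is new" }
' "$dir/file.txt")
printf '%s\n' "$result"
rm -rf "$dir"
```

Only 192.168.1.2 is reported, since 192.168.1.1 is already present in the list file.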




Answer 3:


This saves the output of the command in the variable a:

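{
    # Build the command in awk: inside a single quoted string, $3 would be
    # left for the shell to expand instead of being replaced by the field.
    cmd = "grep -w \"" $3 "\" list.txt | wc -c"
    cmd | getline a
    close(cmd)    # close the pipe so the command can run again for the next line
    print a
}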



Answer 4:


You want to use getline:

BEGIN {
    "date" | getline current_time
    close("date")
     print "Report printed on " current_time
}

That takes the output of date and puts it into the current_time variable. You should be able to do the same with your grep | wc -l.
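
A sketch of the same pattern applied to grep | wc -l might look like the following. The file names, the sample data, and the variables cmd, line, and count are assumptions for illustration, and -F is an addition so the dots in the IP are matched literally. The return value of getline is checked before the result is used, and the command is closed so it runs afresh for every input line.

```shell
# Hypothetical files in a temporary directory keep the sketch self-contained.
dir=$(mktemp -d)
printf '%s\n' '192.168.1.1' > "$dir/list.txt"
printf '%s\n' '10 192.168.1.1' '11 192.168.1.2' > "$dir/file.txt"

result=$(awk -v list="$dir/list.txt" '
{
    # Splice the IP field into the command string from within awk.
    cmd = "grep -wF -- \"" $2 "\" \"" list "\" | wc -l"
    count = 0
    if ((cmd | getline line) > 0)   # getline returns 1 when it read a line
        count = line + 0            # + 0 drops padding some wc versions print
    close(cmd)                      # let the next record start a fresh command
    print $2, count
}' "$dir/file.txt")
printf '%s\n' "$result"
rm -rf "$dir"
```

Each IP is printed with the number of matching lines in the list file, so a count of 0 marks an IP that has not been recorded yet.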



Source: https://stackoverflow.com/questions/7741700/evaluating-command-with-awk
