awk: create list of destination ports seen for each source IP from a bro log (conn.log)

Submitted by 孤人 on 2020-01-05 08:15:12

Question


I'm trying to solve a problem in awk as an exercise but I'm having trouble. I want awk (or gawk) to be able to print all unique destination ports for a particular source IP address.

The source IP address is field 1 ($1) and the destination port is field 4 ($4).

Cut for brevity:
SourceIP          SrcPort   DstIP           DstPort
192.168.1.195       59508   98.129.121.199  80
192.168.1.87        64802   192.168.1.2     53
10.1.1.1            41170   199.253.249.63  53
10.1.1.1            62281   204.14.233.9    443

I imagine you would store each source IP as an index into an array, but I'm not quite sure how you would store the destination ports as values. Maybe you can keep appending to a string that is the value for that index, e.g. "80,"..."80,443,"... for each match. But maybe that's not the best solution.

I'm not too concerned about output, I really just want to see how one can approach this in awk. Though, for output I was thinking something like,

Source IP:dstport, dstport, dstport
192.168.1.195:80,443,8088,5900

I'm tinkering with something like this,

awk '{ if ( NR == 1) next; arr[$1,$4] = $4 } END { for (i in arr) print arr[i] }' infile

but I cannot figure out how to print out the elements and their values for a two-dimensional array. It seems something along these lines would take care of the unique destination port task, because a repeated port simply overwrites the value of the same element.

Note: awk/gawk solution will get the answer!

Solution EDIT: slightly modified Kent's solution to print unique destination ports as mentioned in my question and to skip the column header line.

awk '{ if ( NR == 1 ) next ; if ( a[$1] == "" ) a[$1] = $4; else if ( !index("," a[$1] ",", "," $4 ",") ) a[$1] = a[$1]","$4 } END {for(x in a)print x":"a[x]}'

Answer 1:


Here is one way with awk:

 awk '{k=$1;a[k]=a[k]?a[k]","$4:$4}END{for(x in a)print x":"a[x]}' file

with your example, the output is:

kent$  awk '{k=$1;a[k]=a[k]?a[k]","$4:$4}END{for(x in a)print x":"a[x]}' file
192.168.1.195:80
192.168.1.87:53
10.1.1.1:53,443

(I omitted the title line)

EDIT

k=$1;a[k]=a[k]?a[k]","$4:$4

is exactly the same as:

if (a[$1])                   # if a[$1] is not empty
    a[$1] = a[$1]","$4       # concatenate $4 to it separated by ","
else                         # else if a[$1] is empty
    a[$1] = $4               # let a[$1]=$4

I used k=$1 just to save some typing, and the x=boolean?a:b (ternary) expression for the same reason.

I hope this explanation helps you understand the code.




Answer 2:


I prefer a solution using Perl because I like its possibilities for creating data structures such as a hash of arrays:

perl -ane '
    ## Same as a BEGIN block in AWK. It prints the header before processing any input.
    BEGIN { printf qq|%s:%s\n|, q|Source IP|, q|dstport| }

    ## Skip first input line (header).
    next if $. == 1;

    ## This is what you were trying to achieve. Store the source IP as the key of a
    ## hash and, instead of saving a string, save an array with all the
    ## ports.
    push @{ $ip{ $F[0] } }, $F[ 3 ]; 

    ## Same as an END block in AWK. For each IP, get all the ports saved in the array
    ## and join them using a comma.
    END { printf qq|%s:%s\n|, $_, join q|,|, @{ $ip{ $_ } } for keys %ip }

' infile

It yields:

Source IP:dstport
192.168.1.195:80
10.1.1.1:53,443
192.168.1.87:53


Source: https://stackoverflow.com/questions/16742955/awk-create-list-of-destination-ports-seen-for-each-source-ip-from-a-bro-log-co
