Question
I'm trying to write an awk script that keeps the records with the highest value in a given field, but only comparing records that share two other fields.
An example makes this clearer. This is input.txt:
X A 10.00
X A 1.50
X B 0.01
X B 4.00
Y C 1.00
Y C 2.43
I want to compare all the records sharing the same values in the 1st and 2nd fields (X A, X B or Y C) and pick the one with the highest numerical value in the 3rd field.
So, I expect this output:
X A 10.00
X B 4.00
Y C 2.43
With this snippet I can pick the maximum value of the 3rd field, but it does not take the first two fields into account and does not print them either:
awk 'BEGIN {max = 0} {if ($3>max) max=$3} END {print max}' input.txt
Current (unwanted) output:
10.00
Any ideas? I can use gawk.
Thanks a lot in advance!
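Answer 1:
You can use this awk command. It builds a key from the first two fields and keeps the largest 3rd field seen for each key: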
awk '{k=$1 OFS $2} $3>a[k]{a[k]=$3} END{for (i in a) print i, a[i]}' file
X A 10.00
X B 4.00
Y C 2.43
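Two caveats about the one-liner above: the comparison against an unset a[k] treats it as 0, so a group whose values are all negative would never be stored, and the order of "for (i in a)" is unspecified in awk, so the groups may not come out in input order. Here is a minimal sketch of a variant that handles both cases and prints the whole winning record. The function name max_per_group and the array names best, line and order are my own choices for illustration, not part of the original answer.

```shell
#!/bin/sh
# max_per_group: hypothetical helper name, not from the original answer.
# Prints, for each ($1, $2) pair, the full record with the largest 3rd field,
# in the order in which each pair first appears in the input.
max_per_group() {
  awk '
    { key = $1 OFS $2 }

    # First record of a group: remember the position of the group.
    !(key in best) { order[++n] = key }

    # Store the record if the group is new or the 3rd field is larger.
    # Adding 0 forces a numeric comparison; the "in" test means negative
    # values are handled, since nothing is compared against an unset entry.
    !(key in best) || $3 + 0 > best[key] + 0 {
        best[key] = $3
        line[key] = $0
    }

    END { for (i = 1; i <= n; i++) print line[order[i]] }
  ' "$@"
}

# Demo with the sample data from the question.
printf '%s\n' 'X A 10.00' 'X A 1.50' 'X B 0.01' 'X B 4.00' 'Y C 1.00' 'Y C 2.43' |
  max_per_group
```

It can also be called on a file, as in max_per_group input.txt. Because the whole record is kept in line[key], any extra fields after the 3rd one are preserved in the output. On ties, the first record seen wins.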
Source: https://stackoverflow.com/questions/29239080/awk-keep-records-with-the-highest-value-comparing-those-that-share-other-field