awk: keep records with the highest value, comparing those that share other fields

Question

I'm trying to write an awk script that keeps the records with the highest value in a given field, but only comparing records that share two other fields.

I'd better give an example -- this is the input.txt:

X A 10.00
X A 1.50
X B 0.01
X B 4.00
Y C 1.00
Y C 2.43

I want to compare all the records sharing the same value in the 1st and 2nd fields (X A, X B or Y C) and pick the one with the highest numerical value in the 3rd field.

So, I expect this output:

X A 10.00
X B 4.00
Y C 2.43

With this snippet I am able to pick the maximum value in the 3rd field (but it does not take the previous fields into account, and it does not output them either):

awk 'BEGIN {max = 0} {if ($3>max) max=$3} END {print max}' input.txt

Current (unwanted) output:

10.00

Any ideas? I can use gawk.

Thanks a lot in advance!

Answer 1:

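You can use this awk, which builds a key from the first two fields and keeps the largest third field seen for each key:

awk '{k=$1 OFS $2} $3>a[k]{a[k]=$3} END{for (i in a) print i, a[i]}' file

Output:

X A 10.00
X B 4.00
Y C 2.43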
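
A couple of caveats with the one-liner above: an unset a[k] compares as 0, so a group whose values are all zero or negative never gets a maximum recorded, and awk does not specify the order in which for (i in a) visits keys. Below is a minimal sketch of a variant that seeds each group from its first record and stores the whole winning line. It assumes whitespace-separated input like the sample in the question; the function name max_per_group is hypothetical, and piping through sort is just one way to get a stable output order.

```shell
# Hypothetical helper: for each ($1, $2) pair, print the whole record
# with the largest numeric $3. Reads the named files, or stdin if none.
max_per_group() {
    awk '
        # Build the group key from the first two fields.
        { k = $1 OFS $2 }
        # Seed from the first record of a group, then keep larger values.
        # "$3 + 0" forces a numeric comparison.
        !(k in max) || $3 + 0 > max[k] { max[k] = $3 + 0; line[k] = $0 }
        # for-in order is unspecified, so the output is sorted afterwards.
        END { for (k in line) print line[k] }
    ' "$@" | sort
}

# Sample data from the question, fed on stdin.
printf '%s\n' 'X A 10.00' 'X A 1.50' 'X B 0.01' 'X B 4.00' 'Y C 1.00' 'Y C 2.43' |
    max_per_group
```

Because line[k] holds the original record, any fields beyond the third are preserved in the output as well.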

Source: https://stackoverflow.com/questions/29239080/awk-keep-records-with-the-highest-value-comparing-those-that-share-other-field

Tags

bash

awk

gawk
