I have a huge file (my_file.txt) with ~ 8,000,000 lines that looks like this:
1 13110 13110 rs540538026 0 NA -1.33177622457982
1 13116 13116 rs626
$ awk '(i=$1 FS $2 FS $3) && !(i in seventh) || seventh[i] < $7 {seventh[i]=$7; all[i]=$0} END {for(i in all) print all[i]}' my_file.txt
1 13013178 13013178 rs11122075 0 NA -1.57404917386838
1 13116 13116 rs62635286 0 NA -2.87540758021667
1 13118 13118 rs200579949 0 NA -2.87540758021667
1 13110 13110 rs540538026 0 NA -1.33177622457982
Thanks to @fedorqui for the advanced indexing. :D
Explained:
(i=$1 FS $2 FS $3) && !(i in seventh) || $7 > seventh[i] { # build the index i from the first 3 fields
# AND the index is not yet stored in the array (first record for this key)
# OR the seventh field is greater than the value already stored for the same index:
seventh[i]=$7 # new biggest value
all[i]=$0 # store that record
}
END {
for(i in all) # for all stored records of the biggest seventh value
print all[i] # print them
}
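
If you want to check the logic on a small input first, here is a minimal sketch of the same idea wrapped in a shell function. The function name `keep_max` and the sample rows are made up for illustration, and the `$7 + 0` is my addition to force a numeric comparison. Since `for (key in all)` visits the keys in an unspecified order, the result is piped through `sort` to get a stable order.

```shell
# keep_max (hypothetical name): for each key built from fields 1-3,
# keep the line whose 7th field is the largest.
keep_max() {
  awk '
    {
      key = $1 FS $2 FS $3
      # first record for this key, or a strictly larger 7th field
      if (!(key in seventh) || $7 + 0 > seventh[key]) {
        seventh[key] = $7 + 0   # remember the largest value so far
        all[key] = $0           # remember the whole record
      }
    }
    END { for (key in all) print all[key] }
  ' "$@"
}

# Made-up sample rows: two records per position
printf '%s\n' \
  '1 13110 13110 rsA 0 NA -1.5' \
  '1 13110 13110 rsB 0 NA -1.25' \
  '1 13116 13116 rsC 0 NA -2.75' \
  '1 13116 13116 rsD 0 NA -3.5' |
  keep_max | sort -k2,2n
# prints:
# 1 13110 13110 rsB 0 NA -1.25
# 1 13116 13116 rsC 0 NA -2.75
```

On the real file it would be called as `keep_max my_file.txt`. On ties the first record seen is kept, because the comparison is strict.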