awk average part of column if lines (specific field) match

终归单人心 2021-01-27 13:04

Here is a sample of my input file:

$cat NDVI-bm  
P01 031.RAW 0.516 0 0  
P01 021.RAW 0.449 0 0  
P02 045.RAW 0.418 0 0  
P03 062.RAW 0.570 0 0  
P03 064.RAW 0.         


        
3 answers
  • 2021-01-27 13:18
    { total[$1] += $3; ++n[$1] }
    
    END { for(i in total) print i, total[i] / n[i] }
    
  • 2021-01-27 13:31

    Have a different array where you keep track of the number of entries you have seen for each index, and do the division in the END block.
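
    A minimal sketch of that idea, assuming the whitespace-separated layout from the question (group key in field 1, value in field 3). The helper name `avg_by_key`, the `NF >= 3` guard that skips lines lacking a third field, and the four-decimal output format are assumptions made for illustration, not part of the original answers.

```shell
# Hypothetical helper: per-key average of field 3, keyed on field 1.
avg_by_key() {
    awk '
    NF >= 3 {               # assumption: skip lines that lack a third field
        total[$1] += $3     # running sum per key
        count[$1]++         # separate array: entries seen per key
    }
    END {
        for (key in total)  # iteration order is unspecified in awk
            printf "%s %.4f\n", key, total[key] / count[key]
    }' "$@"
}

# Complete rows copied from the sample in the question.
avg_by_key <<'EOF' | sort
P01 031.RAW 0.516 0 0
P01 021.RAW 0.449 0 0
P02 045.RAW 0.418 0 0
P03 062.RAW 0.570 0 0
EOF
# prints:
# P01 0.4825
# P02 0.4180
# P03 0.5700
```

    The function also accepts file names, e.g. `avg_by_key NDVI-bm`, because any arguments are passed straight through to awk.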

  • 2021-01-27 13:39

    E.g., to calculate the average of the third column over the lines starting with "P01":

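    /^P01/ {
        num += 1      # number of matching lines
        sum += $3     # running total of the third column
    }
    # Guard against division by zero when no line matches.
    END { if (num) print "avg = " sum/num }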
    

    Output:

    $ awk -f avg.awk input
    avg = 0.4825
    

    ...or, as a one-liner:

    $ awk '/^P01/{sum+=$3; num+=1} END{print "avg="sum/num}' input
    

    Or to do the calculations for all values of the first column simultaneously:

    {
        sum[$1] += $3
        cnt[$1]++
    }

    END {
        print "Name" "\t" "sum" "\t" "cnt" "\t" "avg"
        for (i in sum)
            print i "\t" sum[i] "\t" cnt[i] "\t" sum[i]/cnt[i]
    }
    

    Outputs:

    $ awk -f avg.awk input
    Name    sum     cnt     avg
    P01     0.965   2       0.4825
    P02     0.418   1       0.418
    P03     1.039   2       0.5195
    P04     2.481   4       0.62025
    P05     0.748   1       0.748
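
    Note that awk does not specify the order in which `for (i in sum)` visits the keys, so the rows are not guaranteed to come out sorted as in the listing above. A sketch of one way to get a stable order, assuming it is acceptable to print the header from the shell and pipe only the data rows through `sort`; the function name `avg_table` is hypothetical.

```shell
# Hypothetical wrapper: same per-key table, with rows sorted by name.
avg_table() {
    # Print the header first so it is not sorted along with the data.
    printf 'Name\tsum\tcnt\tavg\n'
    awk '
    {
        sum[$1] += $3   # running sum of field 3 per key
        cnt[$1]++       # number of lines seen per key
    }
    END {
        for (i in sum)
            print i "\t" sum[i] "\t" cnt[i] "\t" sum[i]/cnt[i]
    }' "$@" | sort      # sort the data rows by their first column
}

# Complete rows copied from the sample in the question.
avg_table <<'EOF'
P03 062.RAW 0.570 0 0
P01 031.RAW 0.516 0 0
P02 045.RAW 0.418 0 0
P01 021.RAW 0.449 0 0
EOF
```

    Sorting outside awk keeps the script portable; GNU awk users could instead control the traversal order inside the script, but that is specific to gawk.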
    