group by 'last' value in bash

借酒劲吻你 2021-01-26 06:48

I have a two-column file:

1,112
1,123
2,123
2,124
2,144
3,158
4,123
4,158
5,123

I need the last value in column 2 for each value in column 1.

4 answers
  • 2021-01-26 07:33

    A couple of solutions:

    1) With tac to reverse the input file, followed by a unique numeric sort on the first field

    $ tac ip.txt | sort -u -t, -k1,1n
    1,123
    2,144
    3,158
    4,158
    5,123
    

    2) With perl

    $ perl -F, -ne '$h{$F[0]} = $_; END{print $h{$_} foreach (sort {$a <=> $b} keys %h)}' ip.txt 
    1,123
    2,144
    3,158
    4,158
    5,123
    

    Input lines are split on , and the hash entry for the first field is overwritten by each new line, so a later line replaces any earlier line with the same first field. At the end, the hash values are printed in numerically sorted key order.

    Thanks to @choroba for pointing out that a numeric sort is needed in both cases.

  • 2021-01-26 07:33

    This is pretty similar to @Sundeep's solution but here it goes:

    $ tac file|uniq -w 1|tac
    1,123
    2,144
    3,158
    4,158
    5,123
    

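    That is, reverse the record order with tac, let uniq keep the first line of each group by comparing only the first character (-w 1), and then reverse the order again. This assumes single-character keys and that lines with the same key are adjacent.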
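
    For keys longer than one character, the same reverse, de-duplicate, reverse idea can be sketched with awk comparing the whole first field instead of uniq comparing one character. This is only a sketch: the function name last_per_key and the sample data are made up for illustration, and tac is assumed to be available (GNU coreutils).

    ```shell
    # Hypothetical helper: print the last line seen for each first-field key.
    last_per_key() {
        # tac reverses the lines, awk keeps only the first line per key
        # (which is the last one in the original order), and the second
        # tac restores the original direction.
        tac "$1" | awk -F, '!seen[$1]++' | tac
    }

    # Sample data with a two-digit key (10), which a one-character
    # comparison would lump together with key 1.
    sample=$(mktemp)
    printf '%s\n' 1,112 1,123 2,123 2,144 10,158 10,160 > "$sample"
    last_per_key "$sample"
    rm -f "$sample"
    ```

    Because awk remembers every key it has seen, this variant does not require lines with the same key to be adjacent, at the cost of holding all keys in memory.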

  • 2021-01-26 07:34

    With GNU bash:

    declare -A array   # associative array
    
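    # read from file (-r keeps backslashes literal); later lines overwrite earlier ones
    while IFS=, read -r a b; do array[$a]="$b"; done < file
    
    # print array (the iteration order of an associative array is not guaranteed)
    for i in "${!array[@]}"; do echo "$i,${array[$i]}"; done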
    

    Output:

    1,123
    2,144
    3,158
    4,158
    5,123
    
  • 2021-01-26 07:49

    You can use awk with , as the field separator and store each $2 in an array keyed by $1:

    awk 'BEGIN{FS=OFS=","} {seen[$1]=$2} END{for (i in seen) print i, seen[i]}' file.csv
    
    1,123
    2,144
    3,158
    4,158
    5,123
    