Sort and remove duplicates based on column

太阳男子 2021-01-18 04:47

I have a text file:

$ cat text
542,8,1,418,1
542,9,1,418,1
301,34,1,689070,1
542,9,1,418,1
199,7,1,419,10

I'd like to sort the file based on

2 Answers
  • 2021-01-18 05:17

    When sorting on a key, you must give the end of the key as well as its start; otherwise sort treats everything from that field to the end of the line as the key.

    The following should work:

    sort -t, -u -k1,1n text
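
To see what restricting the key to the first field does, here is a minimal sketch that runs the same sort options on the sample data from the question. The variable names `data` and `first_field_only` are illustrative only, and pinning `LC_ALL=C` is an assumption made so the ordering is reproducible across systems.

```shell
# Assumption: pin the locale so ordering is reproducible across systems.
export LC_ALL=C

# Sample data from the question, held in a variable instead of the file "text".
data='542,8,1,418,1
542,9,1,418,1
301,34,1,689070,1
542,9,1,418,1
199,7,1,419,10'

# -k1,1n: the key starts and ends at field 1 and is compared numerically.
# -u then keeps one line per distinct key value.
first_field_only=$(printf '%s\n' "$data" | sort -t, -u -k1,1n)

printf '%s\n' "$first_field_only"
```

This should print three lines, one per distinct first field (199, 301, 542). Which of the 542 lines survives depends on the sort implementation; GNU sort documents that it keeps the first line of a run that compares equal.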
    
  • 2021-01-18 05:19

    The problem is that when you give sort a key, -u checks uniqueness on that key only, not on the whole line. Since the line 542,8,1,418,1 is output first, sort treats the next two lines starting with 542 as duplicates and filters them out.

    Your best bet would be to either sort on all columns, so that -u compares every field:

    sort -t, -nk1,1 -nk2,2 -nk3,3 -nk4,4 -nk5,5 -u text
    

    or use awk to filter out duplicate lines and pipe the result to sort:

    awk '!_[$0]++' text | sort -t, -nk1,1
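
As a rough check that the two alternatives agree on the sample data, the sketch below runs both and stores the results. The variable names (`data`, `all_keys`, `awk_then_sort`) and the array name `seen` are illustrative, not from the answer, and `LC_ALL=C` is an assumption so that tie-breaking between lines with the same first field is predictable.

```shell
# Assumption: pin the locale so ordering and tie-breaking are reproducible.
export LC_ALL=C

# Sample data from the question, held in a variable instead of the file "text".
data='542,8,1,418,1
542,9,1,418,1
301,34,1,689070,1
542,9,1,418,1
199,7,1,419,10'

# Option 1: every column is a numeric key, so -u only drops lines
# that compare equal in all five fields.
all_keys=$(printf '%s\n' "$data" | sort -t, -nk1,1 -nk2,2 -nk3,3 -nk4,4 -nk5,5 -u)

# Option 2: awk prints a line only the first time it is seen (exact match),
# then sort orders the survivors numerically by the first field.
awk_then_sort=$(printf '%s\n' "$data" | awk '!seen[$0]++' | sort -t, -nk1,1)

printf '%s\n' "$all_keys"
printf '%s\n' "$awk_then_sort"
```

With this input both commands should print the same four lines: 199,7,1,419,10, then 301,34,1,689070,1, then 542,8,1,418,1, then 542,9,1,418,1. Only the exact duplicate 542,9,1,418,1 is removed. The two options are not equivalent in general: the second orders lines with equal first fields by sort's fallback whole-line comparison rather than numerically by the remaining columns.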
    