Script to find duplicates in a csv file

旧巷少年郎 2021-01-17 17:02

I have a 40 MB CSV file with 50,000 records. It's a giant product listing. Each row has close to 20 fields [Item#, UPC, Desc, etc.].

How can I,

a) Find and Pri

5 Answers
  •  不知归路
    2021-01-17 17:39

    Find and print duplicate rows in Perl:

    perl -ne 'print if $SEEN{$_}++' < input-file
    

    Find and print rows that repeat a value in one column in Perl. For example, for the 5th column, with fields separated by commas:

    perl -F/,/ -ane 'print if $SEEN{$F[4]}++' < input-file
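
One caveat with splitting on a bare comma: a quoted field that itself contains a comma (for example a description like "Widget, large") is cut in two, so the column index shifts for that row. The following is a minimal Python sketch of the same two checks using the standard `csv` module, which handles quoted fields. It is a swapped-in alternative, not part of the original answer; the function names `duplicate_rows` and `duplicate_by_column` are hypothetical, and the choice to skip rows that are too short to have the requested column is an assumption (the Perl one-liner would instead group all such rows under one empty key).

```python
import csv


def duplicate_rows(lines):
    """Yield every line that exactly repeats an earlier line.

    Mirrors `print if $SEEN{$_}++`: the first occurrence is not
    yielded, each later occurrence is.
    """
    seen = set()
    for line in lines:
        if line in seen:
            yield line
        else:
            seen.add(line)


def duplicate_by_column(lines, column=4):
    """Yield parsed rows whose value in `column` was seen in an earlier row.

    `column` is 0-based, so 4 means the 5th field. Parsing goes through
    csv.reader, so commas inside quoted fields do not split the field.
    """
    seen = set()
    for row in csv.reader(lines):
        if len(row) <= column:
            continue  # assumption: ignore rows that lack this column
        key = row[column]
        if key in seen:
            yield row
        else:
            seen.add(key)
```

Both functions accept any iterable of lines, so an open file object works directly, e.g. `duplicate_by_column(open("input-file", newline=""), column=4)`. Only the keys are kept in memory, which should be comfortable for a 40 MB file with 50,000 rows. A header row, if present, is treated like any other row.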
    
