How can I find the unique lines and remove all duplicates from a file? My input file is:
1
1
2
3
5
5
7
7
I would like the result to be:
2
3
You could also print out the unique values in "file" by piping cat to sort and uniq:

cat file | sort | uniq -u
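Assuming the sample input above is saved as file, this prints only the lines that occur exactly once:

$ cat file | sort | uniq -u
2
3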
Use as follows:

sort < filea | uniq > fileb

Note that without -u, uniq collapses each run of duplicate lines to a single copy instead of discarding repeated lines entirely.
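A quick check on the sample data, assuming it is stored in filea, shows that this variant keeps one copy of each line:

$ sort < filea | uniq > fileb
$ cat fileb
1
2
3
5
7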
You can use:

sort data.txt | uniq -u

This sorts the data and keeps only the lines that appear exactly once.
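The sort step matters because uniq only compares adjacent lines. A hypothetical unsorted file with non-adjacent duplicates shows why:

$ printf '1\n2\n1\n' | uniq -u
1
2
1
$ printf '1\n2\n1\n' | sort | uniq -u
2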
I find this easier:

sort -u input_filename > output_filename

-u stands for unique. Note that sort -u keeps one copy of every line (equivalent to sort input_filename | uniq), so it removes duplicates rather than printing only the lines that occur once.
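As a quick sanity check of that equivalence, here is a sketch using bash process substitution (input_filename is the file from the command above):

$ diff <(sort -u input_filename) <(sort input_filename | uniq) && echo identical
identical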
uniq should do fine if your file is, or can be, sorted. If you can't sort the file for some reason, you can use awk:

awk '{a[$0]++} END {for (i in a) if (a[i] < 2) print i}'
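For example, with the sample numbers in data.txt (awk's for (i in a) visits keys in an unspecified order, so pipe the result to sort if the order matters):

$ awk '{a[$0]++} END {for (i in a) if (a[i] < 2) print i}' data.txt
2
3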
While sort takes O(n log n) time, I prefer using

awk '!seen[$0]++'

awk '!seen[$0]++' is an abbreviation for awk '!seen[$0]++ {print}': it prints the line ($0) if seen[$0] is zero, i.e. if the line has not been seen before. It takes more space but only O(n) time.
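Assuming the sample input is in data.txt, this keeps the first occurrence of every line in its original order, with no sorting required:

$ awk '!seen[$0]++' data.txt
1
2
3
5
7

Like sort -u, this removes duplicates rather than printing only the lines that occur once.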