I have a text file that contains a long list of entries (one per line). Some of these are duplicates, and I would like to know if it is possible (and if so, how) to remove the duplicates.
:%s/^\(.*\)\(\n\1\)\+$/\1/gec
or
:%s/^\(.*\)\(\n\1\)\+$/\1/ge
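This is my answer: either command collapses each run of consecutive duplicate lines and keeps a single copy. The e flag suppresses the error message when nothing matches, and the c flag in the first form asks for confirmation before each substitution.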
Try this:
:%s/^\(.*\)\(\n\1\)\+$/\1/
It searches for any line immediately followed by one or more copies of itself, and replaces it with a single copy.
Make a copy of your file before you try it, though; it's untested.
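To check what the pattern does without touching the file, here is a minimal Python sketch of the same idea. It assumes the pattern translates directly to Python's re syntax (Vim's \( \) and \+ become ( ) and +), and the function name collapse_adjacent is made up for illustration.

```python
import re

def collapse_adjacent(text):
    """Collapse each run of identical consecutive lines into one line."""
    # ^(.*) captures a whole line, (\n\1)+ matches one or more exact repeats
    # of it on the following lines, and $ requires the last repeat to end
    # at a line end, so "ab" followed by "abc" is not treated as a repeat.
    return re.sub(r"^(.*)(\n\1)+$", r"\1", text, flags=re.MULTILINE)

sample = "apple\napple\nbanana\napple\napple\napple\ncherry"
print(collapse_adjacent(sample))
```

Note that the second group of "apple" lines is kept as one line rather than merged with the first, because only adjacent copies are collapsed.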
I would combine two of the answers above:
go to head of file
sort the whole file
remove duplicate entries with uniq
1G
!Gsort
1G
!Guniq
If you are interested in seeing how many duplicate lines were removed, press Ctrl-G before and after to check the number of lines in your buffer.
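The sort-then-uniq sequence and the before/after line count can be sketched in Python as follows. This is only an illustration under assumptions: sorted() orders by code point, which may differ from what the external sort does under your locale, and the name sort_then_uniq is hypothetical.

```python
from itertools import groupby

def sort_then_uniq(lines):
    """Sort the lines, then keep one line per run of equal neighbours."""
    ordered = sorted(lines)                      # roughly what !Gsort does
    return [key for key, _ in groupby(ordered)]  # roughly what !Guniq does

before = ["pear", "apple", "pear", "fig", "apple"]
after = sort_then_uniq(before)
removed = len(before) - len(after)  # the difference the two line counts would show
print(after, removed)
```

Sorting first is what lets the uniq step catch every duplicate, since it only compares neighbouring lines.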
As for how Uniq can be implemented in VimL, search for Uniq in a plugin I'm maintaining. You'll find several implementations there that were suggested on the Vim mailing list.
Otherwise, :sort u is indeed the way to go.
Select the lines in visual-line mode (Shift+v), then type :!uniq, which will only catch duplicates that come one after another.
If you don't want to sort/uniq the entire file, you can select the lines you want to de-duplicate in visual mode and then simply run :sort u on them.
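As a rough picture of what a range-limited sort with the u flag amounts to, here is a small Python sketch. The function name and the 1-based inclusive line range are assumptions chosen to mirror how a visual selection maps to a line range, and the ordering may differ from Vim's for non-ASCII text.

```python
def sort_unique_range(lines, first, last):
    """Sort and de-duplicate lines first..last (1-based, inclusive).

    Lines outside the range are left untouched.
    """
    selected = lines[first - 1:last]
    return lines[:first - 1] + sorted(set(selected)) + lines[last:]

buffer_lines = ["header", "b", "a", "b", "c", "a", "footer"]
print(sort_unique_range(buffer_lines, 2, 6))
```

Because the selection is sorted before duplicates are dropped, non-adjacent duplicates inside the range are removed too, unlike with uniq alone.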