Perl - Find duplicate lines in file or array

淺唱寂寞╮ submitted on 2019-11-28 06:38:38
Axeman

Using the standard Perl shorthands:

my %seen;
while ( <> ) { 
    print if $seen{$_}++;
}

As a "one-liner":

perl -ne 'print if $seen{$_}++'

More data? This prints <file name>:<line number>:<line>:

perl -ne 'print +( $ARGV eq "-" ? "" : "$ARGV:" ), "$.:$_" if $seen{$_}++'

Explanation of %seen:

  • %seen declares a hash. For each unique line in the input (which is coming from while(<>) in this case), $seen{$_} will have a scalar slot in the hash, keyed by the text of the line (this is what $_ is doing inside the hash {} braces).
  • Using the postfix increment operator (x++), we take the current value for our expression and increment it only after the expression has been evaluated. So, if we haven't "seen" the line, $seen{$_} is undefined, but when forced into a numeric "context" like this, it's taken as 0, and is therefore false.
  • Then it's incremented to 1.

So, when the while begins to run, all lines are "zero" (if it helps, you can think of the lines as "not %seen"). Then, the first time we see a line, perl takes the undefined value, which fails the if, and increments the count at the scalar slot to 1. Thus, it is 1 for any future occurrences, at which point it passes the if condition and the line is printed.
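The behaviour described above can be checked directly. The following is only a minimal sketch: the helper name seen_before and the sample lines are made up for illustration and are not part of the original answer.

```perl
use strict;
use warnings;

my %seen;

# Hypothetical helper: returns the count as it was *before* the
# increment, so it is 0 (false) on the first sighting of a line
# and a positive number (true) on every later sighting.
sub seen_before {
    my ($line) = @_;
    return $seen{$line}++;
}

for my $line ("a\n", "b\n", "a\n", "a\n") {
    printf "%d %s", seen_before($line), $line;
}
# prints:
# 0 a
# 0 b
# 1 a
# 2 a
```

Only the lines with a non-zero number in front of them would pass the if in the one-liner.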

Now, as I said above, %seen declares a hash, but with strict turned off, any variable can be created on the spot. So the first time perl sees $seen{$_}, it knows that I'm looking for %seen; it doesn't have it, so it creates it.
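With use strict enabled, the hash is not created on the spot and has to be declared with my first. A minimal sketch of the same idiom applied to an array instead of a file handle follows; the sub name duplicates and the sample data are assumptions for illustration only.

```perl
use strict;
use warnings;

# Hypothetical helper: returns every element that has already
# appeared earlier in the list, so a value that occurs three
# times is returned twice.
sub duplicates {
    my %seen;    # must be declared explicitly under strict
    return grep { $seen{$_}++ } @_;
}

my @lines = qw(apple pear apple fig pear apple);
print join(",", duplicates(@lines)), "\n";    # prints apple,pear,apple
```

Declaring %seen inside the sub also means each call starts with an empty hash.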

An added neat thing about this is that at the end, if you care to use it, you have a count of how many times each line was repeated.
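As a sketch of that last point, the counts left in the hash can be reported once all input has been read. The sub name count_lines and the sample lines here are hypothetical, chosen only for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: returns a hash ref mapping each line to the
# number of times it occurred in the argument list.
sub count_lines {
    my %seen;
    $seen{$_}++ for @_;
    return \%seen;
}

my $count = count_lines("a\n", "b\n", "a\n", "a\n");

# Report only the lines that occurred more than once, in sorted order.
for my $line (sort keys %$count) {
    next if $count->{$line} < 2;
    chomp(my $text = $line);
    print "$count->{$line} x $text\n";    # prints "3 x a"
}
```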

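Try this. Note that it prints each distinct line once (its first occurrence) and skips the repeats, rather than printing the duplicates themselves:

#!/usr/bin/perl -w
use strict;
use warnings;

my %duplicates;    # line text => number of times seen so far
# Reads from the script's own __DATA__ section; use <> for files or STDIN.
while (<DATA>) {
    print if !defined $duplicates{$_};    # true only on the first occurrence
    $duplicates{$_}++;
}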
Alex B

Prints dupes only once (single quotes for a Unix shell; use double quotes in Windows cmd.exe):

perl -ne 'print if $seen{$_}++ == 1'

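If you have a Unix-like system, you can use uniq. It only compares adjacent lines, so sort the input first:

sort foo | uniq -d

prints one copy of each duplicated line, or

sort foo | uniq -D

prints every occurrence of each duplicated line (where -D is supported, e.g. GNU uniq). More information: man uniq.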
