How does this Perl one-liner display lines that two files have in common?
perl -ne 'print if ($seen{$_} .= @ARGV) =~ /10$/' file1 file2
@ARGV is shifted when the first file is opened. In scalar context it now returns 1, because it has one remaining member (the second file). For each line read from file1, this 1 is appended to that line's entry in the hash %seen. When the second file is opened, @ARGV is shifted again and is now empty, so it returns 0 in scalar context. The match /10$/ therefore means "the line was seen in file1 and has now been seen in file2 for the first time".
The -n command line option transforms the code to something equivalent to
while ($ARGV = shift @ARGV) {
    open ARGV, $ARGV;
    LINE: while (defined($_ = <ARGV>)) {
        $seen{$_} .= @ARGV;
        print $_ if $seen{$_} =~ /10$/;
    }
}
While the first file is being read, scalar @ARGV is 1. For each line, 1 will be appended to the %seen entry.
While the second file is being read, scalar @ARGV is 0, so 0 is appended for each line instead. So if a line was in file1 and in file2, the entry will look like 1110000 (it appeared 3 times in file1 and 4 times in file2).
We only want to output common lines exactly one time. We do this when a common line was first seen in file2, so $seen{$_} is 1110. This is expressed as the regex /10$/: the string 10 must appear at the end.