I'm sure I once found a unix command which could print the common lines from two or more files, does anyone know its name? It was much simpler than diff.
Just for reference, if someone is still looking for how to do this for multiple files, see the linked answer to Finding matching lines across many files.
Combining these two answers (ans1 and ans2), I think you can get the result you need without sorting the files:
#!/bin/bash
# Compare every pair of files in the current directory and collect
# the lines they have in common in $ans.
ans="matching_lines"
for file1 in *
do
    for file2 in *
    do
        if [ "$file1" != "$ans" ] && [ "$file2" != "$ans" ] && [ "$file1" != "$file2" ] ; then
            echo "Comparing: $file1 $file2 ..." >> "$ans"
            # Print the lines that occur in $file1 and again in $file2.
            perl -ne 'print if ($seen{$_} .= @ARGV) =~ /10$/' "$file1" "$file2" >> "$ans"
        fi
    done
done
Simply save it, make it executable (chmod +x compareFiles.sh) and run it. It will take all the files in the current working directory, compare them all against each other, and leave the results in the "matching_lines" file.
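The perl one-liner is the part doing the real work: while perl reads the first file, @ARGV still holds the remaining file name, so a "1" is appended to each line's history; while it reads the second file, @ARGV is empty and a "0" is appended, so /10$/ matches exactly the lines seen in the first file that reappear in the second. A minimal sketch on two throwaway files (names and contents are only for illustration):
printf 'apple\nbanana\ncherry\n' > f1.txt
printf 'banana\ncherry\ndate\n' > f2.txt
perl -ne 'print if ($seen{$_} .= @ARGV) =~ /10$/' f1.txt f2.txt
# prints: banana and cherry (the lines common to both files)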
Things to be improved:
Maybe you mean comm?
Compare sorted files FILE1 and FILE2 line by line.
With no options, produce three-column output. Column one contains lines unique to FILE1, column two contains lines unique to FILE2, and column three contains lines common to both files.
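For example, with two small sorted files (the contents here are just an illustration; the output columns are tab-separated):
printf 'apple\nbanana\ncherry\n' > a.sorted
printf 'banana\ncherry\ndate\n' > b.sorted
comm a.sorted b.sorted
# apple            <- column 1: only in a.sorted
#         banana   <- column 3: in both
#         cherry   <- column 3: in both
#     date         <- column 2: only in b.sorted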
The secret to finding this kind of information is the info pages. For GNU programs, they are much more detailed than the man pages. Try info coreutils
and it will list all the small useful utilities.
awk 'NR==FNR{a[$1]++;next} a[$1]' file1 file2
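The NR==FNR condition is only true while awk reads the first file, so the first block records keys from file1 and the second prints matching lines from file2. Note that as written it keys on the first field ($1); a whole-line variant would be the following sketch (file names are placeholders):
awk 'NR==FNR { seen[$0]; next } $0 in seen' file1 file2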
If the two files are not sorted yet, you can use:
comm -12 <(sort a.txt) <(sort b.txt)
and it will work, avoiding the error message comm: file 2 is not in sorted order that you would get when running comm -12 a.txt b.txt on unsorted files.
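A quick sketch with deliberately unsorted inputs (contents are only illustrative):
printf 'cherry\napple\nbanana\n' > a.txt
printf 'date\nbanana\ncherry\n' > b.txt
comm -12 <(sort a.txt) <(sort b.txt)
# banana
# cherry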
The command you are seeking is comm, e.g.:
comm -12 1.sorted.txt 2.sorted.txt
Here:
-1 : suppress column 1 (lines unique to 1.sorted.txt)
-2 : suppress column 2 (lines unique to 2.sorted.txt)
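With both columns suppressed, only the lines common to the two files are printed. If your originals are not sorted yet, a small sketch of preparing them first (the file names are just placeholders):
sort file1.txt > 1.sorted.txt
sort file2.txt > 2.sorted.txt
comm -12 1.sorted.txt 2.sorted.txt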
rm -f file3.out
cat file1.out | while read -r line1
do
    cat file2.out | while read -r line2
    do
        # Append the line when it appears in both files.
        if [[ "$line1" == "$line2" ]]; then
            echo "$line1" >> file3.out
        fi
    done
done
This should do it.