Unix command to find lines common in two files

忘掉有多难 2020-11-27 10:32

I'm sure I once found a Unix command that could print the common lines from two or more files. Does anyone know its name? It was much simpler than diff.

11 answers
  • 2020-11-27 10:33

    For reference, if anyone is still looking for a way to do this across multiple files, see the linked answer to Finding matching lines across many files.


    Combining these two answers (ans1 and ans2), I think you can get the result you need without sorting the files:

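    #!/bin/bash
    ans="matching_lines"

    for file1 in *
    do
        for file2 in *
        do
            # Skip the output file itself and never compare a file with itself.
            if [ "$file1" != "$ans" ] && [ "$file2" != "$ans" ] && [ "$file1" != "$file2" ]; then
                echo "Comparing: $file1 $file2 ..." >> "$ans"
                # Print each line of $file2 that also appears in $file1, once per distinct line.
                perl -ne 'print if ($seen{$_} .= @ARGV) =~ /10$/' "$file1" "$file2" >> "$ans"
            fi
        done
    done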
    

    Save it, make it executable (chmod +x compareFiles.sh), and run it. It takes every file in the current working directory, makes an all-vs-all comparison, and writes the result to the "matching_lines" file.

    Things to be improved:

    • Skip directories
    • Avoid comparing all the files two times (file1 vs file2 and file2 vs file1).
    • Maybe add the line number next to the matching string
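
    The three improvements listed above could be sketched roughly as follows. This is only one possible approach, not the original author's script: the function name compare_pairs and its two arguments are hypothetical, it swaps the perl one-liner for awk so that a line number can be printed, and it assumes the output file lives outside the directory being scanned.

```shell
# compare_pairs DIR OUT (hypothetical helper): compare every unordered pair
# of regular files in DIR once and append the matches to OUT.
# Assumption: OUT is not located inside DIR.
compare_pairs() (
    dir=$1
    out=$2
    : > "$out"                                   # start with an empty result file
    for file1 in "$dir"/*; do
        [ -f "$file1" ] || continue              # skip directories
        past=0
        for file2 in "$dir"/*; do
            # Only files listed after file1 are compared, so each pair runs once.
            if [ "$file2" = "$file1" ]; then past=1; continue; fi
            [ "$past" -eq 1 ] && [ -f "$file2" ] || continue
            echo "Comparing: $file1 $file2 ..." >> "$out"
            # First pass remembers file1's lines; second pass prints each
            # matching line of file2 prefixed with its line number in file2.
            awk 'NR == FNR { seen[$0] = 1; next } $0 in seen { print FNR ":" $0 }' \
                "$file1" "$file2" >> "$out"
        done
    done
)

# Usage: compare_pairs . /tmp/matching_lines
```

    Unlike the perl version, this prints a line every time it occurs in the second file, not just once.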
  • 2020-11-27 10:35

    Maybe you mean comm?

    Compare sorted files FILE1 and FILE2 line by line.

    With no options, produce three-column output. Column one contains lines unique to FILE1, column two contains lines unique to FILE2, and column three contains lines common to both files.
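
    To make the three-column layout concrete, here is a small sketch using two made-up, already sorted sample files (the file names and contents are assumptions for illustration only):

```shell
# Hypothetical sample files; comm expects both inputs to be sorted.
dir=$(mktemp -d)
printf '%s\n' apple banana cherry > "$dir/file1"
printf '%s\n' banana cherry date > "$dir/file2"

# No options: column two is indented by one tab, column three by two tabs.
comm "$dir/file1" "$dir/file2"
```

    Here apple lands in column one (no indent), date in column two (one leading tab), and banana and cherry in column three (two leading tabs).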

    The secret to finding this kind of information is the info pages. For GNU programs, they are much more detailed than the man pages. Try info coreutils and it will list all the small, useful utilities.

  • 2020-11-27 10:38
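    # Remember every line of file1, then print the lines of file2 that were seen.
    # $0 compares whole lines; $1 would compare only the first field.
    awk 'NR==FNR{a[$0]++;next} $0 in a' file1 file2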
    
  • 2020-11-27 10:40

    If the two files are not sorted yet, you can use:

    comm -12 <(sort a.txt) <(sort b.txt)
    

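    This works and avoids the error message comm: file 2 is not in sorted order that you get from running comm -12 a.txt b.txt directly on unsorted input. The <(...) process substitution needs a shell that supports it, such as bash or zsh.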
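
    The same sort-then-comm idea can be stretched to more than two files, which the question also asks about. The following is a minimal sketch, not a standard tool: common_lines is a hypothetical helper name, the sample files are invented, and it uses a temporary file plus comm's - (standard input) operand instead of process substitution so that it also runs in a plain sh. It assumes mktemp is available.

```shell
# common_lines FILE... (hypothetical helper): print the lines that occur
# in every FILE. Duplicates are collapsed by sort -u.
common_lines() (
    LC_ALL=C; export LC_ALL          # make sort and comm agree on ordering
    tmp=$(mktemp -d) || exit 1
    sort -u "$1" > "$tmp/acc"        # running intersection starts as file 1
    shift
    for f in "$@"; do
        # Intersect the running result with the next file, read from stdin.
        sort -u "$f" | comm -12 "$tmp/acc" - > "$tmp/next"
        mv "$tmp/next" "$tmp/acc"
    done
    cat "$tmp/acc"
    rm -rf "$tmp"
)

# Invented, unsorted sample data.
dir=$(mktemp -d)
printf '%s\n' pear apple fig > "$dir/a.txt"
printf '%s\n' fig kiwi apple > "$dir/b.txt"
printf '%s\n' apple lime fig plum > "$dir/c.txt"

common_lines "$dir/a.txt" "$dir/b.txt" "$dir/c.txt"
```

    With this sample data, only apple and fig appear in all three files.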

  • 2020-11-27 10:41

    The command you are looking for is comm. For example:

    comm -12 1.sorted.txt 2.sorted.txt
    

    Here:

    -1 : suppress column 1 (lines unique to 1.sorted.txt)

    -2 : suppress column 2 (lines unique to 2.sorted.txt)
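
    Suppressing columns 1 and 2 leaves only column 3, the common lines. The other pairings work the same way, as this short sketch with invented file contents suggests (the variable names are only for illustration):

```shell
# Invented sample data, already sorted as comm requires.
dir=$(mktemp -d)
printf '%s\n' apple banana cherry > "$dir/1.sorted.txt"
printf '%s\n' banana cherry date > "$dir/2.sorted.txt"

common=$(comm -12 "$dir/1.sorted.txt" "$dir/2.sorted.txt")       # column 3 only
only_first=$(comm -23 "$dir/1.sorted.txt" "$dir/2.sorted.txt")   # column 1 only
only_second=$(comm -13 "$dir/1.sorted.txt" "$dir/2.sorted.txt")  # column 2 only

printf 'common:\n%s\n' "$common"
printf 'only in first: %s\n' "$only_first"
printf 'only in second: %s\n' "$only_second"
```

    For this data the common lines are banana and cherry, apple is unique to the first file, and date is unique to the second.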

  • 2020-11-27 10:42
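    # Remove any previous result (the loop appends to file3.out).
    rm -f file3.out

    # For each line of file1.out, scan file2.out for an identical line.
    while IFS= read -r line1
    do
            while IFS= read -r line2
            do
                    # Quote the right-hand side so it is compared literally, not as a pattern.
                    if [[ "$line1" == "$line2" ]]; then
                            printf '%s\n' "$line1" >>file3.out
                    fi
            done <file2.out
    done <file1.out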
    

    This should do it.
