Finding common values across multiple single-column files

温柔的废话 2020-12-11 22:13

I have 100 text files, each containing a single column of values. The files look like this:

file1.txt
10032
19873
18326

file2.txt
10032
19873
11254

file3.txt
15478
10032
112

4 Answers
  • 2020-12-11 22:23

    awk to the rescue!

    To find the elements common to all files (assuming values are unique within each file):

    awk '{a[$1]++} END{for(k in a) if(a[k]==ARGC-1) print k}' file*.txt
    

    This counts all occurrences and prints the values whose count equals the number of files.
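
    For example, with the three sample files from the question, only 10032 appears in all three, so:

    awk '{a[$1]++} END{for(k in a) if(a[k]==ARGC-1) print k}' file1.txt file2.txt file3.txt
    10032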

  • 2020-12-11 22:25

    This will work whether or not the same number can appear multiple times in one file:

    $ awk '{a[$0][ARGIND]} END{for (i in a) if (length(a[i])==ARGIND) print i}' file[123]
    10032
    

    The above uses GNU awk for true multi-dimensional arrays and ARGIND. There are easy tweaks for other awks if necessary, e.g. the following, which uses !seen[$0,FILENAME]++ to count each value at most once per file:

    $ awk '!seen[$0,FILENAME]++{a[$0]++} END{for (i in a) if (a[i]==ARGC-1) print i}' file[123]
    10032
    

    If the numbers are unique within each file, then all you need is:

    $ awk '(++c[$0])==(ARGC-1)' file*
    10032
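
    For readability, the first (GNU awk) command can be written out with comments; same logic, just expanded:

    awk '
        { a[$0][ARGIND] }                    # remember which files each value appeared in
        END {
            for (i in a)                     # in END, ARGIND is the number of files,
                if (length(a[i]) == ARGIND)  # so this means "seen in every file"
                    print i
        }
    ' file[123]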
    
  • 2020-12-11 22:28

    Here is one using Bash and comm, because I needed to know whether it would work. My test files were named 1, 2 and 3, hence the for f in ?:

    f=$(shuf -n1 -e ?)                     # pick one file randomly for initial comms file
    sort "$f" > comms

    for f in ?                             # this time for all files
    do
      comm -1 -2 <(sort "$f") comms > tmp  # comms should always be in sorted order
      # grep -Fxf "$f" comms > tmp         # another solution, thanks @Sundeep
      mv tmp comms
    done
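
    The same approach adapted to the question's file*.txt naming, as a sketch:

    sort file1.txt > comms                   # seed with any one file
    for f in file*.txt
    do
      comm -12 <(sort "$f") comms > tmp      # keep only values present in both
      mv tmp comms
    done
    cat comms                                # values common to all files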
    
  • 2020-12-11 22:38

    Files with a single column?

    You can sort and compare these files using the shell:

    for f in file*.txt; do sort "$f" | uniq; done | sort | uniq -c -d
    

    The last -c is not necessary; it is needed only if you want to count the number of occurrences. Note that -d prints values that occur in at least two files, not necessarily in all of them.
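
    To require a value to appear in every file, keep the count and filter on it; a minimal sketch along the same lines:

    n=$(ls file*.txt | wc -l)                            # number of files
    for f in file*.txt; do sort -u "$f"; done | sort | uniq -c |
        awk -v n="$n" '$1 == n {print $2}'               # keep values seen in all n files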
