Merging two files into one based on the first column

I have two files, both in the same format -- two columns both containing a number, for example:

file 1

1.00    99
2.00    343
3.00


                      
3 Answers

甜味超标
2021-01-14 11:27

You can do it with Alacon, the command-line utility for the AlaSQL database.

It runs on Node.js, so you need to install Node.js first and then the AlaSQL package (for example with npm install -g alasql).

To join the data from two tab-separated files, you can use the following command:

node alacon "SELECT * INTO TSV('main.txt') FROM TSV('data1.txt') data1 JOIN TSV('data2.txt') data2 USING [0]"


The whole command is a single, very long line. Note that the file names inside the SQL string use single quotes, so that they do not end the double-quoted shell argument.
                                                                        
                                                        
            
            
              
                
余生分开走
2021-01-14 11:28

join file1 file2


This assumes that both files are sorted on the join field. If they are not, you can sort them on the fly (the <( ) process substitution needs a shell such as bash):

join <(sort -V file1) <(sort -V file2)
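
As a concrete illustration of the sorted-input requirement, here is a minimal sketch on made-up sample data (the values and the .sorted file names are assumptions, not taken from the question). It uses plain temporary files instead of process substitution, so it should also run in shells without <( ), and it sorts with sort -k1,1 under LC_ALL=C so that sort and join agree on the ordering. The second join call additionally uses -a, -e and -o to keep keys that appear in only one file, which a plain join would drop.

```shell
# Sketch with hypothetical data; file names and values are assumptions.
export LC_ALL=C                      # make sort and join use the same collation
cd "$(mktemp -d)" || exit 1

printf '2.00 343\n1.00 99\n4.00 12\n' > file1      # unsorted; key 4.00 only here
printf '3.00 0.9\n1.00 0.5\n2.00 0.7\n' > file2    # unsorted; key 3.00 only here

sort -k1,1 file1 > file1.sorted
sort -k1,1 file2 > file2.sorted

# Inner join: only keys found in both files.
inner=$(join file1.sorted file2.sorted)

# Full outer join: keep unpaired lines from both files (-a 1 -a 2),
# fill missing fields with NA (-e) and fix the column layout (-o).
outer=$(join -a 1 -a 2 -e NA -o 0,1.2,2.2 file1.sorted file2.sorted)

printf '%s\n' "$inner" "---" "$outer"
```

With this data the inner join contains only the keys 1.00 and 2.00, while the outer join also lists 3.00 and 4.00 with NA in the column that has no value.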


Here's an AWK version (the sort compensates for AWK's non-deterministic array ordering):

awk '{a[$1]=a[$1] FS $2} END {for (i in a) print i a[i]}' file1 file2 | sort -V
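
For readers less used to AWK, the same one-liner can be spelled out with comments. This is only a sketch on hypothetical sample data (the values are made up), and it uses sort -n as a portable stand-in for sort -V, which is enough for plain decimal keys like these.

```shell
# Sketch with hypothetical data; the values are assumptions.
export LC_ALL=C
cd "$(mktemp -d)" || exit 1

printf '3.00 34\n1.00 99\n2.00 343\n' > file1     # deliberately unsorted
printf '2.00 0.7\n3.00 0.9\n1.00 0.5\n' > file2

merged=$(awk '
    # For every line of both files, append the value ($2) to the text
    # stored under the key ($1), separated by FS (a space by default).
    { a[$1] = a[$1] FS $2 }
    # After the last line, print each key followed by its collected values.
    END { for (i in a) print i a[i] }
' file1 file2 | sort -n)    # for-in order is unspecified, so sort afterwards

printf '%s\n' "$merged"
```

Because the values are collected in an array keyed by the first column, the input files do not need to be sorted beforehand; only the final output is sorted.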


It seems shorter and more readable than the Perl answer.

In gawk 4, you can set the array traversal order:

awk 'BEGIN {PROCINFO["sorted_in"] = "@ind_num_asc"} {a[$1]=a[$1] FS $2} END {for (i in a) print i a[i]}' file1 file2


and then you don't need the sort utility at all. "@ind_num_asc" stands for "index, numeric, ascending". See the sections "Controlling Array Traversal and Array Sorting" and "Using Predefined Array Scanning Orders with gawk" in the gawk manual.
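
PROCINFO["sorted_in"] is specific to gawk. If only a POSIX awk is available and it is enough to keep the keys in the order in which they first appear, one possible sketch is to record that order in a second array. The array name order and the counter n are hypothetical helpers introduced here, and the sample data is made up.

```shell
# Sketch with hypothetical data; "order" and "n" are illustrative names.
cd "$(mktemp -d)" || exit 1

printf '3.00 34\n1.00 99\n2.00 343\n' > file1
printf '2.00 0.7\n3.00 0.9\n1.00 0.5\n' > file2

ordered=$(awk '
    !($1 in a) { order[++n] = $1 }    # remember each key the first time it is seen
    { a[$1] = a[$1] FS $2 }           # same accumulation as in the one-liner
    END { for (k = 1; k <= n; k++) print order[k] a[order[k]] }
' file1 file2)

printf '%s\n' "$ordered"
```

This gives first-appearance order rather than numeric order, so it only replaces the sort step when the first file is already in the order you want.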

Note that -V (--version-sort) in the sort commands above requires GNU sort from coreutils 7.0 or later. Thanks to @simlev for pointing out that it should be used if available.
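
If a script has to run where -V may be missing, one option is to probe for it and fall back to -n. This is a hedged sketch: the variable name sort_flag is made up, and -n and -V only agree for simple non-negative decimal keys such as the ones in this question.

```shell
# Probe whether this sort accepts -V; fall back to numeric sort (-n) if not.
if printf '1\n' | sort -V >/dev/null 2>&1; then
    sort_flag=-V
else
    sort_flag=-n
fi

# Hypothetical sample lines, sorted with whichever flag is available.
sorted=$(printf '10.00 a\n2.00 b\n1.00 c\n' | LC_ALL=C sort "$sort_flag")
printf '%s\n' "$sorted"
```

Either flag puts 10.00 after 2.00 here, which a plain lexical sort would not do.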
                                                                        
                                                        
            
            
              
                
再見小時候
2021-01-14 11:34

A Perl solution:

perl -anE 'push @{$h{$F[0]}}, $F[1]; END{ say "$_\t$h{$_}->[0]\t$h{$_}->[1]" for sort{$a<=>$b} keys %h }' file_1 file_2 > file_3


OK, after looking at the awk one-liner, here is a second version. It is shorter than my first try, its output is nicer than that of the awk one-liner (tab-separated columns), and it does not need to be piped through sort:

perl -anE '$h{$F[0]}="$h{$F[0]}\t$F[1]"; END{say "$_$h{$_}" for sort {$a<=>$b} keys %h}' file_1 file_2


Note that the one-liners behave differently from the join example if the first file contains entries with no value in the second column.
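
To make that difference concrete: the one-liners leave an empty slot where the value is missing, whereas a plain join prints only the fields that exist, so the value from the second file slides into the second column. The sketch below shows this on made-up data (the values are assumptions) and also one way to make join keep the column position, using -e and -o.

```shell
# Sketch with hypothetical data; the values are assumptions.
export LC_ALL=C
cd "$(mktemp -d)" || exit 1

printf '1.00 99\n2.00 343\n3.00\n' > file_1        # 3.00 has no second column
printf '1.00 0.5\n2.00 0.7\n3.00 0.9\n' > file_2

# Plain join: the 3.00 line ends up with only two fields.
plain=$(join file_1 file_2)

# Fixed layout: key, value from file_1, value from file_2; NA where missing.
padded=$(join -e NA -o 0,1.2,2.2 file_1 file_2)

printf '%s\n' "$plain" "---" "$padded"
```

In the plain output the last line reads "3.00 0.9", while the padded variant gives "3.00 NA 0.9", so every line keeps three columns.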
                                                                        
                                                        
            
            
              
                