Awk: using a file to filter another one (out.tr)


星月不相逢 2021-02-02 04:18

I need help with awk, using one file to filter another. I have a main file:

...
17,466971 0,095185 17,562156 id 676
17,466971 0,096694 17,563665 id 677
17,466971 0,098


      
      
        
1 answer

闹比i (original poster) 2021-02-02 04:33

Here's one way using awk:

awk 'FNR==NR { a[$NF]; next } !($NF in a)' other main


Results:

17,466971 0,095185 17,562156 id 676
17,466971 0,096694 17,563665 id 677
17,466971 0,09816 17,565131 id 678
17,466971 0,099625 17,566596 id 679
17,466971 0,101091 17,568062 id 680
17,466971 0,101793 17,568764 id 682
17,466971 0,10253 17,569501 id 683
38,166772 0,08125 38,248022 id 1572
38,166772 0,082545 38,249317 id 1573
38,233772 0,082113 38,315885 id 1575
38,299771 0,081412 38,381183 id 1576
38,299771 0,083627 38,383398 id 1578
38,299771 0,085093 38,384864 id 1579
38,299771 0,085094 38,384865 id 1581


Drop the exclamation mark to show the 'deleted' lines:

awk 'FNR==NR { a[$NF]; next } $NF in a' other main


Results:

17,466971 0,016175 17,483146 id 681
38,233772 0,005457 38,239229 id 1574
38,299771 0,006282 38,306053 id 1577
38,299771 0,008682 38,308453 id 1580




Alternatively, if you'd like two files, one containing values 'present' and the other containing values 'deleted', try:

awk 'FNR==NR { a[$NF]; next } { print > ($NF in a ? "deleted" : "present") }' other main
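To try the split command end to end, here is a minimal sketch. The contents of 'other' are an assumption (one id per line), since the real file is not shown in the question; 'main' holds three lines taken from the results above, and the temporary directory is only for illustration.

```shell
# Work in a throwaway directory so the 'deleted' and 'present' files
# created by awk do not clutter the current directory.
tmp=$(mktemp -d)

# Assumed contents of 'other': one id per line (not shown in the question).
printf '681\n' > "$tmp/other"

# Three lines of 'main' taken from the results listed in the answer.
printf '%s\n' \
  '17,466971 0,101091 17,568062 id 680' \
  '17,466971 0,016175 17,483146 id 681' \
  '17,466971 0,101793 17,568764 id 682' > "$tmp/main"

# Same command as above: lines whose last field is a key of a go to
# 'deleted', all other lines go to 'present'.
(cd "$tmp" && awk 'FNR==NR { a[$NF]; next } { print > ($NF in a ? "deleted" : "present") }' other main)

deleted=$(cat "$tmp/deleted")
present=$(cat "$tmp/present")
printf 'deleted:\n%s\npresent:\n%s\n' "$deleted" "$present"
```

With this sample input, 'deleted' should hold only the id 681 line and 'present' the id 680 and id 682 lines.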




Explanation 1:

FNR==NR { ... } is a commonly used construct that is true only while awk reads the first file in the argument list (provided that file is not empty). In this case, awk reads the file 'other' first. While this file is being processed, the value in the last column ($NF) is added as a key to an array (which we have called a). next then skips the rest of the code for that line. Once the first file has been read, FNR is no longer equal to NR, so awk skips the FNR==NR { ... } block and runs the remainder of the code, which is applied to the second file in the argument list, 'main'. For example, !($NF in a) prints the line only if $NF is not in the array.
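To see why the FNR==NR test separates the two files, the following sketch prints both counters for every line. The two small files are hypothetical stand-ins for 'other' and 'main', created only for this demonstration.

```shell
# Hypothetical sample files in a temporary directory.
tmp=$(mktemp -d)
printf 'a\nb\n' > "$tmp/other"
printf 'x\ny\nz\n' > "$tmp/main"

# For each line, print the file name, the per-file line counter FNR,
# the overall line counter NR, and whether FNR==NR holds.
trace=$(cd "$tmp" && awk '{ print FILENAME, FNR, NR, (FNR == NR ? "first" : "later") }' other main)
printf '%s\n' "$trace"
```

FNR restarts at 1 for 'main' while NR keeps counting, so the two are equal only for the lines of 'other'.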

Explanation 2:

With regard to choosing a column, you may find this helpful:

$1         # the first column
$2         # the second column
$3         # the third column

$NF        # the last column
$(NF-1)    # the second-to-last column
$(NF-2)    # the third-to-last column
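As a quick check of these field references, this sketch applies a few of them to one line from the main file. The variable names line and fields are only illustrative.

```shell
# One line from the main file shown in the question.
line='17,466971 0,095185 17,562156 id 676'

# Print the first column, the last column, the second-to-last column,
# and the number of columns (NF), one per output line.
fields=$(printf '%s\n' "$line" | awk '{ print $1; print $NF; print $(NF-1); print NF }')
printf '%s\n' "$fields"
```

Because awk splits on whitespace by default, this line has five fields, so $NF is 676 and $(NF-1) is id.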

    
             
                                                        
            
            
              
                