Using awk to count the number of occurrences of a word in a column


03/03/2014 12:31:21 BLOCK 10.1.34.1 11:22:33:44:55:66

03/03/2014 12:31:22 ALLOW 10.1.34.2 AA:BB:CC:DD:EE:FF

03/03/2014 12:31:25 BLOCK 10.1.34.1 55:66:77:88:99:AA
6 Answers

        
                         				            
            
           
            
                              
                
              
              
                
                  攒了一身酷        
                
              
                            
                2020-12-03 11:10
              
            
            
                                                                       
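Here is a solution that needs no awk scripting beyond printing a column. You can string the steps together with pipes ("|").

awk '{print $3}' file | sort | uniq -c



awk '{print $3}'

prints the 3rd column; awk's default field separator is whitespace.
sort

sorts the results, so that identical values end up on adjacent lines (uniq only compares adjacent lines).
uniq -c

collapses each run of repeated lines into one line, prefixed with the number of occurrences.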
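As a rough sketch, the pipeline can be tried on the three sample lines from the question. The temporary file and the trailing sort -rn (which puts the most frequent value first) are additions for illustration, not part of the original command.

```shell
# Sample log lines copied from the question; the temp file exists only for this demo.
log=$(mktemp)
cat > "$log" <<'EOF'
03/03/2014 12:31:21 BLOCK 10.1.34.1 11:22:33:44:55:66
03/03/2014 12:31:22 ALLOW 10.1.34.2 AA:BB:CC:DD:EE:FF
03/03/2014 12:31:25 BLOCK 10.1.34.1 55:66:77:88:99:AA
EOF

# Extract column 3, group identical values, count them, most frequent first.
counts=$(awk '{print $3}' "$log" | sort | uniq -c | sort -rn)
printf '%s\n' "$counts"

rm -f "$log"
```

This reports 2 for BLOCK and 1 for ALLOW. The amount of padding in front of each count depends on the uniq implementation.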

                                                                        
                                                        
            
            
              
                
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
                  甜味超标        
                
              
                            
                2020-12-03 11:16
              
            
            
                                                                       
The error in your awk invocation is that the END block contains print $count. In awk, $ is the field operator: $count takes the value of the count variable, treats it as a number, and looks up the field with that index in the last line of input. What you really want is print count, which prints the value of the variable itself. Variable-referencing syntax differs between bash, awk, Python, and so on, so this is an easy mistake to make.
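The difference is easy to see on a one-line input. This is a minimal sketch; the input text and the variable name n are made up for illustration.

```shell
# n holds the number 2; $n is field number 2 of the current record.
out=$(echo 'alpha beta gamma' | awk '{ n = 2; print n; print $n }')
printf '%s\n' "$out"
# prints:
# 2
# beta
```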
                                                                        
                                                        
            
            
              
                
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
                  爱一瞬间的悲伤        
                
              
                            
                2020-12-03 11:19
              
            
            
                                                                       
The reason is that you need to print count rather than $count. Inside awk, variables are referenced by name, without a $; the $ operator selects an input field by number. With the sample data count is 2 at the end, so awk would try to print $2, the second field of the last record, rather than the count. The code below should work:

awk ' BEGIN {count=0;}  { if ($3 == "BLOCK") count+=1} END {print count}' firewall.log
                                                                        
                                                        
            
            
              
                
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
                  北海茫月        
                
              
                            
                2020-12-03 11:23
              
            
            
                                                                       
Your code may not be working because END is case sensitive. A lowercase end is treated as a variable named end used as a pattern; that variable is unset, so the pattern is never true and the last block is never executed.
If you change end to END, it should work.

Also, you do not need the BEGIN block, as uninitialized variables start out as 0 (in a numeric context).
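Both points can be checked with a two-line input. A small sketch, with the input and the variable name n chosen only for illustration:

```shell
# Lowercase "end" is an unset variable used as a pattern: it is false for
# every record, so its action never runs and nothing is printed.
lower=$(printf 'x\ny\n' | awk '{ n++ } end { print n }')

# Uppercase END runs once after the last record. n was never initialised,
# yet n++ counts up from 0 without a BEGIN block.
upper=$(printf 'x\ny\n' | awk '{ n++ } END { print n }')

printf 'lowercase end: [%s]\nuppercase END: [%s]\n' "$lower" "$upper"
# prints:
# lowercase end: []
# uppercase END: [2]
```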

Below I have added an alternative approach that you may want to use instead.

This is similar to glenn's, but it captures only the words you want, so it should use little memory.



This requires gawk, because the third argument of match() is a gawk extension.

awk 'match($3,/BLOCK|ALLOW/,b){a[b[0]]++}END{for(i in a)print i ,a[i]}' file


The action runs only if the third field contains BLOCK or ALLOW.

match() stores the matched text in the array b, with the whole match in b[0].

The element of array a indexed by the matched text is then incremented.

In the END block, each captured word is printed with its count of occurrences.



The output is (the line order may differ, because awk does not define the traversal order of for (i in a)):

ALLOW 1
BLOCK 2
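If gawk is not available, a comparable count can be sketched in portable awk. Note that this is a swapped-in variant, not the method above: it tests the third field for equality instead of calling match(), so it counts only fields that are exactly BLOCK or ALLOW rather than fields that merely contain them. The temporary file holds the sample lines from the question, and the final sort is added only to make the output order predictable.

```shell
# Sample log lines copied from the question; the temp file exists only for this demo.
log=$(mktemp)
cat > "$log" <<'EOF'
03/03/2014 12:31:21 BLOCK 10.1.34.1 11:22:33:44:55:66
03/03/2014 12:31:22 ALLOW 10.1.34.2 AA:BB:CC:DD:EE:FF
03/03/2014 12:31:25 BLOCK 10.1.34.1 55:66:77:88:99:AA
EOF

# Count only the two words of interest, using plain string comparison.
counts=$(awk '$3 == "BLOCK" || $3 == "ALLOW" { a[$3]++ }
              END { for (i in a) print i, a[i] }' "$log" | sort)
printf '%s\n' "$counts"
# prints:
# ALLOW 1
# BLOCK 2

rm -f "$log"
```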

                                                                        
                                                        
            
            
              
                
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
                  天命终不由人        
                
              
                            
                2020-12-03 11:24
              
            
            
                                                                       
I tested your statement

awk ' BEGIN {count=0;}  { if ($3 == "BLOCK") count+=1} end {print $count}' firewall.log


and was able to count BLOCK successfully after making two changes:


end should be in caps (END)
remove the $ from print $count


So, it should be:

awk ' BEGIN {count=0;}  { if ($3 == "BLOCK") count+=1} END {print count}' firewall.log 


A simpler statement that works too is:

awk '($3 == "BLOCK") {count++ } END { print count }' firewall.log
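One caveat with the shorter form: if no line matches, count is never assigned and print count outputs an empty line. A hedged sketch of one common workaround is to add 0, which forces a numeric context; the single-line input here is invented for illustration.

```shell
# No line has BLOCK in field 3, so count stays uninitialised.
empty=$(echo '03/03/2014 12:31:22 ALLOW' |
        awk '$3 == "BLOCK" { count++ } END { print count }')

# count + 0 converts the uninitialised value to the number 0.
zero=$(echo '03/03/2014 12:31:22 ALLOW' |
       awk '$3 == "BLOCK" { count++ } END { print count + 0 }')

printf 'without +0: [%s]\nwith +0:    [%s]\n' "$empty" "$zero"
# prints:
# without +0: []
# with +0:    [0]
```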

                                                                        
                                                        
            
            
              
                
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
                  情歌与酒        
                
              
                            
                2020-12-03 11:27
              
            
            
                                                                       
Use an array

awk '{count[$3]++} END {for (word in count) print word, count[word]}' file


If you want only "BLOCK", replace the END block with: END {print count["BLOCK"]}
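Put together, the single-word variant might look like the sketch below. Passing the word through -v and the variable name word are assumptions for illustration, and DROP is a made-up value that does not occur in the log; the + 0 makes the result 0 rather than empty when the word never appears.

```shell
# Sample log lines copied from the question; the temp file exists only for this demo.
log=$(mktemp)
cat > "$log" <<'EOF'
03/03/2014 12:31:21 BLOCK 10.1.34.1 11:22:33:44:55:66
03/03/2014 12:31:22 ALLOW 10.1.34.2 AA:BB:CC:DD:EE:FF
03/03/2014 12:31:25 BLOCK 10.1.34.1 55:66:77:88:99:AA
EOF

# Tally every value in column 3, then report only the requested word.
blocked=$(awk -v word=BLOCK '{ count[$3]++ } END { print count[word] + 0 }' "$log")
missing=$(awk -v word=DROP  '{ count[$3]++ } END { print count[word] + 0 }' "$log")

printf 'BLOCK: %s\nDROP: %s\n' "$blocked" "$missing"
# prints:
# BLOCK: 2
# DROP: 0

rm -f "$log"
```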
                                                                        
                                                        
            
            
              
                