Generating a frequency table from a file

暗喜 2021-01-31 07:51

Given an input file containing one single number per line, how could I get a count of how many times an item occurred in that file?

cat input.txt
1
2
1
3
1
0


        
6 Answers
  • 2021-01-31 08:25

    Another option:

    awk '{n[$1]++} END {for (i in n) print i,n[i]}' input.txt | sort -n > output.txt
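
    For the sample input, output.txt would then contain:

    0 1
    1 3
    2 1
    3 1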
    
  • 2021-01-31 08:26

    You mean you want a count of how many times each item appears in the input file? First sort it (using -n if the input is always numeric, as in your example), then count the unique results.

    sort -n input.txt | uniq -c
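
    With the sample input this prints each count followed by the value (exact spacing may vary):

          1 0
          3 1
          1 2
          1 3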
    
  • 2021-01-31 08:28

    In addition to the other answers, you can use awk to turn the counts into a simple text graph. (But, again, it's not a true histogram.)
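
    For example, a minimal sketch of that idea (not the original script; it builds on the sort/uniq pipeline from the answer above) could be:

    sort -n input.txt | uniq -c | awk '{printf "%s ", $2; for (i = 0; i < $1; i++) printf "*"; print ""}'

    For the sample input this would print:

    0 *
    1 ***
    2 *
    3 *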

  • 2021-01-31 08:35
    perl -lne '$h{$_}++; END{for $n (sort {$a <=> $b} keys %h) {print "$n\t$h{$n}"}}' input.txt
    

    - Loop over each line with -n
    - Each number read into $_ increments the hash %h
    - Once the END of input.txt has been reached, sort the hash keys numerically with {$a <=> $b}
    - Print each number $n and its frequency $h{$n}
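
    Run against the sample input.txt, this prints each value and its frequency:

    0   1
    1   3
    2   1
    3   1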

    Similar code that works on floating-point input by counting each value under its integer part:

    perl -lne '$h{int($_)}++; END{for $n (sort {$a <=> $b} keys %h) {print "$n\t$h{$n}"}}' float.txt
    

    float.txt

    1.732
    2.236
    1.442
    3.162
    1.260
    0.707
    

    output:

    0       1
    1       3
    2       1
    3       1
    
  • 2021-01-31 08:43

    Using maphimbu from the Debian stda package:

    # use 'jot' to generate 100 random numbers between 1 and 5
    # and 'maphimbu' to print sorted "histogram":
    jot -r 100 1 5 | maphimbu -s 1
    

    Output:

                 1                20
                 2                21
                 3                20
                 4                21
                 5                18
    

    maphimbu also works with floating point:

    jot -r 100.0 10 15 | numprocess /%10/ | maphimbu -s 1
    

    Output:

                 1                21
               1.1                17
               1.2                14
               1.3                18
               1.4                11
               1.5                19
    
  • 2021-01-31 08:45

    At least some of that can be done with

    sort input.txt | uniq -c
    

    But the column order is reversed: uniq -c prints the count first, then the value. This will fix that:

    sort input.txt | uniq -c | awk '{print $2, $1}'
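
    With the sample input.txt that yields:

    0 1
    1 3
    2 1
    3 1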
    