How to normalize a histogram in python?

前端未结

关注

 5  1168

I\'m trying to plot normed histogram, but instead of getting 1 as maximum value on y axis, I\'m getting different numbers.

For array k=(1,4,3,1)

 import


                      
              相关标签:


      
      
        
          5条回答        

        
                         				            
            
           
            
                              
                
              
              
                
                  爱一瞬间的悲伤        
                
              
                            
                2021-02-12 11:48
              
            
            
                                                                       
A normed histogram is defined such that the sum of products of width and height of each column is equal to the total count. That's why you are not getting your max equal to one.

However, if you still want to force it to be 1, you could use numpy and matplotlib.pyplot.bar in the following way

sample = np.random.normal(0,10,100)
#generate bins boundaries and heights
bin_height,bin_boundary = np.histogram(sample,bins=10)
#define width of each column
width = bin_boundary[1]-bin_boundary[0]
#standardize each column by dividing with the maximum height
bin_height = bin_height/float(max(bin_height))
#plot
plt.bar(bin_boundary[:-1],bin_height,width = width)
plt.show()

                                                                        
                                                        
            
            
              
                
                0
              
                 
                
               讨论(0)
              
              
                                                   
              
                                                            
            
                      
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
                  忘了有多久        
                
              
                            
                2021-02-12 11:53
              
            
            
                                                                       
How the lines above:

weights = np.ones_like(myarray)/float(len(myarray))
plt.hist(myarray, weights=weights)


should work when I have a stacked histogram like this?-

n, bins, patches = plt.hist([from6to10, from10to14, from14to18, from18to22,  from22to6],
label= ['06:00-10:00','10:00-14:00','14:00-18:00','18:00- 22:00','22:00-06:00'],
stacked=True,edgecolor='black', alpha=0.8, linewidth=0.5, range=(np.nanmin(ref1arr),
stacked=True,edgecolor='black', alpha=0.8, linewidth=0.5, range=(np.nanmin(ref1arr), np.nanmax(ref1arr)), bins=10)

                                                                        
                                                        
            
            
              
                
                0
              
                 
                
               讨论(0)
              
              
                                                   
              
                                                            
            
                      
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
                  傲寒        
                
              
                            
                2021-02-12 11:53
              
            
            
                                                                       
When you plot a normalized histogram, it is not the height that should sum up to one, but the area underneath the curve should sum up to one:

In [44]:

import matplotlib.pyplot as plt
k=(3,3,3,3)
x, bins, p=plt.hist(k, density=True)  # used to be normed=True in older versions
from numpy import *
plt.xticks( arange(10) ) # 10 ticks on x axis
plt.show()  
In [45]:

print bins
[ 2.5  2.6  2.7  2.8  2.9  3.   3.1  3.2  3.3  3.4  3.5]


Here, this example, the bin width is 0.1, the area underneath the curve sums up to one (0.1*10).

To have the sum of height to be 1, add the following before plt.show():

for item in p:
    item.set_height(item.get_height()/sum(x))



                                                                        
                                                        
            
            
              
                
                0
              
                 
                
               讨论(0)
              
              
                                                   
              
                                                            
            
                      
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
                  南旧        
                
              
                            
                2021-02-12 11:56
              
            
            
                                                                       
One way is to get the probabilities on your own, and then plot with plt.bar:

In [91]: from collections import Counter
    ...: c=Counter(k)
    ...: print c
Counter({1: 2, 3: 1, 4: 1})

In [92]: plt.bar(prob.keys(), prob.values())
    ...: plt.show()


result:

                                                                        
                                                        
            
            
              
                
                0
              
                 
                
               讨论(0)
              
              
                                                   
              
                                                            
            
                      
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
                  醉话见心        
                
              
                            
                2021-02-12 12:00
              
            
            
                                                                       
You could use the solution outlined here:

weights = np.ones_like(myarray)/float(len(myarray))
plt.hist(myarray, weights=weights)

                                                                        
                                                        
            
            
              
                
                0
              
                 
                
               讨论(0)
              
              
                                                   
              
                                                            
            
                      
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
          	          
                             
        
        
          
            
            
              
              
            
    


                                 
              
            
                          
    

        
         
                验证码
                
                  
                
                
                   看不清?
                
              
                                  
                    
   
                 
             
              提交回复