Plotting a histogram with overlaid PDF

后端未结

关注

 1  1867

This is a follow-up to my previous couple of questions. Here\'s the code I\'m playing with:

import pandas as pd
import matplotlib.pyplot as plt
import scipy.stat


                      
              相关标签:


      
      
        
          1条回答        

        
                         				            
            
           
            
                              
                
              
              
                
                  甜味超标        
                
              
                            
                2021-01-21 04:42
              
            
            
                                                                       
You should plot the histogram with density=True if you hope to compare it to a true PDF. Otherwise your normalization (amplitude) will be off.

Also, you need to specify the x-values (as an ordered array) when you plot the pdf:

fig, ax = plt.subplots()

df2[df2[column] > -999].hist(column, alpha = 0.5, density=True, ax=ax)

param = stats.norm.fit(df2[column].dropna())
x = np.linspace(*df2[column].agg([min, max]), 100) # x-values

plt.plot(x, stats.norm.pdf(x, *param), color = 'r')
plt.show()






As an aside, using a histogram to compare continuous variables with a distribution is isn't always the best. (Your sample data are discrete, but the link uses a continuous variable).  The choice of bins can alias the shape of your histogram, which may lead to incorrect inference. Instead, the ECDF is a much better (choice-free) illustration of the distribution for a continuous variable:

def ECDF(data):
    n = sum(data.notnull())
    x = np.sort(data.dropna())
    y = np.arange(1, n+1) / n
    return x,y

fig, ax = plt.subplots()

plt.plot(*ECDF(df2.loc[df2[column] > -999, 'B']), marker='o')

param = stats.norm.fit(df2[column].dropna())
x = np.linspace(*df2[column].agg([min, max]), 100) # x-values

plt.plot(x, stats.norm.cdf(x, *param), color = 'r')
plt.show()



                                                                        
                                                        
            
            
              
                
                0
              
                 
                
               讨论(0)
              
              
                                                   
              
                                                            
            
                      
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
          	          
                             
        
        
          
            
            
              
              
            
    


                                 
              
            
                          
    

        
         
                验证码
                
                  
                
                
                   看不清?
                
              
                                  
                    
   
                 
             
              提交回复