Distribution of outcomes in dice experiments

So I wrote a short Python function to plot the distribution of outcomes of dice experiments. It's working fine, but when I run, for example, dice(1,5000) or dice(10


                      
3 answers

日久生厌, 2020-12-21 06:48

Your plot shows only 5 bars. Each bar is drawn to the right of its number, so I believe the results for 5 and 6 are being combined into the last bar. If you change the bins to range(1,8), you should see more of what you expect.
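
The merging of 5 and 6 can be reproduced without plotting. The sketch below is an illustration, not the asker's code: it assumes the original call passed integer bin edges equivalent to range(1, 7) (the question's code is not shown), and it uses np.histogram, which plt.hist calls internally to do the binning.

```python
import numpy as np

# Ten rolls of each face, so every face should get a count of 10.
rolls = np.repeat(np.arange(1, 7), 10)

# Edges 1..6 define only five bins; the last bin [5, 6] is closed on
# both sides, so it collects the 5s and the 6s together.
counts_short, _ = np.histogram(rolls, bins=np.arange(1, 7))

# Edges 1..7 define six bins, one per face.
counts_full, _ = np.histogram(rolls, bins=np.arange(1, 8))

print(counts_short)  # [10 10 10 10 20]
print(counts_full)   # [10 10 10 10 10 10]
```

With edges 1 to 6 the last count is doubled, which matches the oversized final bar described in the question.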


                                                                        
                                                        
            
            
              
                

执笔经年, 2020-12-21 06:55

If you are lazy (like me), you can also use numpy to directly generate a matrix and seaborn to deal with bins for you: 

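import numpy as np
import seaborn as sns

dices = 1000   # number of dice
throws = 5000  # number of throws per die
# randint(6) draws integers 0..5, so add 1 to get the faces 1..6
x = np.random.randint(6, size=(dices, throws)) + 1
# flatten so that all outcomes go into a single histogram
sns.distplot(x.ravel())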


Which gives: [histogram image not preserved]



Seaborn usually makes good choices, which can save a bit of configuration time. It is worth a try at least. You can also pass kde=False to the seaborn plot to get rid of the density estimate.

Just for the sake of it, and to show how seaborn behaves, here is the same thing for the sum over 100 dice:

dices = 100
throws = 5000
x = np.random.randint(6, size=(dices, throws)) + 1
# axis=0 sums over the dice, giving one total per throw
sns.distplot(x.sum(axis=0), kde=False)
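
As a sanity check on the summed histogram, the totals can be compared with what theory predicts. A single fair die has mean 3.5 and variance 35/12, so the sum of 100 independent dice should have mean 350 and standard deviation sqrt(100 * 35/12), roughly 17.08. The sketch below is an assumption-laden illustration: it swaps in NumPy's Generator API (np.random.default_rng) with an arbitrary seed in place of np.random.randint, and it is not part of the original answer.

```python
import numpy as np

rng = np.random.default_rng(0)  # seed chosen arbitrarily for reproducibility

dices = 100
throws = 5000
# integers(1, 7) draws the faces 1..6 (the upper bound is exclusive)
x = rng.integers(1, 7, size=(dices, throws))
totals = x.sum(axis=0)  # one total per throw

expected_mean = dices * 3.5
expected_std = np.sqrt(dices * 35 / 12)

print(f"sample mean {totals.mean():.2f}, expected {expected_mean:.2f}")
print(f"sample std  {totals.std():.2f}, expected {expected_std:.2f}")
```

The sample values should land close to the expected ones, and the histogram of totals should look roughly bell-shaped around 350.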



                                                                        
                                                        
            
            
              
                

Happy的楠姐, 2020-12-21 06:58

Based on a sample run of your code, the issue is a plotting problem, not a computational one, which is why you are seeing the correct mean. As you can see, the following image shows five bars, the last one being twice the size of the others:



Notice also that the bars are labeled on the left, and there is therefore no "6" bar. This has to do with what plt.hist means by bins:


  If bins is a sequence, it defines the bin edges, including the left edge of the first bin and the right edge of the last bin; in this case, bins may be unequally spaced. All but the last (righthand-most) bin is half-open.


So to specify bin edges, you probably want something more like

plt.hist(np.ravel(result), bins=np.arange(0.5, 7.5, 1))
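
To see why half-integer edges work, it helps to look at the bins they define. The following sketch (the sample data and the seed are illustrative assumptions) checks that np.arange(0.5, 7.5, 1) yields seven edges, hence six bins, each centered on one face.

```python
import numpy as np

edges = np.arange(0.5, 7.5, 1)          # 0.5, 1.5, ..., 6.5
centers = (edges[:-1] + edges[1:]) / 2  # midpoints of consecutive edges

rng = np.random.default_rng(1)          # arbitrary seed
rolls = rng.integers(1, 7, size=6000)   # faces 1..6
counts, _ = np.histogram(rolls, bins=edges)

print(centers)  # [1. 2. 3. 4. 5. 6.]
print(counts)   # six counts, one per face
```

Because no roll ever falls exactly on an edge, the half-open versus closed distinction for the last bin no longer matters.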


And the result: [histogram image not preserved]



Unasked Questions

If you want to simulate N * n data points, you can use numpy directly. Replace your original initialization of result and the for loop with any of the following lines:

result = (np.random.uniform(size=(n, N)) * 6 + 1).astype(int)
result = np.random.uniform(1.0, 7.0, size=(n, N)).astype(int)
result = np.random.randint(1, 7, size=(n, N))


The last line is preferable in terms of efficiency and accuracy.
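
A quick way to convince yourself that the three variants are interchangeable is to check that each produces only the faces 1 to 6. This is a sketch with made-up values for n and N, and it uses the Generator API rather than the legacy np.random functions shown above. The float-based variants rely on truncation by astype(int), which is one reason the direct integer draw is the more straightforward choice.

```python
import numpy as np

rng = np.random.default_rng(2)  # arbitrary seed
n, N = 10, 5000                 # illustrative sizes, not taken from the question

# Scale a uniform [0, 1) sample to [1, 7), then truncate to 1..6.
a = (rng.random(size=(n, N)) * 6 + 1).astype(int)
# Draw directly from [1, 7), then truncate to 1..6.
b = rng.uniform(1.0, 7.0, size=(n, N)).astype(int)
# Draw integers directly; the upper bound 7 is exclusive.
c = rng.integers(1, 7, size=(n, N))

for name, arr in [("scaled", a), ("uniform", b), ("integers", c)]:
    print(name, arr.min(), arr.max(), round(float(arr.mean()), 3))
```

Each line should report a minimum of 1, a maximum of 6, and a mean near 3.5.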

Another possible improvement is in how you compute the histogram. Right now, you are using plt.hist, which calls np.histogram and plt.bar. For small integers like yours, np.bincount is arguably a much better binning technique:

count = np.bincount(result.ravel())[1:]
plt.bar(np.arange(1, 7), count)


Notice that this also simplifies the plotting, since you specify the centers of the bars directly instead of having plt.hist guess them for you.
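
One caveat with the bincount approach: np.bincount returns an array only as long as the largest value present requires, so in a small sample where no 6 is rolled, the slice [1:] has fewer than six entries and no longer matches np.arange(1, 7). Passing minlength=7 avoids this. A minimal sketch with a deliberately tiny, hand-written sample:

```python
import numpy as np

result = np.array([[1, 2], [2, 4]])  # tiny sample with no 3, 5 or 6

# minlength=7 guarantees slots for the values 0..6 even if some never occur
count = np.bincount(result.ravel(), minlength=7)[1:]
faces = np.arange(1, 7)

print(dict(zip(faces.tolist(), count.tolist())))
# {1: 1, 2: 2, 3: 0, 4: 1, 5: 0, 6: 0}
```

The resulting count array always has six entries, so it can be passed to plt.bar(faces, count) regardless of which faces appeared.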
                                                                        
                                                        
            
            
              
                