How to perform two-sample one-tailed t-test with numpy/scipy

In R, it is possible to perform a two-sample one-tailed t-test simply by using

> A = c(0.19826790, 1.36836629, 1.37950911, 1.46951540, 1.481977


                      
5 Answers

你的背包, 2020-12-07 21:02

I tried to add these insights as comments on the accepted answer, but the restrictions on comments made it impossible to write them down properly, so I decided to put my two cents in as a full answer.

First let's formulate our investigative question properly. The data we are investigating is

A = np.array([0.19826790, 1.36836629, 1.37950911, 1.46951540, 1.48197798, 0.07532846])
B = np.array([0.6383447, 0.5271385, 1.7721380, 1.7817880])


with the sample means

A.mean() = 0.99549419
B.mean() = 1.1798523


I assume that since the mean of B is obviously greater than the mean of A, you would like to check if this result is statistically significant.

So we have the Null Hypothesis

H0: mean(A) >= mean(B)


that we would like to reject in favor of the Alternative Hypothesis

H1: mean(B) > mean(A)


Now when you call scipy.stats.ttest_ind(x, y), it runs a hypothesis test on the value of x.mean() - y.mean(). To get positive values throughout the calculation (which simplifies all considerations), we therefore have to call

stats.ttest_ind(B, A)


instead of stats.ttest_ind(A, B). We get as an answer


t-value = 0.42210654140239207
p-value = 0.68406235191764142


and since, according to the documentation, this is the output of a two-tailed t-test, we must divide p by 2 for our one-tailed test. This is valid here because t is positive, i.e. it points in the direction of H1. So depending on the significance level alpha you have chosen, you need

p/2 < alpha


in order to reject the Null Hypothesis H0. For alpha=0.05 this is clearly not the case so you cannot reject H0.
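
The steps above can be sketched in a few lines. This is a minimal example, not the only way to do it: it runs the two-tailed test with B first and halves the p-value by hand. The last call uses the `alternative="greater"` keyword, which assumes SciPy 1.6 or later; on older versions only the manual halving is available.

```python
import numpy as np
from scipy import stats

A = np.array([0.19826790, 1.36836629, 1.37950911, 1.46951540, 1.48197798, 0.07532846])
B = np.array([0.6383447, 0.5271385, 1.7721380, 1.7817880])

# Two-tailed test on B.mean() - A.mean(); equal variances are assumed by default.
t_stat, p_two_tailed = stats.ttest_ind(B, A)

# t_stat > 0 points in the direction of H1, so the one-tailed p-value is half.
p_one_tailed = p_two_tailed / 2

# Assumption: SciPy >= 1.6, where ttest_ind accepts the `alternative` keyword.
p_greater = stats.ttest_ind(B, A, alternative="greater").pvalue

alpha = 0.05
print(t_stat, p_one_tailed, p_greater)
print("reject H0" if p_one_tailed < alpha else "cannot reject H0")
```

Both routes should give the same one-tailed p-value, which is well above alpha=0.05.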

An alternative way to decide if you reject H0 without having to do any algebra on t or p is by looking at the t-value and comparing it with the critical t-value t_crit at the desired level of confidence (e.g. 95%) for the number of degrees of freedom df that applies to your problem. Since we have

df = sample_size_1 + sample_size_2 - 2 = 8


we get from a one-tailed t-distribution table that

t_crit(df=8, confidence_level=95%) = 1.860


We clearly have

t < t_crit


so we obtain again the same result, namely that we cannot reject H0.
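
If no table is at hand, the critical value can also be computed. The sketch below assumes `scipy.stats.t.ppf` (the inverse CDF of Student's t distribution) as a stand-in for the table lookup, and reuses the t-value reported above.

```python
from scipy import stats

n_a, n_b = 6, 4                  # sample sizes of A and B
df = n_a + n_b - 2               # degrees of freedom of the pooled test
t_value = 0.42210654140239207    # statistic reported by ttest_ind(B, A)

# One-tailed critical value: 95% of the t distribution lies below it.
t_crit = stats.t.ppf(0.95, df)

print(df, t_crit)
print("reject H0" if t_value > t_crit else "cannot reject H0")
```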
                                                                        
                                                        
            
            
              
                
一个人的身影, 2020-12-07 21:04

Suppose the null hypothesis is H0: P1 >= P2 and the alternative hypothesis is Ha: P1 < P2. To test it in Python, call ttest_ind(P2, P1) (note that P2 comes first), so that a positive statistic supports Ha.

import numpy as np
from scipy import stats

first = np.random.normal(3, 2, 400)   # plays the role of P2
second = np.random.normal(6, 2, 400)  # plays the role of P1
stats.ttest_ind(first, second, axis=0, equal_var=True)


You will get a result like the one below:
Ttest_indResult(statistic=-20.442436213923845, pvalue=5.0999336686332285e-75)

ttest_ind reports a two-tailed p-value. When statistic < 0, the one-tailed p-value for Ha is real_pvalue = 1 - output_pvalue/2 = 1 - 5.0999336686332285e-75/2, which is essentially 1. Since this p-value is larger than 0.05, you cannot reject the null hypothesis that 6 >= 3. When statistic > 0, the one-tailed p-value is pvalue/2. (The statistic for P1 - P2 is simply -statistic.)

Stated in the terms of Ivc's answer: when t < 0, the one-tailed p-value is p/2 for the less-than alternative and 1 - p/2 for the greater-than alternative.
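
The sign rule can be checked on simulated data. This is a minimal sketch: the seed is an arbitrary choice made only so the run is reproducible, and the names `p_less` and `p_greater` are introduced here for illustration. They are the one-tailed p-values for the alternatives "mean(first) < mean(second)" and "mean(first) > mean(second)".

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)   # fixed seed, chosen only for reproducibility
first = rng.normal(3, 2, 400)
second = rng.normal(6, 2, 400)

# Two-tailed test on first.mean() - second.mean().
stat, p_two = stats.ttest_ind(first, second, equal_var=True)

# Derive both one-tailed p-values from the sign of the statistic.
if stat < 0:
    p_less, p_greater = p_two / 2, 1 - p_two / 2
else:
    p_less, p_greater = 1 - p_two / 2, p_two / 2

print(stat, p_less, p_greater)
```

Because the first sample has the smaller mean, the statistic is negative, `p_less` is tiny and `p_greater` is close to 1.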
                                                                        
                                                        
            
            
              
                
爱一瞬间的悲伤, 2020-12-07 21:04

Did you look at this question:
How to calculate the statistics "t-test" with numpy

I think that is exactly what this question is asking about.

Basically:

import scipy.stats
x = [1,2,3,4]
scipy.stats.ttest_1samp(x, 0)

Ttest_1sampResult(statistic=3.872983346207417, pvalue=0.030466291662170977)


This is the same result as in this R example: https://stats.stackexchange.com/questions/51242/statistical-difference-from-zero
                                                                        
                                                        
            
            
              
                
时光说笑, 2020-12-07 21:10

From your mailing list link:


  because the one-sided tests can be backed out from the two-sided
  tests. (With symmetric distributions one-sided p-value is just half
  of the two-sided pvalue)


It goes on to say that scipy always gives the test statistic as signed. This means that given p and t values from a two-tailed test, you would reject the null hypothesis of a greater-than test when p/2 < alpha and t > 0, and of a less-than test when p/2 < alpha and t < 0.
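
The decision rule above fits in a small helper. The function name `reject_one_tailed` and its parameters are hypothetical, introduced only for this sketch; it takes the signed t and the two-tailed p that scipy reports. The example inputs are the t and p values quoted in the other answers on this page.

```python
def reject_one_tailed(t, p_two_tailed, alpha=0.05, alternative="greater"):
    """Decide a one-tailed test from a signed t and a two-tailed p-value."""
    if alternative == "greater":
        return p_two_tailed / 2 < alpha and t > 0
    if alternative == "less":
        return p_two_tailed / 2 < alpha and t < 0
    raise ValueError("alternative must be 'greater' or 'less'")

# Small positive t with a large p-value: no rejection.
small = reject_one_tailed(0.42210654140239207, 0.68406235191764142)

# Strongly negative t: the less-than test rejects, the greater-than test does not.
less = reject_one_tailed(-20.442436213923845, 5.0999336686332285e-75, alternative="less")
greater = reject_one_tailed(-20.442436213923845, 5.0999336686332285e-75, alternative="greater")

print(small, less, greater)
```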
                                                                        
                                                        
            
            
              
                
佛祖请我去吃肉, 2020-12-07 21:21

    import numpy as np
    from scipy.stats import ttest_ind

    def t_test(x, y, alternative='both-sided'):
        # Welch's t-test (equal_var=False); ttest_ind returns a two-tailed p-value.
        _, double_p = ttest_ind(x, y, equal_var=False)
        if alternative == 'both-sided':
            pval = double_p
        elif alternative == 'greater':
            # H1: mean(x) > mean(y)
            if np.mean(x) > np.mean(y):
                pval = double_p / 2.
            else:
                pval = 1.0 - double_p / 2.
        elif alternative == 'less':
            # H1: mean(x) < mean(y)
            if np.mean(x) < np.mean(y):
                pval = double_p / 2.
            else:
                pval = 1.0 - double_p / 2.
        else:
            raise ValueError("alternative must be 'both-sided', 'greater' or 'less'")
        return pval

    A = [0.19826790, 1.36836629, 1.37950911, 1.46951540, 1.48197798, 0.07532846]
    B = [0.6383447, 0.5271385, 1.7721380, 1.7817880]

    print(t_test(A, B, alternative='greater'))
    # 0.6555098817758839

                                                                        
                                                        
            
            
              
                