scipy p-value returns 0.0

Question

Using a two-sample Kolmogorov-Smirnov test, I am getting a p-value of 0.0.

>>> scipy.stats.ks_2samp(dataset1, dataset2)
(0.65296076312083573, 0.0)

Looking at the histograms of the two datasets, I am quite confident they come from two different distributions. But really, p = 0.0? That doesn't seem to make sense. Shouldn't it be a very small but positive number?

I know the return value is of type numpy.float64. Does that have something to do with it?

EDIT: data here: https://www.dropbox.com/s/jpixhz0pcybyh1t/data4stack.csv

>>> scipy.version.full_version
'0.13.2'

Answer 1:

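Yes, the probability is very small, so small that it falls below the smallest positive value a 64-bit float can represent and underflows to 0.0. Running the test against progressively larger slices of dataset2 shows the p-value shrinking until that happens: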

>>> from pprint import pprint
>>> pprint([(i, scipy.stats.ks_2samp(dataset1, dataset2[:i])[1])
...         for i in range(200, len(dataset2), 200)])
[(200, 3.1281733251275881e-63),
 (400, 3.5780609056448825e-157),
 (600, 9.2884803664366062e-225),
 (800, 7.1429666685167604e-293),
 (1000, 0.0),
 (1200, 0.0),
 (1400, 0.0),
 (1600, 0.0),
 (1800, 0.0),
 (2000, 0.0),
 (2200, 0.0),
 (2400, 0.0)]
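
If a rough magnitude is needed even after the p-value underflows, one option is to work with its logarithm instead. The sketch below is not SciPy's implementation: it keeps only the leading term of the asymptotic Kolmogorov series, Q(x) = 2 * sum((-1)**(j-1) * exp(-2 * j**2 * x**2)), which dominates for large x. The helper name `approx_log_ks_pvalue` and the sample sizes are hypothetical; only the statistic D comes from the question.

```python
import math
import sys

# Smallest positive normal float64. Values far below this cannot be stored
# and are rounded to 0.0.
smallest_normal = sys.float_info.min


def approx_log_ks_pvalue(d, n1, n2):
    """Approximate log(p) for a two-sample KS test (hypothetical helper).

    Keeps only the j = 1 term of the asymptotic Kolmogorov series,
    so log(p) is roughly log(2) - 2 * x**2 with x = sqrt(n1*n2/(n1+n2)) * d.
    """
    en = math.sqrt(n1 * n2 / (n1 + n2))  # effective sample size factor
    x = en * d
    return math.log(2.0) - 2.0 * x * x


# D is the statistic reported in the question; the sample sizes are assumed
# purely for illustration.
d = 0.65296076312083573
log_p = approx_log_ks_pvalue(d, n1=2000, n2=2000)

# The logarithm is an ordinary finite number...
log10_p = log_p / math.log(10)

# ...but exponentiating it falls below the float64 range and gives 0.0.
p = math.exp(log_p)
```

Here `log10_p` gives the approximate order of magnitude of the p-value, while `p` itself is 0.0 because `log_p` lies well below `math.log(smallest_normal)`.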

Source: https://stackoverflow.com/questions/20530138/scipy-p-value-returns-0-0

Tags

python

statistics

scipy
