How to calculate p-value for two lists of floats?

轮回少年 2021-02-04 06:53

So I have lists of floats, like [1.33, 2.555, 3.2134, 4.123123] etc. Those lists are mean frequencies of something. How do I prove that two lists are different? I thou

1 answer
  •  小蘑菇 (original poster)
    2021-02-04 07:01

    Let's say you have several lists of floats, stored in a dict like this:

    >>> data = {
    ...     'a': [0.9, 1.0, 1.1, 1.2],
    ...     'b': [0.8, 0.9, 1.0, 1.1],
    ...     'c': [4.9, 5.0, 5.1, 5.2],
    ... }
    

    Clearly, a is very similar to b, but both are different from c.

    There are two kinds of comparisons you may want to do.

    1. Pairwise: Is a similar to b? Is a similar to c? Is b similar to c?
    2. Combined: Are a, b and c drawn from the same group? (This is generally a better question)

    The former can be achieved using independent t-tests as follows:

    >>> from itertools import combinations
    >>> from scipy.stats import ttest_ind
    >>> # Compare every pair of lists with an independent two-sample t-test
    >>> for list1, list2 in combinations(data, 2):
    ...     t, p = ttest_ind(data[list1], data[list2])
    ...     print(list1, list2, '%.6g' % p)
    ...
    a b 0.315334
    a c 9.45895e-09
    b c 8.15964e-09
    

    This gives the relevant p-values. They suggest that a and c are different and that b and c are different, while there is no evidence that a and b differ (p ≈ 0.32), so a and b may be similar.
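
    To show where such a p-value comes from, here is a minimal sketch of the equal-variance two-sample t-test using only the standard library. The function name `student_t_pvalue` and its `steps` parameter are illustrative and not part of SciPy. The sketch assumes moderate t values: the tail probability is obtained by numerical integration, so it loses accuracy for very small p-values (such as a versus c), where `ttest_ind` should be used instead.

```python
import math
from statistics import mean, variance


def student_t_pvalue(x, y, steps=2000):
    """Return (t, two-sided p) for an equal-variance two-sample t-test (sketch)."""
    nx, ny = len(x), len(y)
    df = nx + ny - 2

    # Pooled sample variance, then the t statistic.
    pooled = ((nx - 1) * variance(x) + (ny - 1) * variance(y)) / df
    t = (mean(x) - mean(y)) / math.sqrt(pooled * (1 / nx + 1 / ny))

    # Density of Student's t distribution with df degrees of freedom.
    norm = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))

    def pdf(u):
        return norm * (1 + u * u / df) ** (-(df + 1) / 2)

    # Simpson's rule (steps must be even) for the probability between 0 and |t|.
    upper = abs(t)
    h = upper / steps
    total = pdf(0.0) + pdf(upper)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    central = total * h / 3

    # Two-sided p-value: probability of a statistic at least as extreme as |t|.
    return t, 1 - 2 * central


a = [0.9, 1.0, 1.1, 1.2]
b = [0.8, 0.9, 1.0, 1.1]
t, p = student_t_pvalue(a, b)
print(t, p)
```

    For a and b this should agree with the p-value of about 0.315 reported by `ttest_ind` above, since `ttest_ind` also assumes equal variances by default.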

    The latter can be achieved using the one-way ANOVA as follows:

    >>> from scipy.stats import f_oneway
    >>> f, p = f_oneway(*data.values())
    >>> p
    7.959305946160327e-12
    

    The p-value indicates that a, b, and c are unlikely to be from the same population.
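
    As a rough illustration of what the one-way ANOVA computes, the F statistic can be sketched by hand with the standard library. The names `one_way_anova_f` and `p_value_three_groups` are hypothetical helpers, not SciPy functions. The p-value shortcut is an assumption-laden special case: it uses a closed form of the F distribution's tail that holds only when there are exactly three groups (two numerator degrees of freedom); in general, use `f_oneway`.

```python
from statistics import mean


def one_way_anova_f(*groups):
    """Return (F, df_between, df_within) for a one-way ANOVA (sketch)."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand = sum(sum(g) for g in groups) / n

    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    # Variation of the values around their own group mean.
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)

    df_between, df_within = k - 1, n - k
    f = (ss_between / df_between) / (ss_within / df_within)
    return f, df_between, df_within


def p_value_three_groups(f, df_between, df_within):
    """Tail probability P(F > f); closed form valid ONLY for df_between == 2."""
    if df_between != 2:
        raise ValueError("closed form only applies to exactly three groups")
    return (1 + 2 * f / df_within) ** (-df_within / 2)


data = {
    'a': [0.9, 1.0, 1.1, 1.2],
    'b': [0.8, 0.9, 1.0, 1.1],
    'c': [4.9, 5.0, 5.1, 5.2],
}
f, df_b, df_w = one_way_anova_f(*data.values())
print(f, p_value_three_groups(f, df_b, df_w))
```

    A large F means the group means are spread out far more than the scatter within each group would explain, which is why the p-value is so small here.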
