Is seaborn confidence interval computed correctly?

孤城傲影 2021-02-04 12:09

First, I must admit that my statistics knowledge is rusty at best: even when it was shining new, it's not a discipline I particularly liked, which means I had a hard time makin

2 answers
  • 2021-02-04 12:26

    Your calculation using the Wikipedia formula is completely right. Seaborn just uses a different method, bootstrapping: https://en.wikipedia.org/wiki/Bootstrapping_(statistics). It is well described by Dragicevic [1]:

    [It] consists of generating many alternative datasets from the experimental data by randomly drawing observations with replacement. The variability across these datasets is assumed to approximate sampling error and is used to compute so-called bootstrap confidence intervals. [...] It is very versatile and works for many kinds of distributions.

    In Seaborn's source code, a barplot uses estimate_statistic, which bootstraps the data and then computes the confidence interval on the bootstrapped values:

    >>> import numpy as np
    >>> import seaborn as sb
    >>> sb.utils.ci(sb.algorithms.bootstrap(np.arange(100)))
    array([43.91, 55.21025])
    

    The result is consistent with your calculation. The exact bounds vary slightly from run to run because the resampling is random.
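To make the comparison concrete, here is a minimal sketch in plain NumPy that puts a percentile bootstrap next to a normal-approximation interval. It is not seaborn's implementation: the helper names `bootstrap_ci` and `normal_ci`, the resample count, and the fixed seed are all made up for illustration, and since the question text is cut off, the "Wikipedia formula" is assumed here to be mean ± 1.96 × standard error.

```python
import numpy as np


def bootstrap_ci(data, n_boot=10_000, level=95, seed=0):
    """Percentile bootstrap CI for the mean (hypothetical helper, not seaborn's code)."""
    rng = np.random.default_rng(seed)
    data = np.asarray(data)
    # Each row of idx is one resample of the data, drawn with replacement.
    idx = rng.integers(0, len(data), size=(n_boot, len(data)))
    boot_means = data[idx].mean(axis=1)
    # Take the central `level` percent of the bootstrapped means.
    lower, upper = np.percentile(boot_means, [50 - level / 2, 50 + level / 2])
    return lower, upper


def normal_ci(data, z=1.96):
    """Normal-approximation CI: mean +/- z * standard error (assumed formula)."""
    data = np.asarray(data, dtype=float)
    mean = data.mean()
    sem = data.std(ddof=1) / np.sqrt(len(data))
    return mean - z * sem, mean + z * sem


data = np.arange(100)
boot_lo, boot_hi = bootstrap_ci(data)
norm_lo, norm_hi = normal_ci(data)
print(f"bootstrap: [{boot_lo:.2f}, {boot_hi:.2f}]")
print(f"normal:    [{norm_lo:.2f}, {norm_hi:.2f}]")
```

Under these assumptions the two intervals should land close to each other, which is why the hand calculation and the plotted error bars agree without matching digit for digit.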

    [1] Dragicevic, P. (2016). Fair statistical communication in HCI. In Modern Statistical Methods for HCI (pp. 291-330). Springer, Cham.

  • 2021-02-04 12:33

    You need to check the code that computes the percentiles. The seaborn ci code you posted simply computes percentile limits: the interval is centered on the 50th percentile (the median) and covers a 95% range by default. The actual mean, the standard deviation, etc. will appear in the percentiles routine.
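As a rough illustration of "percentile limits", the sketch below computes the bounds of a central range around the median. The function name `percentile_interval` and its `which` parameter are hypothetical stand-ins chosen for this example, not a quote of seaborn's code.

```python
import numpy as np


def percentile_interval(a, which=95):
    """Return the percentile limits of the central `which` percent of `a`."""
    # Centered on the 50th percentile: 2.5 and 97.5 for the default of 95.
    lower, upper = 50 - which / 2, 50 + which / 2
    return np.percentile(a, [lower, upper])


values = np.arange(101)
limits = percentile_interval(values)
print(limits)  # [ 2.5 97.5]
```

Applied to raw observations, as here, these limits only describe the spread of the data. They become a confidence interval for a statistic when the input is a set of bootstrapped estimates of that statistic.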
