How does pandas calculate quartiles?

Submitted by 十年热恋 on 2019-12-11 19:27:22

Question


I have a very simple dataframe:

import pandas as pd
df = pd.DataFrame([5,7,10,15,19,21,21,22,22,23,23,23,23,23,24,24,24,24,25], columns=['val'])

df.median() = 23, which is right: of the 19 values in the list, 23 is the 10th (9 values before it and 9 values after it).

I tried to calculate the 1st and 3rd quartiles as:

df.quantile([.25, .75])

         val
0.25    20.0
0.75    23.5

I would have expected the 1st quartile to be 19, the middle of the 9 values below the median, but as you can see above, Python says it is 20. Similarly, for the 3rd quartile, the fifth number counting from right to left is 24, but Python shows 23.5.

How does pandas calculate quartiles?

The original question is from the following link: https://www.khanacademy.org/math/statistics-probability/summarizing-quantitative-data/box-whisker-plots/a/identifying-outliers-iqr-rule


Answer 1:


Python doesn't compute the quantile; pandas does. Take a look at the documentation: https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.quantile.html It actually uses numpy's percentile function: https://docs.scipy.org/doc/numpy/reference/generated/numpy.percentile.html#numpy.percentile
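
As a quick check of that relationship, here is a small sketch that runs both functions on the data from the question. It only shows that the two agree on this input with their default settings; the variable names are made up for illustration. Note that pandas takes quantiles as fractions in [0, 1], while numpy.percentile takes percentages in [0, 100].

```python
import numpy as np
import pandas as pd

# Data from the question (already sorted).
values = [5, 7, 10, 15, 19, 21, 21, 22, 22, 23,
          23, 23, 23, 23, 24, 24, 24, 24, 25]
df = pd.DataFrame(values, columns=['val'])

# pandas: quantiles given as fractions.
pandas_result = df['val'].quantile([0.25, 0.75]).tolist()

# numpy: the same points given as percentages.
numpy_result = np.percentile(values, [25, 75]).tolist()

print(pandas_result)  # [20.0, 23.5]
print(numpy_result)   # [20.0, 23.5]
```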




Answer 2:


It uses linear interpolation by default. Here's how to use nearest instead:

df['val'].quantile([0.25, 0.75], interpolation='nearest')

Out:
0.25    19
0.75    24

More info from the official documentation on how the interpolation parameter works:

    This optional parameter specifies the interpolation method to use,
    when the desired quantile lies between two data points `i` and `j`:

    * linear: `i + (j - i) * fraction`, where `fraction` is the
      fractional part of the index surrounded by `i` and `j`.
    * lower: `i`.
    * higher: `j`.
    * nearest: `i` or `j` whichever is nearest.
    * midpoint: (`i` + `j`) / 2.

https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.quantile.html
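
To see where 20.0 and 23.5 come from, the linear rule quoted above can be worked through by hand. The sketch below is plain Python, not pandas' own code, and it assumes the fractional index is (n - 1) * q on the sorted data, which reproduces the numbers in the question. The helper name linear_quantile is hypothetical.

```python
import math

def linear_quantile(sorted_values, q):
    """Quantile via linear interpolation between neighbouring data points.

    Assumption: the fractional 0-based index is (n - 1) * q.
    """
    position = (len(sorted_values) - 1) * q
    lower = math.floor(position)      # index of data point i
    upper = math.ceil(position)       # index of data point j
    fraction = position - lower       # fractional part of the index
    i, j = sorted_values[lower], sorted_values[upper]
    return i + (j - i) * fraction

values = [5, 7, 10, 15, 19, 21, 21, 22, 22, 23,
          23, 23, 23, 23, 24, 24, 24, 24, 25]

# q = 0.25: position = 18 * 0.25 = 4.5, between 19 and 21.
q1 = linear_quantile(values, 0.25)
# q = 0.75: position = 18 * 0.75 = 13.5, between 23 and 24.
q3 = linear_quantile(values, 0.75)

print(q1)  # 20.0
print(q3)  # 23.5
```

With 19 values, the 25% position falls halfway between the 5th value (19) and the 6th value (21), so linear interpolation gives 20 rather than 19. The 'lower', 'higher' and 'midpoint' options would pick 19, 21 and 20 for that same pair.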



Source: https://stackoverflow.com/questions/55009203/how-does-pandas-calculate-quartiles
