Pandas describe vs scipy.stats percentileofscore with NaN?


被撕碎了的回忆 2021-01-29 07:54

I'm running into a strange situation where pd.describe gives me percentile markers that disagree with scipy.stats percentileofscore. I think the cause is NaNs in the data.

My df is:

2 answers

孤独总比滥情好 (original poster)

2021-01-29 07:56

The answer is very simple.

There is no universally accepted formula for computing percentiles, in particular when your data contains ties or when it cannot be split evenly into equal-size buckets.

For instance, have a look at the R documentation for quantile, which lists nine different types of formulas: https://stat.ethz.ch/R-manual/R-devel/library/stats/html/quantile.html

In the end, it comes down to understanding which formula each function uses and whether the differences are big enough to be a problem in your case.
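To make this concrete, here is a minimal pure-Python sketch. It is not the pandas or scipy implementation: the helper names and the sample data are made up for illustration. It compares two common quantile conventions (linear interpolation, which is R's type 7, and nearest rank, which is R's type 1) and then shows how a percentile rank shifts if NaNs are counted in the denominator instead of being dropped. Whether a particular library drops NaNs or counts them depends on the function and version, so treat the second part as a hypothetical illustration of the effect, not a description of either library.

```python
import math


def quantile_linear(values, q):
    """Linear interpolation between order statistics (R type 7)."""
    xs = sorted(values)
    pos = q * (len(xs) - 1)          # fractional index into the sorted data
    lo, hi = math.floor(pos), math.ceil(pos)
    return xs[lo] + (pos - lo) * (xs[hi] - xs[lo])


def quantile_nearest_rank(values, q):
    """Nearest rank: the ceil(q * n)-th smallest value (R type 1)."""
    xs = sorted(values)
    rank = max(1, math.ceil(q * len(xs)))
    return xs[rank - 1]


def percent_at_or_below(values, score):
    """Percentage of entries <= score; a NaN entry never compares as <=."""
    count = sum(1 for x in values if x <= score)
    return 100.0 * count / len(values)


# Hypothetical sample data with one missing value.
data = [1.0, 2.0, 3.0, 4.0, float("nan")]
clean = [x for x in data if not math.isnan(x)]

# Same data, same q, two different medians.
linear_q = quantile_linear(clean, 0.5)
nearest_q = quantile_nearest_rank(clean, 0.5)

# Same score, two different ranks, depending on how NaN is treated.
pct_dropna = percent_at_or_below(clean, 3.0)    # NaN dropped first
pct_with_nan = percent_at_or_below(data, 3.0)   # NaN counted in the denominator

print(linear_q, nearest_q)        # 2.5 2.0
print(pct_dropna, pct_with_nan)   # 75.0 60.0
```

So when two tools disagree, check two things: which interpolation rule each one applies, and whether each one removes missing values before counting.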
