I'm having a weird situation where df.describe() is giving me percentile markers that disagree with scipy.stats.percentileofscore, because of NaNs, I think.
My df is:
The answer is very simple.
There is no universally accepted formula for computing percentiles, in particular when your data contains ties or when it cannot be divided evenly into equal-size buckets.
For instance, have a look at the documentation for the quantile function in R, which lists nine different types of formulas: https://stat.ethz.ch/R-manual/R-devel/library/stats/html/quantile.html
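To make this concrete, here is a small sketch using only Python's standard library. The data is made up for illustration, and the two methods shown are just the two conventions that statistics.quantiles happens to offer, not the ones pandas or scipy use.

```python
import statistics

# Made-up sample, for illustration only.
data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

# Same data, same quartiles requested, two different conventions.
exclusive = statistics.quantiles(data, n=4, method="exclusive")
inclusive = statistics.quantiles(data, n=4, method="inclusive")

print("exclusive:", exclusive)
print("inclusive:", inclusive)
```

The median agrees under both methods, but the first and third quartiles do not (2.75 and 8.25 versus 3.25 and 7.75), even though neither result is wrong.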
In the end, it comes down to understanding which formula is used and whether the differences are big enough to be a problem in your case.
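A rough way to check this for pandas and scipy is sketched below. The series is hypothetical (it is not the DataFrame from the question), and dropping the NaNs up front is an assumption made here so that both libraries see exactly the same values: describe() skips NaNs on its own, while scipy's handling of them may depend on the version and options you use.

```python
import numpy as np
import pandas as pd
from scipy import stats

# Hypothetical data with one missing value; not the asker's DataFrame.
s = pd.Series([1.0, 2.0, 3.0, 4.0, np.nan])

# Assumption: remove NaNs explicitly so both libraries work on the same values.
clean = s.dropna().to_numpy()

# pandas: the value sitting at the 50th percentile
# (Series.quantile uses linear interpolation by default).
median = pd.Series(clean).quantile(0.5)

# scipy: the percentile rank of the score 2.0 under each available `kind`.
ranks = {
    kind: stats.percentileofscore(clean, 2.0, kind=kind)
    for kind in ("rank", "weak", "strict", "mean")
}

print("pandas 50% value:", median)
for kind, pct in ranks.items():
    print(f"percentileofscore(2.0, kind={kind!r}):", pct)
```

Note that the two functions answer opposite questions (value at a given percentile versus percentile of a given value), and that the `kind` argument alone changes scipy's answer, so a mismatch with describe() is not necessarily caused by the NaNs.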