pd.qcut - ValueError: Bin edges must be unique

六眼飞鱼酱① submitted on 2019-12-10 11:38:57

Question


My data is here.

q = pd.qcut(df['loss_percent'], 10)

ValueError: Bin edges must be unique: array([ 0.38461538,  0.38461538,  0.46153846,  0.46153846,  0.53846154,
        0.53846154,  0.53846154,  0.61538462,  0.69230769,  0.76923077,  1.        ])

I have read through why-use-pandas-qcut-return-valueerror, but I am still confused.

I imagine that one of my values has a high frequency of occurrence and that is breaking qcut.

First, how do I determine whether that is indeed the case, and which value is the problem? Second, what kind of solution is appropriate given my data?
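One way to check the suspicion above is to look at the value frequencies and at the quantile edges that a 10-bin qcut would have to use. The sketch below runs on made-up data, since the real df is not shown here; the series `s` and its values are assumptions standing in for df['loss_percent'].

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for df['loss_percent']: a few values dominate the data.
s = pd.Series(
    [0.38461538] * 30 + [0.46153846] * 25 + [0.53846154] * 30
    + [0.61538462, 0.69230769, 0.76923077, 1.0] * 5
)

# Share of rows taken by each distinct value, most frequent first.
counts = s.value_counts(normalize=True)

# The 11 quantiles (0%, 10%, ..., 100%) that delimit 10 equal-frequency bins.
edges = s.quantile(np.linspace(0, 1, 11))

# Values that appear as more than one edge; these are the ones that break qcut.
repeated = edges[edges.duplicated(keep=False)].unique()

print(counts.head())
print(repeated)
```

Any value whose share in `counts` exceeds 1/10 spans more than one decile, so it is a likely candidate to show up in `repeated`.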


Answer 1:


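The solution from https://stackoverflow.com/a/36883735/2336654 works here: rank the values as percentiles first, then assign each rank to one of n fixed bins, so repeated values can no longer produce duplicate bin edges.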

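import pandas as pd

def pct_rank_qcut(series, n):
    # Fixed edges 0, 1/n, ..., 1 are unique by construction
    edges = pd.Series([float(i) / n for i in range(n + 1)])
    # Position of the first edge >= x, which is the bin number (1..n)
    f = lambda x: (edges >= x).argmax()
    # Percentile ranks lie in (0, 1]; tied values share a rank and a bin
    return series.rank(pct=True).apply(f)

q = pct_rank_qcut(df.loss_percent, 10)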
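For comparison, two other commonly used workarounds can be sketched as follows. Neither comes from the answer above. The `duplicates` argument of pd.qcut is only available in newer pandas releases (0.20 and later, as far as I know), and the series `s` is made-up data, not the asker's.

```python
import pandas as pd

# Hypothetical data: the value 0.2 fills more than half of the series.
s = pd.Series([0.2] * 6 + [0.4, 0.6, 0.8, 1.0])

# Option A: let qcut drop repeated edges. This yields fewer than 10 bins,
# but identical values always land in the same bin.
a = pd.qcut(s, 10, duplicates='drop')

# Option B: break ties by order of appearance before cutting. This yields
# 10 equally sized bins, but identical values may be split across bins.
b = pd.qcut(s.rank(method='first'), 10, labels=False)

print(a.cat.categories)
print(b.tolist())
```

Which option fits depends on whether equal bin sizes or keeping tied values together matters more for the analysis.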


Source: https://stackoverflow.com/questions/41475398/pd-qcut-valueerror-bin-edges-must-be-unique
