Applying pandas cut within a groupby

Submitted by 故事扮演 on 2021-02-18 18:13:39

Question


I'm trying to create bins (A_bin) within a DataFrame based on one column (A), and then create unique bins (B_bin) based on another column (B) within each of the original bins.

import pandas as pd

df = pd.DataFrame({'A': [4.5, 5.1, 5.9, 6.3, 6.7, 7.5, 7.9, 8.5, 8.9, 9.3, 9.9, 10.3, 10.9, 11.1, 11.3, 11.9],
                   'B': [3.2, 2.7, 2.2, 3.3, 2.1, 1.8, 1.4, 1.0, 1.8, 2.4, 1.2, 0.8, 1.4, 0.6, 0, -0.4]})
df['A_bin'] = pd.cut(df['A'], bins=3)
df['B_bin'] = df.groupby('A_bin')['B'].transform(lambda x: pd.cut(x, bins=2))

This results in:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-341-5742137b7574> in <module>()
      2                         'B': [3.2, 2.7, 2.2, 3.3, 2.1, 1.8, 1.4, 1.0, 1.8,2.4, 1.2, 0.8, 1.4, 0.6, 0, -0.4]})
      3 df['A_bin'] = pd.cut(df['A'], bins=3)
----> 4 df['B_bin'] = df.groupby('A_bin')['B'].transform(lambda x: pd.cut(x, bins=2))

C:\Users\ddecker1\AppData\Local\Continuum\Anaconda3\lib\site-packages\pandas\core\groupby.py in transform(self, func, *args, **kwargs)
   2761 
   2762             indexer = self._get_index(name)
-> 2763             result[indexer] = res
   2764 
   2765         result = _possibly_downcast_to_dtype(result, dtype)

ValueError: could not convert string to float: '(2.0988, 2.7]'

It looks like it's trying to do the right thing, but I'm not sure why it's trying to convert the string to float.


Answer 1:


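It's a kind of magic: select B with double brackets ([['B']]) so that transform runs on a one-column DataFrame instead of a Series. The result is a DataFrame, so take its B column if you want to assign it to df['B_bin']: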

df.groupby('A_bin')[['B']].transform(lambda x: pd.cut(x, bins=2))
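
If the behaviour of transform differs in your pandas version, a more explicit alternative is to loop over the groups yourself. The following is only a sketch of that idea, not the answerer's method: it assumes that string labels are acceptable for B_bin (each group gets its own interval edges, so the labels cannot share one categorical dtype), and the name pieces is just an illustrative helper.

```python
import pandas as pd

df = pd.DataFrame({'A': [4.5, 5.1, 5.9, 6.3, 6.7, 7.5, 7.9, 8.5, 8.9, 9.3, 9.9, 10.3, 10.9, 11.1, 11.3, 11.9],
                   'B': [3.2, 2.7, 2.2, 3.3, 2.1, 1.8, 1.4, 1.0, 1.8, 2.4, 1.2, 0.8, 1.4, 0.6, 0, -0.4]})
df['A_bin'] = pd.cut(df['A'], bins=3)

# Cut B separately inside each A_bin group and store the interval labels as strings.
pieces = []
for _, group in df.groupby('A_bin', observed=True):
    pieces.append(pd.cut(group['B'], bins=2).astype(str))

# Each piece keeps the original row index, so the concatenated result lines up with df.
df['B_bin'] = pd.concat(pieces).reindex(df.index)

print(df[['A', 'A_bin', 'B', 'B_bin']])
```

Each of the three A_bin groups ends up with its own two B_bin labels, computed from the range of B inside that group only.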


Source: https://stackoverflow.com/questions/42265471/applying-pandas-cut-within-a-groupby
