GroupBy pandas DataFrame and select most common value


梦谈多话 2020-11-22 07:59

I have a data frame with three string columns. I know that only one value in the third column is valid for each combination of the first two. To clean the data I have to

10 answers

悲&欢浪女 (OP)

2020-11-22 08:36
If you don't want to include NaN values, using Counter is much faster than pd.Series.mode or pd.Series.value_counts().index[0]:
```
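from collections import Counter

def get_most_common(srs):
    # Count every value in the group and return the most frequent one.
    x = list(srs)
    my_counter = Counter(x)
    return my_counter.most_common(1)[0][0]

# `col` is the column (or list of columns) to group by.
df.groupby(col).agg(get_most_common)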
```
This should work. It will give wrong results when you have NaN values, though, because each NaN is counted separately.
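One way around the NaN caveat, as a rough sketch rather than a tested recipe, is to drop missing values before counting. The function name `most_common_ignoring_nan`, the sample frame, and its column names (`country`, `city`, `short_name`) are made up for illustration; returning `None` for an all-NaN group is also just an assumption about what you would want there.

```python
from collections import Counter

import pandas as pd


def most_common_ignoring_nan(srs):
    # Drop NaN first so missing values are not counted at all.
    values = srs.dropna().tolist()
    if not values:
        return None  # assumption: an all-NaN group yields None
    return Counter(values).most_common(1)[0][0]


# Hypothetical sample data: two key columns and one column to clean.
df = pd.DataFrame({
    "country": ["US", "US", "US", "UK", "UK"],
    "city": ["NY", "NY", "NY", "London", "London"],
    "short_name": ["New", "New York", "New York", "Lon", None],
})

# Most common short_name for each (country, city) combination.
result = df.groupby(["country", "city"])["short_name"].agg(most_common_ignoring_nan)
print(result)
```

The extra `dropna()` costs a little time per group, so it is probably only worth it when the column can actually contain missing values.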