Mapping values in place (for example with Gender) from string to int in Pandas dataframe [duplicate]


难免孤独 2021-01-23 14:45

3 answers

后悔当初 (original poster)

2021-01-23 15:23
My instinct would have been to use .map(), but I compared your solution with .map() on a DataFrame of 1500 random male/female values.
```
%timeit df_base['Sex_new'] = df_base['Sex'].map({'male': 0,'female': 1})
1000 loops, best of 3: 653 µs per loop
```
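For anyone who wants to reproduce the timings, here is a minimal sketch of how a comparable test frame might be built. The answer does not show its setup, so the random generator, the seed, and the construction of `df_base` are assumptions; only the column names `Sex` and `Sex_new` and the 1500-row size come from the text above.

```python
import numpy as np
import pandas as pd

# Assumed setup: the original answer does not show how df_base was created.
rng = np.random.default_rng(0)  # the seed is arbitrary, chosen only for repeatability

# 1500 random 'male'/'female' values, matching the size described in the answer.
df_base = pd.DataFrame({'Sex': rng.choice(['male', 'female'], size=1500)})

# The same assignment that the timing above measures.
df_base['Sex_new'] = df_base['Sex'].map({'male': 0, 'female': 1})
```

Absolute timings will vary with the machine and the pandas version, so only the relative ordering of the approaches is worth comparing.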
Edited based on coldspeed's comment, and because reassigning the result makes for a fairer comparison with the others:
```
%timeit df_base['Sex_new'] = df_base['Sex'].replace(['male','female'],[0,1])
1000 loops, best of 3: 968 µs per loop
```
So .replace() is actually slower than .map()!

~~So based on this example, your 'shoddy' solution seems faster than .map()...~~

Edit

pygo's solution:
```
%timeit df_base['Sex_new'] = np.where(df_base['Sex'] == 'male', 0, 1)
1000 loops, best of 3: 331 µs per loop
```
So np.where() is faster still!

Jezrael's solution with .astype(int):
```
%timeit df_base['Sex_new'] = (df_base['Sex'] == 'female').astype(int)
1000 loops, best of 3: 388 µs per loop
```
So this is also faster than .map() and .replace(), though slightly slower than np.where().
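Speed aside, the four approaches only agree when the column contains nothing but 'male' and 'female'. The sketch below checks that on a small hand-made Series and then shows how the results diverge on an unexpected value. The sample data and variable names are made up for illustration, and the Series is created with `dtype=object` so that the behaviour is the same across pandas versions.

```python
import numpy as np
import pandas as pd

# Illustrative data only; object dtype is the classic representation of string columns.
s = pd.Series(['male', 'female', 'female', 'male'], dtype=object)

via_map = s.map({'male': 0, 'female': 1})
via_replace = s.replace(['male', 'female'], [0, 1])
via_where = np.where(s == 'male', 0, 1)
via_astype = (s == 'female').astype(int)

# On clean data all four produce the same codes.
same = (
    via_map.tolist()
    == via_replace.tolist()
    == via_where.tolist()
    == via_astype.tolist()
)

# With a value outside the mapping, the approaches no longer agree.
dirty = pd.Series(['male', 'female', 'unknown'], dtype=object)
map_dirty = dirty.map({'male': 0, 'female': 1})   # unmapped value becomes NaN
where_dirty = np.where(dirty == 'male', 0, 1)     # anything that is not 'male' becomes 1
astype_dirty = (dirty == 'female').astype(int)    # anything that is not 'female' becomes 0
```

So the faster boolean-based variants silently fold unknown or missing values into one of the two classes, while .map() leaves them visible as NaN. Which behaviour is preferable depends on how clean the column is.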