Question
I'm trying to create a new column in a dataframe that labels animals that are domesticated with a 1. I'm using a for loop, but for some reason, the loop only picks up the last item in the pets list. dog, cat, and gerbil should all be assigned a 1 under the domesticated column. Anyone have a fix for this or a better approach?
import numpy as np
import pandas as pd

df = pd.DataFrame(
    {'creature': ['dog', 'cat', 'gerbil', 'mouse', 'donkey']}
)
pets = ['dog', 'cat', 'gerbil']
for pet in pets:
    df['domesticated'] = np.where(df['creature'] == pet, 1, 0)
df
Answer 1:
You are setting every non-gerbil entry to 0 in your last loop iteration. That is, when pet is gerbil in the last iteration, ALL entries that are not equal to gerbil are set to 0. This includes the entries that are dog or cat, so their earlier 1s are overwritten. You should check against all values in pets at once. Try this:
df['domesticated'] = df['creature'].apply(lambda x: 1 if x in pets else 0)
If you want to stick with np.where:
df['domesticated'] = np.where(df['creature'].isin(pets), 1, 0)
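To make the difference concrete, here is a minimal self-contained sketch that runs the original loop next to both replacements. The variable names loop_result, via_apply, and via_where are illustrative and do not appear in the original post.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'creature': ['dog', 'cat', 'gerbil', 'mouse', 'donkey']})
pets = ['dog', 'cat', 'gerbil']

# The original loop: each iteration rebuilds the whole result,
# so only the comparison against the last pet ('gerbil') survives.
for pet in pets:
    loop_result = np.where(df['creature'] == pet, 1, 0)

# Both replacements test membership against the whole list at once.
via_apply = df['creature'].apply(lambda x: 1 if x in pets else 0)
via_where = np.where(df['creature'].isin(pets), 1, 0)

print(loop_result.tolist())  # [0, 0, 1, 0, 0]
print(via_apply.tolist())    # [1, 1, 1, 0, 0]
print(via_where.tolist())    # [1, 1, 1, 0, 0]
```

The isin version is vectorized, so it is usually the better choice on large frames; the apply version calls the lambda once per row.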
Answer 2:
The problem is that every loop iteration overwrites the results of the previous one.
df['domesticated'] = df['creature'].isin(pets).astype(int)
  creature  domesticated
0      dog             1
1      cat             1
2   gerbil             1
3    mouse             0
4   donkey             0
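If the loop itself has to stay, one option is to accumulate matches across iterations instead of overwriting them. This is only a sketch of that idea, not something either answer proposed, and the name mask is illustrative.

```python
import pandas as pd

df = pd.DataFrame({'creature': ['dog', 'cat', 'gerbil', 'mouse', 'donkey']})
pets = ['dog', 'cat', 'gerbil']

# Start with an all-False mask and OR in each pet's matches,
# so results from earlier iterations are kept rather than reset.
mask = pd.Series(False, index=df.index)
for pet in pets:
    mask |= df['creature'] == pet

df['domesticated'] = mask.astype(int)
print(df['domesticated'].tolist())  # [1, 1, 1, 0, 0]
```

This gives the same column as the isin approach, just with more code, so it is mainly useful for seeing what the original loop was missing.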
Source: https://stackoverflow.com/questions/55271517/for-loop-using-np-where