I have a DataFrame, and I want to replace the values in a particular column that exceed a threshold with zero. I thought this would achieve it:
df[
Your original DataFrame does not update because chained indexing may cause you to modify a copy rather than a view of your DataFrame. The docs give this advice:
When setting values in a pandas object, care must be taken to avoid what is called chained indexing.
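To make the failure mode concrete, here is a minimal sketch of chained indexing leaving a frame untouched. The sample values are hypothetical and the chained line is only an assumed illustration of the pattern, not necessarily the exact code from the question.

```python
import warnings

import pandas as pd

# Hypothetical sample data; the column name follows the answer's examples.
df = pd.DataFrame({"my_channel": [5000, 25000, 15000, 30000]})

with warnings.catch_warnings():
    # pandas warns about this pattern; silenced here only to keep the demo quiet.
    warnings.simplefilter("ignore")
    # Chained indexing: df[mask] builds a new object, and the
    # assignment lands on that temporary object instead of df.
    df[df["my_channel"] > 20000]["my_channel"] = 0

# The values above 20000 are still present in the original frame.
chained_result = df["my_channel"].tolist()
print(chained_result)
```

Because the Boolean selection produces a separate object, the write never reaches df, which is why a single indexing operation is recommended instead.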
You have a few alternatives:
loc may be used for setting values and supports Boolean masks:
df.loc[df['my_channel'] > 20000, 'my_channel'] = 0
You can assign to your series:
df['my_channel'] = df['my_channel'].mask(df['my_channel'] > 20000, 0)
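Or you can update your series in place, though this is itself a chained operation: it only works when df['my_channel'] returns a view, and with Copy-on-Write enabled (the default from pandas 3.0) the DataFrame is left unchanged, so the assignment form above is safer: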
df['my_channel'].mask(df['my_channel'] > 20000, 0, inplace=True)
You can also use NumPy's np.where, which returns 0 where the condition holds and the original value otherwise; however, the first two solutions are cleaner since they explicitly change only the specified values.
df['my_channel'] = np.where(df['my_channel'] > 20000, 0, df['my_channel'])
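As a rough check, the loc, mask, and np.where alternatives can be run side by side on a small frame. This is only a sketch: the sample values and the extra column named other are hypothetical, and 20000 is the threshold used above.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; "other" is included to show it stays untouched.
data = {"my_channel": [5000, 25000, 15000, 30000], "other": [1, 2, 3, 4]}

# 1. loc with a Boolean mask: one indexing operation on the frame itself.
df_loc = pd.DataFrame(data)
df_loc.loc[df_loc["my_channel"] > 20000, "my_channel"] = 0

# 2. Series.mask: replace where the condition is True, then assign back.
df_mask = pd.DataFrame(data)
df_mask["my_channel"] = df_mask["my_channel"].mask(df_mask["my_channel"] > 20000, 0)

# 3. np.where: rebuild the whole column, keeping values that fail the condition.
df_where = pd.DataFrame(data)
df_where["my_channel"] = np.where(
    df_where["my_channel"] > 20000, 0, df_where["my_channel"]
)

print(df_loc["my_channel"].tolist())
print(df_mask["my_channel"].tolist())
print(df_where["my_channel"].tolist())
```

All three leave the same values in my_channel; the difference is that loc writes only the selected rows, while the mask and np.where forms replace the entire column.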