This is rather similar to this question, but with one key difference: I'm selecting the data I want to change not by its index but by some criteria.
If th
Old question, but I'm surprised nobody mentioned numpy's where() function.
(Older pandas versions exposed it as pd.np.where, but the pd.np alias was deprecated in pandas 1.0 and has since been removed, so import numpy directly.)
In this case the code would be:
import numpy as np
d.sales = np.where(d.sales == 24, 100, d.sales)
To my knowledge, this is one of the fastest ways to conditionally change data across a series.
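A self-contained sketch of the np.where approach, in case it helps. The frame below is reconstructed from the tables shown elsewhere in this thread, so treat the exact starting values as an assumption rather than the asker's real data.

```python
import numpy as np
import pandas as pd

# Sample frame rebuilt from the tables in this thread (assumed starting values).
d = pd.DataFrame({
    "day": ["sat", "sun"] * 4,
    "flavour": (["strawberry"] * 2 + ["banana"] * 2) * 2,
    "sales": [10, 12, 22, 23, 11, 13, 23, 24],
    "year": [2008] * 4 + [2009] * 4,
})

# np.where(condition, value_if_true, value_if_false) is evaluated element-wise:
# rows where sales == 24 get 100, every other row keeps its current value.
d["sales"] = np.where(d["sales"] == 24, 100, d["sales"])

print(d["sales"].tolist())  # [10, 12, 22, 23, 11, 13, 23, 100]
```

Note that np.where returns a plain numpy array, so the whole column is rebuilt rather than modified in place.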
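There are many ways to do that.
In [7]: d.sales[d.sales==24] = 100  # chained assignment: newer pandas may warn or leave d unchanged; the .loc form below is safer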
In [8]: d
Out[8]:
   day     flavour  sales  year
0  sat  strawberry     10  2008
1  sun  strawberry     12  2008
2  sat      banana     22  2008
3  sun      banana     23  2008
4  sat  strawberry     11  2009
5  sun  strawberry     13  2009
6  sat      banana     23  2009
7  sun      banana    100  2009
In [26]: d.loc[d.sales == 12, 'sales'] = 99
In [27]: d
Out[27]:
   day     flavour  sales  year
0  sat  strawberry     10  2008
1  sun  strawberry     99  2008
2  sat      banana     22  2008
3  sun      banana     23  2008
4  sat  strawberry     11  2009
5  sun  strawberry     13  2009
6  sat      banana     23  2009
7  sun      banana    100  2009
In [28]: d.sales = d.sales.replace(23, 24)
In [29]: d
Out[29]:
   day     flavour  sales  year
0  sat  strawberry     10  2008
1  sun  strawberry     99  2008
2  sat      banana     22  2008
3  sun      banana     24  2008
4  sat  strawberry     11  2009
5  sun  strawberry     13  2009
6  sat      banana     24  2009
7  sun      banana    100  2009
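For anyone who wants to replay the session above as a script, here is a minimal sketch. The starting frame is inferred from the printed output (the 24 in the last row is an assumption based on the first assignment), and the first step uses .loc instead of the chained d.sales[...] form.

```python
import pandas as pd

# Starting frame inferred from the outputs above (assumed, not the asker's original).
d = pd.DataFrame({
    "day": ["sat", "sun"] * 4,
    "flavour": (["strawberry"] * 2 + ["banana"] * 2) * 2,
    "sales": [10, 12, 22, 23, 11, 13, 23, 24],
    "year": [2008] * 4 + [2009] * 4,
})

# 1. Boolean mask with .loc: select rows by condition and the column by label.
d.loc[d["sales"] == 24, "sales"] = 100

# 2. Same pattern with a different condition and value.
d.loc[d["sales"] == 12, "sales"] = 99

# 3. Series.replace swaps every occurrence of one value for another.
d["sales"] = d["sales"].replace(23, 24)

print(d["sales"].tolist())  # [10, 99, 22, 24, 11, 13, 24, 100]
```

The .loc form does the selection and the assignment in a single indexing operation, which is why it avoids the chained-assignment pitfalls.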
Not sure about older versions of pandas, but in 0.16 the value of a particular cell can be set based on the values of multiple columns.
Extending the answer provided by @waitingkuo, the same operation can be done with several conditions combined.
d.loc[(d.day == 'sun') & (d.flavour == 'banana') & (d.year == 2009), 'sales'] = 100
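A runnable sketch of the multi-column version follows. The sample frame is again an assumption rebuilt from the tables in this thread. Each comparison needs its own parentheses because & binds more tightly than == in Python.

```python
import pandas as pd

# Sample frame rebuilt from the tables in this thread (assumed starting values).
d = pd.DataFrame({
    "day": ["sat", "sun"] * 4,
    "flavour": (["strawberry"] * 2 + ["banana"] * 2) * 2,
    "sales": [10, 12, 22, 23, 11, 13, 23, 24],
    "year": [2008] * 4 + [2009] * 4,
})

# Combine element-wise conditions with & (and), | (or), ~ (not).
# Naming the mask keeps the .loc line short and makes it reusable.
mask = (d["day"] == "sun") & (d["flavour"] == "banana") & (d["year"] == 2009)

# Only the rows where all three conditions hold are updated.
d.loc[mask, "sales"] = 100

print(int(mask.sum()))       # 1
print(d["sales"].tolist())   # [10, 12, 22, 23, 11, 13, 23, 100]
```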