I\'d like to replace bad values in a column of a dataframe by NaN\'s.
mydata = {\'x\' : [10, 50, 18, 32, 47, 20], \'y\' : [\'12\', \'11\', \'N/A\', \'13\', \
just use replace
:
In [106]:
df.replace('N/A',np.NaN)
Out[106]:
x y
0 10 12
1 50 11
2 18 NaN
3 32 13
4 47 15
5 20 NaN
What you're trying is called chain indexing: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
You can use loc
to ensure you operate on the original dF:
In [108]:
df.loc[df['y'] == 'N/A','y'] = np.nan
df
Out[108]:
x y
0 10 12
1 50 11
2 18 NaN
3 32 13
4 47 15
5 20 NaN
df.loc[df.y == 'N/A',['y']] = np.nan
This solve your problem. With the double [], you are working on a copy of the DataFrame. You have to specify exact location in one call to be able to modify it.