I have a pandas DataFrame of mixed types: some columns are strings and some are numbers. I would like to replace the NaN values in string columns with '.', and the NaN values in flo
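Define a function:
import numpy as np

def myfillna(series):
    # Float columns get 0, object (string) columns get '.'
    if series.dtype == np.dtype(float):
        return series.fillna(0)
    elif series.dtype == np.dtype(object):
        return series.fillna('.')
    else:
        # Leave every other dtype untouched
        return series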
You can add other elif branches if you want to fill a column of a different dtype in some other way. Now apply this function over all columns of the DataFrame:
df = df.apply(myfillna)
Assigning the result back to df has the same effect as filling in place.
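A minimal sketch of the approach above on a small frame. The sample data is made up for illustration, and the dtype checks use the `pandas.api.types` helpers instead of the direct dtype comparison, which is a swapped-in variant rather than the answer's exact code. It also shows that `apply` returns a new frame, so the original keeps its NaN values until the result is assigned back.

```python
import numpy as np
import pandas as pd
from pandas.api.types import is_float_dtype, is_object_dtype, is_string_dtype

def myfillna(series):
    # Float columns get 0, text columns get '.', anything else is returned as-is
    if is_float_dtype(series):
        return series.fillna(0)
    if is_object_dtype(series) or is_string_dtype(series):
        return series.fillna('.')
    return series

# Hypothetical sample data, not taken from the question
df = pd.DataFrame({'A': [1.0, np.nan], 'Name': ['Jack', np.nan]})

filled = df.apply(myfillna)  # df itself is not modified
print(filled)
```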
There is a simpler way that can be done in one line:
df.fillna({'Name':0,'City':0},inplace=True)
Not a huge improvement, but if you have 100 columns, writing only the column names plus ':0' is much faster than copying and pasting everything 100 times.
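With many columns, the dictionary itself can be built from a list of names instead of typed out. A small sketch, assuming a hypothetical list `numeric_cols` and made-up sample data:

```python
import numpy as np
import pandas as pd

# Hypothetical sample data
df = pd.DataFrame({
    'A': [1.0, np.nan],
    'B': [np.nan, 2.0],
    'Name': ['Jack', np.nan],
})

numeric_cols = ['A', 'B']  # assumed list of columns to fill with 0
fill_values = dict.fromkeys(numeric_cols, 0)  # {'A': 0, 'B': 0}

# Columns not named in the dict are left alone
df = df.fillna(fill_values)
print(df)
```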
You could use apply on your columns and check whether each one is numeric by looking at dtype.kind:
res = df.apply(lambda x: x.fillna(0) if x.dtype.kind in 'biufc' else x.fillna('.'))
print(res)
     A      B     City   Name
0  1.0   0.25  Seattle   Jack
1  2.1   0.00       SF    Sue
2  0.0   0.00       LA      .
3  4.7   4.00       OC    Bob
4  5.6  12.20        .  Alice
5  6.8  14.40        .   John
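The input frame behind this output is not shown. The values below are reconstructed from the printed result, so treat them as an assumption. With that input, the one-liner can be run end to end:

```python
import numpy as np
import pandas as pd

# Input reconstructed from the printed output (an assumption)
df = pd.DataFrame({
    'A': [1.0, 2.1, np.nan, 4.7, 5.6, 6.8],
    'B': [0.25, np.nan, np.nan, 4.0, 12.2, 14.4],
    'City': ['Seattle', 'SF', 'LA', 'OC', np.nan, np.nan],
    'Name': ['Jack', 'Sue', np.nan, 'Bob', 'Alice', 'John'],
})

# dtype.kind is one of b, i, u, f, c for bool, int, unsigned int, float, complex
res = df.apply(lambda x: x.fillna(0) if x.dtype.kind in 'biufc' else x.fillna('.'))
print(res)
```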
A much easier way is (with numpy imported as np):
dt.replace(np.nan, "NA")
If you want some other replacement, use:
dt.replace("pattern", "replaced by (new pattern)")
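One thing to watch with this approach: the same replacement is applied to every column, so a numeric column that contains NaN ends up holding a string and stops being numeric. A short sketch with made-up data:

```python
import numpy as np
import pandas as pd

# Hypothetical sample data
dt = pd.DataFrame({'A': [1.0, np.nan], 'Name': ['Jack', np.nan]})

out = dt.replace(np.nan, "NA")
print(out)
print(out.dtypes)  # column A is no longer a float column
```

If the numeric columns need to stay numeric, a per-column fill is the safer choice.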
Came across this page while looking for an answer to this problem, but didn't like the existing answers. I ended up finding something better in the DataFrame.fillna documentation, and figured I'd contribute for anyone else that happens upon this.
If you have multiple columns, but only want to replace the NaN in a subset of them, you can use:
df.fillna({'Name':'.', 'City':'.'}, inplace=True)
This also allows you to specify different replacements for each column. And if you want to go ahead and fill all remaining NaN values, you can just throw another fillna on the end. Drop inplace=True for this, since the chained call needs the first fillna to return the filled DataFrame:
df = df.fillna({'Name':'.', 'City':'.'}).fillna(0)
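As an illustration of per-column replacements followed by a catch-all fill, here is a sketch on made-up data. The replacement value 'unknown' is an assumption chosen only to show that each column can get its own value.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data
df = pd.DataFrame({
    'A': [1.0, np.nan],
    'City': ['SF', np.nan],
    'Name': [np.nan, 'Sue'],
})

# Per-column values first, then 0 for whatever NaN is left
filled = df.fillna({'Name': 'unknown', 'City': '.'}).fillna(0)
print(filled)
```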
You can either list the string columns by hand or glean them from df.dtypes. Once you have the list of string/object columns, you can call fillna on all those columns at once.
# str_cols = ['Name','City']
str_cols = df.columns[df.dtypes==object]
df[str_cols] = df[str_cols].fillna('.')
df = df.fillna(0)
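A variant of the same idea selects the numeric columns with `select_dtypes` and treats everything else as text. This is a sketch under that assumption (it would also put '.' into non-text columns such as datetimes), using made-up sample data:

```python
import numpy as np
import pandas as pd

# Hypothetical sample data
df = pd.DataFrame({
    'A': [1.0, np.nan, 4.7],
    'City': ['SF', np.nan, 'OC'],
    'Name': ['Sue', 'Bob', np.nan],
})

num_cols = df.select_dtypes(include='number').columns
other_cols = df.columns.difference(num_cols)  # assumed to be text columns

df[num_cols] = df[num_cols].fillna(0)
df[other_cols] = df[other_cols].fillna('.')
print(df)
```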