Question
I have a table which looks like this:
df_raw = pd.DataFrame(dict(A = pd.Series(['1.00','-1']), B = pd.Series(['1.0','-45.00','-'])))
      A       B
0  1.00     1.0
1    -1  -45.00
2   NaN       -
I would like to replace '-' with '0.00' using DataFrame.replace(), but it fails because of the negative values '-1' and '-45.00'.
How can I ignore the negative values and replace only '-' with '0.00'?
My code:
df_raw = df_raw.replace(['-','\*'], ['0.00','0.00'], regex=True).astype(np.float64)
Error:
ValueError: invalid literal for float(): 0.0045.00
Answer 1:
Your regex matches every - character, including the minus signs in the negative numbers:
In [48]:
df_raw.replace(['-','\*'], ['0.00','0.00'], regex=True)
Out[48]:
       A          B
0   1.00        1.0
1  0.001  0.0045.00
2    NaN       0.00
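The mangled values can be reproduced outside pandas. This is a minimal sketch using the standard library's re.sub on the column B strings. It assumes the regex replacement acts on each cell roughly the way re.sub does; the names cells and unanchored are only for illustration.

```python
import re

cells = ["1.0", "-45.00", "-"]

# An unanchored pattern replaces every "-" it finds, wherever it occurs.
unanchored = [re.sub("-", "0.00", cell) for cell in cells]
print(unanchored)  # ['1.0', '0.0045.00', '0.00']

# float() then fails on the mangled value, as in the question.
try:
    float(unanchored[1])
except ValueError as exc:
    print("ValueError:", exc)
```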
If you anchor the pattern so that it only matches a cell consisting of that single character, it works as expected:
In [47]:
df_raw.replace(['^-$'], ['0.00'], regex=True)
Out[47]:
      A       B
0  1.00     1.0
1    -1  -45.00
2   NaN    0.00
Here ^ means start of string and $ means end of string, so it will only match on that single character.
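Putting the anchored pattern together with the float conversion from the question might look like the sketch below. The names cleaned and as_float are illustrative, and the frame is rebuilt from the question so the snippet runs on its own.

```python
import numpy as np
import pandas as pd

# Same data as in the question; column A is shorter, so row 2 is NaN.
df_raw = pd.DataFrame({
    "A": pd.Series(["1.00", "-1"]),
    "B": pd.Series(["1.0", "-45.00", "-"]),
})

# Anchored pattern: only a cell that is exactly "-" is replaced.
cleaned = df_raw.replace(r"^-$", "0.00", regex=True)

# Every remaining string now parses as a number; NaN stays NaN.
as_float = cleaned.astype(np.float64)
print(as_float)
```

The negative entries keep their sign because the pattern cannot match a minus sign that is followed by digits.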
Or you can just use replace without regex=True, which only replaces cells whose whole value matches exactly:
In [29]:
df_raw.replace('-',0)
Out[29]:
      A       B
0  1.00     1.0
1    -1  -45.00
2   NaN       0
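For completeness, here is a sketch of the exact-match variant followed by the float conversion the question was aiming for. It substitutes the string '0.00' rather than the integer 0 so the column stays all strings until astype runs; that choice, and the names exact and exact_float, are assumptions for illustration.

```python
import numpy as np
import pandas as pd

df_raw = pd.DataFrame({
    "A": pd.Series(["1.00", "-1"]),
    "B": pd.Series(["1.0", "-45.00", "-"]),
})

# Without regex=True, replace() compares whole cell values,
# so "-1" and "-45.00" are left untouched.
exact = df_raw.replace("-", "0.00")

# Convert the cleaned strings to floats.
exact_float = exact.astype(np.float64)
print(exact_float)
```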
Source: https://stackoverflow.com/questions/32201222/pandas-dataframe-replace-with-regex