I have tried many times, but seems the \'replace\' can NOT work well after use \'loc\'. For example I want to replace the \'conlumn_b\' with an regex for the row that the \'
I'm going to borrow from a recent answer of mine. This technique is a general purpose strategy for updating a dataframe in place:
df.update(
df.loc[df['conlumn_a'] == 'apple', 'conlumn_b']
.replace(r'^11$', 'XXX', regex=True)
)
df
conlumn_a conlumn_b
0 apple 123
1 banana 11
2 apple XXX
3 orange 33
Note that all I did was remove the inplace=True
and instead wrapped it in the pd.DataFrame.update method.
inplace=True
works on the object that it was applied on.
When you call .loc
, you're slicing your dataframe object to return a new one.
>>> id(df)
4587248608
And,
>>> id(df.loc[df['conlumn_a'] == 'apple', 'conlumn_b'])
4767716968
Now, calling an in-place replace
on this new slice will apply the replace operation, updating the new slice itself, and not the original.
Now, note that you're calling replace
on a column of int
, and nothing is going to happen, because regular expressions work on strings.
Here's what I offer you as a workaround. Don't use regex at all.
m = df['conlumn_a'] == 'apple'
df.loc[m, 'conlumn_b'] = df.loc[m, 'conlumn_b'].replace(11, 'XXX')
df
conlumn_a conlumn_b
0 apple 123
1 banana 11
2 apple XXX
3 orange 33
Or, if you need regex based substitution, then -
df.loc[m, 'conlumn_b'] = df.loc[m, 'conlumn_b']\
.astype(str).replace('^11$', 'XXX', regex=True)
Although, this converts your column to an object column.
I think you need filter in both sides:
m = df['conlumn_a'] == 'apple'
df.loc[m,'conlumn_b'] = df.loc[m,'conlumn_b'].astype(str).replace(r'^(11+)','XXX',regex=True)
print (df)
conlumn_a conlumn_b
0 apple 123
1 banana 11
2 apple XXX
3 orange 33