Pandas how can 'replace' work after 'loc'?

终归单人心 2020-12-03 19:35

I have tried many times, but it seems that 'replace' does NOT work well after using 'loc'. For example, I want to replace the 'conlumn_b' with a regex for the row that the '

3 Answers
  • 2020-12-03 20:06

    I'm going to borrow from a recent answer of mine. This technique is a general purpose strategy for updating a dataframe in place:

    df.update(
        df.loc[df['conlumn_a'] == 'apple', 'conlumn_b']
          .replace(r'^11$', 'XXX', regex=True)
    )
    
    df
    
      conlumn_a conlumn_b
    0     apple       123
    1    banana        11
    2     apple       XXX
    3    orange        33
    

    Note that all I did was remove inplace=True and instead wrap the expression in a call to pd.DataFrame.update.
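
    For reference, here is a self-contained sketch of the same idea. The question's original frame is not shown, so the sample data below is an assumption reconstructed from the printed output, with conlumn_b holding strings so that the regex has something to match.

```python
import pandas as pd

# Assumed sample data: reconstructed from the printed output above,
# with conlumn_b stored as strings (an assumption, not from the question).
df = pd.DataFrame({
    "conlumn_a": ["apple", "banana", "apple", "orange"],
    "conlumn_b": ["123", "11", "11", "33"],
})

# Take only the 'apple' rows of conlumn_b and replace exact matches of "11".
patched = df.loc[df["conlumn_a"] == "apple", "conlumn_b"].replace(
    r"^11$", "XXX", regex=True
)

# update() aligns the patched Series with df on the index and on the
# Series name (used as the column label), then overwrites df in place.
df.update(patched)
print(df)
```

    Rows that are not part of the patched Series (here the banana and orange rows) are left as they were, because update() only writes values that are present and not NaN in the object it is given.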

  • 2020-12-03 20:17

    inplace=True works on the object that it is applied to.

    When you call .loc, you're slicing your dataframe object to return a new one.

    >>> id(df)
    4587248608
    

    And,

    >>> id(df.loc[df['conlumn_a'] == 'apple', 'conlumn_b'])
    4767716968
    

    Now, calling an in-place replace on this new slice will apply the replace operation, updating the new slice itself, and not the original.
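
    A small sketch that makes this visible. The sample frame is an assumption (the question's data is not shown), and the slice is bound to a hypothetical name, subset, so that it can be inspected after the in-place call.

```python
import pandas as pd

# Assumed sample data with an integer conlumn_b.
df = pd.DataFrame({
    "conlumn_a": ["apple", "banana", "apple", "orange"],
    "conlumn_b": [123, 11, 11, 33],
})

# Boolean-mask selection with .loc produces a new Series holding copied data.
subset = df.loc[df["conlumn_a"] == "apple", "conlumn_b"]

# The in-place replace changes only that new Series.
subset.replace(11, "XXX", inplace=True)

print(subset.tolist())           # the slice was modified
print(df["conlumn_b"].tolist())  # the original frame was not
```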


    Also note that you're calling replace with a regex on a column of integers, and nothing is going to happen, because regular expressions only work on strings.
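
    To illustrate with a minimal sketch (the two values are assumed sample data): the regex replace leaves an integer Series untouched, while the same call on its string form does match.

```python
import pandas as pd

ints = pd.Series([123, 11])  # assumed integer values

# The pattern is a string, so there is nothing for it to match in an int column.
unchanged = ints.replace(r"^11$", "XXX", regex=True)

# After converting to strings, the same pattern matches the value "11".
as_text = ints.astype(str).replace(r"^11$", "XXX", regex=True)

print(unchanged.tolist())
print(as_text.tolist())
```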

    Here's what I offer you as a workaround. Don't use regex at all.

    m = df['conlumn_a'] == 'apple'
    df.loc[m, 'conlumn_b'] = df.loc[m, 'conlumn_b'].replace(11, 'XXX')
    
    df
    
      conlumn_a conlumn_b
    0     apple       123
    1    banana        11
    2     apple       XXX
    3    orange        33
    

    Or, if you need regex-based substitution:

    df.loc[m, 'conlumn_b'] = df.loc[m, 'conlumn_b']\
               .astype(str).replace('^11$', 'XXX', regex=True)
    

    Note, though, that this converts your column to an object column.

  • 2020-12-03 20:18

    I think you need to filter on both sides:

    m = df['conlumn_a'] == 'apple'
    df.loc[m,'conlumn_b'] = df.loc[m,'conlumn_b'].astype(str).replace(r'^(11+)','XXX',regex=True)
    print (df)
      conlumn_a conlumn_b
    0     apple       123
    1    banana        11
    2     apple       XXX
    3    orange        33
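
    Note that the pattern used here, ^(11+), is broader than the ^11$ used in the other answers: it matches "11" followed by any further 1s at the start of the value and replaces only that matched prefix. A rough sketch of the difference, using the standard re module on a few assumed sample strings rather than the question's data:

```python
import re

samples = ["11", "111", "115", "211"]  # assumed sample strings

# ^11$ only matches the exact string "11".
exact = [re.sub(r"^11$", "XXX", s) for s in samples]

# ^(11+) matches a leading run of two or more 1s and replaces just that run.
prefix = [re.sub(r"^(11+)", "XXX", s) for s in samples]

print(exact)   # ['XXX', '111', '115', '211']
print(prefix)  # ['XXX', 'XXX', 'XXX5', '211']
```

    If only the exact value "11" should be replaced, the anchored pattern ^11$ is probably the safer choice.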
    