Boolean indexing that can produce a view of a large pandas DataFrame?

暖寄归人 2021-02-04 08:52

I have a large dataframe that I want to take slices of (according to multiple boolean criteria), and then modify the entries in those slices in order to change the original dataframe.

3 answers
  •  醉话见心
    2021-02-04 09:26

    Even though df.loc[idx] may be a copy of a portion of df, assignment to df.loc[idx] modifies df itself. (This is also true of df.iloc. The older df.ix indexer behaved the same way, but it was deprecated and then removed in pandas 1.0.)

    For example,

    import pandas as pd
    import numpy as np
    df = pd.DataFrame({'A':[9,10]*6,
                       'B':range(23,35),
                       'C':range(-6,6)})
    
    print(df)
    #      A   B  C
    # 0    9  23 -6
    # 1   10  24 -5
    # 2    9  25 -4
    # 3   10  26 -3
    # 4    9  27 -2
    # 5   10  28 -1
    # 6    9  29  0
    # 7   10  30  1
    # 8    9  31  2
    # 9   10  32  3
    # 10   9  33  4
    # 11  10  34  5
    

    Here is our boolean index:

    idx = (df['C']!=0) & (df['A']==10) & (df['B']<30)
    

    We can modify those rows of df where idx is True by assigning to df.loc[idx, ...]. For example,

    df.loc[idx, 'A'] += df.loc[idx, 'B'] * df.loc[idx, 'C']
    print(df)
    

    yields

          A   B  C
    0     9  23 -6
    1  -110  24 -5
    2     9  25 -4
    3   -68  26 -3
    4     9  27 -2
    5   -18  28 -1
    6     9  29  0
    7    10  30  1
    8     9  31  2
    9    10  32  3
    10    9  33  4
    11   10  34  5
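
To make the copy-versus-assignment distinction concrete, here is a minimal sketch that reuses the df and idx defined above. The names subset, unchanged and changed are only illustrative, and the explicit .copy() is an assumption added so that the behaviour does not depend on the pandas version or on copy-on-write settings.

```python
import pandas as pd

# Same frame and boolean index as in the example above.
df = pd.DataFrame({'A': [9, 10] * 6,
                   'B': range(23, 35),
                   'C': range(-6, 6)})
idx = (df['C'] != 0) & (df['A'] == 10) & (df['B'] < 30)

# An explicit copy of the selected rows is independent of df,
# so writing to it leaves df untouched.
subset = df.loc[idx].copy()
subset['A'] = 0
unchanged = df['A'].tolist()   # column A of df, still the original values

# Assigning through df.loc[idx, ...] writes into df itself.
df.loc[idx, 'A'] = 0
changed = df['A'].tolist()     # rows 1, 3 and 5 of column A are now 0
```

So if the goal is to change the original dataframe, the assignment should go through df.loc[idx, ...] on df rather than through a separately stored slice.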
    
