Boolean indexing that can produce a view of a large pandas DataFrame?

I have a large dataframe that I want to take slices of (according to multiple boolean criteria), and then modify the entries in those slices in order to change the original dataframe...

3 Answers

醉话见心

2021-02-04 09:26

Even though df.loc[idx] may be a copy of a portion of df, assignment to df.loc[idx] modifies df itself. (This is also true of df.iloc. The same held for df.ix, which was deprecated in pandas 0.20 and removed in pandas 1.0.)

For example,

import pandas as pd
import numpy as np
df = pd.DataFrame({'A':[9,10]*6,
                   'B':range(23,35),
                   'C':range(-6,6)})

print(df)
#      A   B  C
# 0    9  23 -6
# 1   10  24 -5
# 2    9  25 -4
# 3   10  26 -3
# 4    9  27 -2
# 5   10  28 -1
# 6    9  29  0
# 7   10  30  1
# 8    9  31  2
# 9   10  32  3
# 10   9  33  4
# 11  10  34  5

Here is our boolean index:

idx = (df['C']!=0) & (df['A']==10) & (df['B']<30)

We can modify those rows of df where idx is True by assigning to df.loc[idx, ...]. For example,

df.loc[idx, 'A'] += df.loc[idx, 'B'] * df.loc[idx, 'C']
print(df)

yields

      A   B  C
0     9  23 -6
1  -110  24 -5
2     9  25 -4
3   -68  26 -3
4     9  27 -2
5   -18  28 -1
6     9  29  0
7    10  30  1
8     9  31  2
9    10  32  3
10    9  33  4
11   10  34  5
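
To see why the assignment has to go through df.loc rather than through an extracted subset, here is a minimal sketch that rebuilds the same df and idx. The names subset, before and after are illustrative and not part of the original answer, and the explicit .copy() is there only to keep older pandas versions from emitting SettingWithCopyWarning.

```python
import pandas as pd

df = pd.DataFrame({'A': [9, 10] * 6,
                   'B': range(23, 35),
                   'C': range(-6, 6)})
idx = (df['C'] != 0) & (df['A'] == 10) & (df['B'] < 30)

# A boolean mask produces a separate object, so edits to it stay local.
subset = df.loc[idx].copy()
subset['A'] = 0
before = df['A'].tolist()      # df is still untouched at this point

# Assigning through df.loc writes into df itself.
df.loc[idx, 'A'] = 0
after = df['A'].tolist()

print(before)
print(after)
```

Only the second assignment changes df: before still holds the alternating 9 and 10 values, while after has zeros in rows 1, 3 and 5.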

不要未来只要你来

2021-02-04 09:27

Building off of unutbu's example, you could also use the boolean index on df.index (shown here with df.loc, since df.ix was removed in pandas 1.0):

In [11]: df.loc[df.index[idx]] = 999

In [12]: df
Out[12]:
      A    B    C
0     9   23   -6
1   999  999  999
2     9   25   -4
3   999  999  999
4     9   27   -2
5   999  999  999
6     9   29    0
7    10   30    1
8     9   31    2
9    10   32    3
10    9   33    4
11   10   34    5
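
As a rough sketch of what df.index[idx] contributes: it turns the mask into row labels, which can then be paired with a column list so that only part of each matching row is overwritten. The name labels and the choice of columns ['A', 'B'] are illustrative assumptions, not something the answer above prescribes.

```python
import pandas as pd

df = pd.DataFrame({'A': [9, 10] * 6,
                   'B': range(23, 35),
                   'C': range(-6, 6)})
idx = (df['C'] != 0) & (df['A'] == 10) & (df['B'] < 30)

# Convert the boolean mask into the row labels it selects.
labels = df.index[idx]

# Overwrite only columns A and B for those rows; column C is left as it was.
df.loc[labels, ['A', 'B']] = 999

print(list(labels))
print(df.loc[labels])
```

For a mask that is already aligned with df, df.loc[idx, ['A', 'B']] = 999 would do the same thing; going through labels is mainly useful when the labels are needed again elsewhere.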

醉话见心

2021-02-04 09:34

The pandas docs have a section on Returning a view versus a copy:

"The rules about when a view on the data is returned are entirely dependent on NumPy. Whenever an array of labels or a boolean vector are involved in the indexing operation, the result will be a copy. With single label / scalar indexing and slicing, e.g. df.ix[3:6] or df.ix[:, 'A'], a view will be returned."

(That passage predates the removal of df.ix in pandas 1.0; df.loc[3:6] and df.loc[:, 'A'] are the current spellings.)
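
One way to probe this on your own pandas version is to compare the underlying buffers with np.shares_memory. This is only a sketch under assumptions: the variable names are illustrative, and the slice result is printed rather than relied on, because it may differ between pandas versions and settings.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'A': [9, 10] * 6,
                   'B': range(23, 35),
                   'C': range(-6, 6)})
idx = (df['C'] != 0) & (df['A'] == 10) & (df['B'] < 30)

# NumPy array for column A, used as the reference buffer.
base = df['A'].to_numpy()

# Boolean-mask selection: the result owns fresh memory.
mask_shares = np.shares_memory(base, df.loc[idx, 'A'].to_numpy())

# Label-slice selection: may be a view, depending on the pandas version.
slice_shares = np.shares_memory(base, df.loc[3:6, 'A'].to_numpy())

print('mask shares memory with df: ', mask_shares)
print('slice shares memory with df:', slice_shares)
```

Whatever the check reports, assignment through df.loc[...] = value remains the dependable way to write into df.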
