Question
I want to extract the interval bounds from a column of pandas intervals and write them to the columns 'left' and 'right'. Iterating with iterrows does not work (the documentation says it should not be used for writing data), and in any case it would not be the best solution.
import pandas as pd
i1 = pd.Interval(left=85, right=94)
i2 = pd.Interval(left=95, right=104)
i3 = pd.Interval(left=105, right=114)
i4 = pd.Interval(left=115, right=124)
i5 = pd.Interval(left=125, right=134)
i6 = pd.Interval(left=135, right=144)
i7 = pd.Interval(left=145, right=154)
i8 = pd.Interval(left=155, right=164)
i9 = pd.Interval(left=165, right=174)
data = pd.DataFrame(
{
"intervals":[i1,i2,i3,i4,i5,i6,i7,i8,i9],
"left" :[0,0,0,0,0,0,0,0,0],
"right" :[0,0,0,0,0,0,0,0,0]
},
index=[0,1,2,3,4,5,6,7,8]
)
# this is not working (has no effect):
for index, row in data.iterrows():
    print(row.intervals.left, row.intervals.right)
    row.left = row.intervals.left
    row.right = row.intervals.right
How can we do something like:
data['left']=data['intervals'].left
data['right']=data['intervals'].right
Thanks!
Answer 1:
Create a pandas.IntervalIndex from your intervals. You can then access the .left and .right attributes.
import pandas as pd
idx = pd.IntervalIndex([i1, i2, i3, i4, i5, i6, i7, i8, i9])
df = pd.DataFrame({'intervals': idx, 'left': idx.left, 'right': idx.right})
df
intervals left right
0 (85, 94] 85 94
1 (95, 104] 95 104
2 (105, 114] 105 114
3 (115, 124] 115 124
4 (125, 134] 125 134
5 (135, 144] 135 144
6 (145, 154] 145 154
7 (155, 164] 155 164
8 (165, 174] 165 174
Another option is using map and operator.attrgetter (look ma, no lambda...):
from operator import attrgetter
df['left'] = df['intervals'].map(attrgetter('left'))
df['right'] = df['intervals'].map(attrgetter('right'))
df
intervals left right
0 (85, 94] 85 94
1 (95, 104] 95 104
2 (105, 114] 105 114
3 (115, 124] 115 124
4 (125, 134] 125 134
5 (135, 144] 135 144
6 (145, 154] 145 154
7 (155, 164] 155 164
8 (165, 174] 165 174
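If the intervals already sit in a DataFrame column, as in the question, the same idea can probably be applied by wrapping that column in an IntervalIndex. The following is a minimal sketch, assuming the column holds only pd.Interval objects that are all closed on the same side; the three-row sample frame is made up for illustration.

```python
import pandas as pd

# Small sample frame shaped like the one in the question (three rows only).
data = pd.DataFrame({
    "intervals": [pd.Interval(85, 94), pd.Interval(95, 104), pd.Interval(105, 114)]
})

# Wrap the existing column in an IntervalIndex to reach .left / .right.
idx = pd.IntervalIndex(data["intervals"])

# to_numpy() keeps the assignment purely positional,
# so it does not depend on the labels in data.index.
data["left"] = idx.left.to_numpy()
data["right"] = idx.right.to_numpy()
```

Converting to a NumPy array before assigning is a cautious choice here: it avoids any question of label alignment when the frame has a non-default index.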
Answer 2:
A pandas.arrays.IntervalArray is the preferred way to store interval data in Series-like structures.
For @coldspeed's first example, IntervalArray is basically a drop-in replacement:
In [2]: pd.__version__
Out[2]: '1.1.3'
In [3]: ia = pd.arrays.IntervalArray([i1, i2, i3, i4, i5, i6, i7, i8, i9])
In [4]: df = pd.DataFrame({'intervals': ia, 'left': ia.left, 'right': ia.right})
In [5]: df
Out[5]:
intervals left right
0 (85, 94] 85 94
1 (95, 104] 95 104
2 (105, 114] 105 114
3 (115, 124] 115 124
4 (125, 134] 125 134
5 (135, 144] 135 144
6 (145, 154] 145 154
7 (155, 164] 155 164
8 (165, 174] 165 174
If you already have interval data in a Series or DataFrame, @coldspeed's second example becomes a bit simpler by accessing the array attribute:
In [6]: df = pd.DataFrame({'intervals': ia})
In [7]: df['left'] = df['intervals'].array.left
In [8]: df['right'] = df['intervals'].array.right
In [9]: df
Out[9]:
intervals left right
0 (85, 94] 85 94
1 (95, 104] 95 104
2 (105, 114] 105 114
3 (115, 124] 115 124
4 (125, 134] 125 134
5 (135, 144] 135 144
6 (145, 154] 145 154
7 (155, 164] 155 164
8 (165, 174] 165 174
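The .array.left shortcut relies on the column actually having interval dtype. If the intervals ended up in an object-dtype column instead, .array would not expose .left or .right. A hedged sketch of a guard for that case follows; interval_bounds is a hypothetical helper name, not part of pandas.

```python
import pandas as pd

def interval_bounds(s):
    """Return (left, right) bounds of a Series of intervals (hypothetical helper)."""
    if isinstance(s.dtype, pd.IntervalDtype):
        # Already interval dtype: the backing array is an IntervalArray.
        arr = s.array
    else:
        # Object dtype: build an IntervalArray from the Interval objects.
        arr = pd.arrays.IntervalArray(s.to_numpy())
    return arr.left, arr.right

# Object-dtype column of Interval objects, forced here for illustration.
s_obj = pd.Series([pd.Interval(85, 94), pd.Interval(95, 104)], dtype=object)
left, right = interval_bounds(s_obj)
```

The fallback assumes every element is a pd.Interval with the same closed side; mixed closed sides would raise an error when building the IntervalArray.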
Answer 3:
A simple way is to use the apply() method:
data['left'] = data['intervals'].apply(lambda x: x.left)
data['right'] = data['intervals'].apply(lambda x: x.right)
data
intervals left right
0 (85, 94] 85 94
1 (95, 104] 95 104
...
8 (165, 174] 165 174
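As a quick sanity check, the element-wise apply() approach can be compared against the IntervalIndex approach from the first answer. This is only a sketch on a small made-up frame; it checks that the values agree and says nothing about speed.

```python
import pandas as pd

data = pd.DataFrame({
    "intervals": [pd.Interval(85, 94), pd.Interval(95, 104), pd.Interval(105, 114)]
})

# Element-wise extraction with apply(), as in this answer.
left_apply = data["intervals"].apply(lambda x: x.left)

# Extraction through an IntervalIndex, for comparison.
left_index = pd.IntervalIndex(data["intervals"]).left

# Both approaches should yield the same values in the same order.
same = left_apply.tolist() == left_index.tolist()
```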
Source: https://stackoverflow.com/questions/53996015/extract-left-and-right-limit-from-a-series-of-pandas-intervals