Extract left and right limits from a Series of pandas Intervals

终归单人心 2021-01-02 13:42

I want to get the left and right bounds of a column of pandas Intervals and write them to the columns 'left' and 'right'. iterrows does not work (documentation says it would not be us

3 Answers
  • 2021-01-02 14:17

    A simple way is to use the apply() method:

        data['left'] = data['intervals'].apply(lambda x: x.left)
        data['right'] = data['intervals'].apply(lambda x: x.right)
        data
    
             intervals  left  right
        0     (85, 94]    85     94
        1    (95, 104]    95    104
        ...
        8   (165, 174]   165    174
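
    The snippet above assumes that data already exists. A minimal self-contained sketch follows; the bounds are an assumption read off the printed output (closed on the right), and building the column with IntervalIndex.from_arrays is only one possible way to get such a frame:

```python
import pandas as pd

# Assumed bounds, inferred from the printed output: (85, 94], (95, 104], ..., (165, 174]
left = list(range(85, 166, 10))
right = list(range(94, 175, 10))

data = pd.DataFrame(
    {"intervals": pd.IntervalIndex.from_arrays(left, right, closed="right")}
)

# Element-wise access: each cell is a pd.Interval with .left and .right attributes
data["left"] = data["intervals"].apply(lambda x: x.left)
data["right"] = data["intervals"].apply(lambda x: x.right)

print(data)
```

    apply() visits every element in Python, so it works for any column holding Interval objects, but it is likely slower on large frames than the vectorized accessors shown in the other answers.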
    
  • 2021-01-02 14:19

    Create a pandas.IntervalIndex from your intervals. You can then access the .left and .right attributes.

    import pandas as pd
    
    idx = pd.IntervalIndex([i1, i2, i3, i4, i5, i6, i7, i8, i9])  
    pd.DataFrame({'intervals': idx, 'left': idx.left, 'right': idx.right})
    
        intervals  left  right
    0    (85, 94]    85     94
    1   (95, 104]    95    104
    2  (105, 114]   105    114
    3  (115, 124]   115    124
    4  (125, 134]   125    134
    5  (135, 144]   135    144
    6  (145, 154]   145    154
    7  (155, 164]   155    164
    8  (165, 174]   165    174
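
    The names i1 to i9 are not defined in the snippet above. As a hedged sketch, the version below uses hypothetical stand-ins for them (values inferred from the printed output) and wraps an existing object-dtype column in an IntervalIndex, which may be closer to the situation in the question:

```python
import pandas as pd

# Hypothetical stand-ins for i1..i9, inferred from the printed output
intervals = [pd.Interval(lo, lo + 9, closed="right") for lo in range(85, 166, 10)]

# A column that holds plain Interval objects (object dtype)
df = pd.DataFrame({"intervals": pd.Series(intervals, dtype=object)})

# Wrap the existing column; .left and .right are then available as whole arrays
idx = pd.IntervalIndex(df["intervals"])
df["left"] = idx.left
df["right"] = idx.right

print(df)
```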
    

    Another option is using map and operator.attrgetter (look ma, no lambda...):

    from operator import attrgetter
    
    df['left'] = df['intervals'].map(attrgetter('left'))
    df['right'] = df['intervals'].map(attrgetter('right'))
    
    df
        intervals left right
    0    (85, 94]   85    94
    1   (95, 104]   95   104
    2  (105, 114]  105   114
    3  (115, 124]  115   124
    4  (125, 134]  125   134
    5  (135, 144]  135   144
    6  (145, 154]  145   154
    7  (155, 164]  155   164
    8  (165, 174]  165   174
    
  • 2021-01-02 14:32

    A pandas.arrays.IntervalArray is the preferred way of storing interval data in Series-like structures.

    For @coldspeed's first example, IntervalArray is basically a drop-in replacement:

    In [2]: pd.__version__
    Out[2]: '1.1.3'
    
    In [3]: ia = pd.arrays.IntervalArray([i1, i2, i3, i4, i5, i6, i7, i8, i9])
    
    In [4]: df = pd.DataFrame({'intervals': ia, 'left': ia.left, 'right': ia.right})
    
    In [5]: df
    Out[5]:
        intervals  left  right
    0    (85, 94]    85     94
    1   (95, 104]    95    104
    2  (105, 114]   105    114
    3  (115, 124]   115    124
    4  (125, 134]   125    134
    5  (135, 144]   135    144
    6  (145, 154]   145    154
    7  (155, 164]   155    164
    8  (165, 174]   165    174
    

    If you already have interval data in a Series or DataFrame, @coldspeed's second example becomes a bit simpler by accessing the array attribute:

    In [6]: df = pd.DataFrame({'intervals': ia})
    
    In [7]: df['left'] = df['intervals'].array.left
    
    In [8]: df['right'] = df['intervals'].array.right
    
    In [9]: df
    Out[9]:
        intervals  left  right
    0    (85, 94]    85     94
    1   (95, 104]    95    104
    2  (105, 114]   105    114
    3  (115, 124]   115    124
    4  (125, 134]   125    134
    5  (135, 144]   135    144
    6  (145, 154]   145    154
    7  (155, 164]   155    164
    8  (165, 174]   165    174
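
    Note that .array.left is only available when the column actually has interval dtype. The sketch below checks this and compares the vectorized result with the element-wise one; the bounds are again assumed from the printed output, and the variable names are illustrative only:

```python
import pandas as pd

# Assumed bounds, matching the printed output above
ia = pd.arrays.IntervalArray.from_arrays(
    list(range(85, 166, 10)), list(range(94, 175, 10)), closed="right"
)
df = pd.DataFrame({"intervals": ia})

# The .array accessor exposes .left/.right only for interval-dtype columns
has_interval_dtype = isinstance(df["intervals"].dtype, pd.IntervalDtype)

# Vectorized access versus element-wise access
vectorized = pd.Series(df["intervals"].array.left, index=df.index)
elementwise = df["intervals"].apply(lambda iv: iv.left)

print(has_interval_dtype)
print(vectorized.tolist() == elementwise.tolist())
```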
    