Add a field to a pandas DataFrame with MultiIndex columns

长发绾君心 2021-02-06 00:00

I have looked for an answer to this question, as it seems pretty simple, but have not been able to find anything yet. Apologies if I missed something. I have pandas version 0.1

4 answers
  • 2021-02-06 00:11

    For this particular problem, using a Panel object seems to work. I did the following (taking dftst from my original post):

    pn = dftst.T.to_panel()
    print(pn)
    
    Out[83]: 
    <class 'pandas.core.panel.Panel'>
    Dimensions: 12 (items) x 3 (major_axis) x 2 (minor_axis)
    Items axis: 2009-03-01 06:29:59 to 2009-03-12 06:29:59
    Major_axis axis: AAPL to GS
    Minor_axis axis: close to rate
    

    I can move the fields ('close', 'rate') to the Items axis by doing the following:

    pn = pn.transpose(2,0,1)
    print(pn)
    
    Out[91]: 
    <class 'pandas.core.panel.Panel'>
    Dimensions: 2 (items) x 12 (major_axis) x 3 (minor_axis)
    Items axis: close to rate
    Major_axis axis: 2009-03-01 06:29:59 to 2009-03-12 06:29:59
    Minor_axis axis: AAPL to GS
    

    Now I can do a time series operation and add it as a field in the Panel object:

    pn['avg_close'] = pandas.rolling_mean(pn['close'], 5)
    print(pn)
    
    Out[93]: 
    <class 'pandas.core.panel.Panel'>
    Dimensions: 3 (items) x 12 (major_axis) x 3 (minor_axis)
    Items axis: close to avg_close
    Major_axis axis: 2009-03-01 06:29:59 to 2009-03-12 06:29:59
    Minor_axis axis: AAPL to GS
    
    print(pn['avg_close'])
    
    Out[94]: 
    ticker                   AAPL      GOOG        GS
    2009-03-01 06:29:59       NaN       NaN       NaN
    2009-03-02 06:29:59       NaN       NaN       NaN
    2009-03-03 06:29:59       NaN       NaN       NaN
    2009-03-04 06:29:59       NaN       NaN       NaN
    2009-03-05 06:29:59  0.303719 -0.129300 -0.037954
    2009-03-06 06:29:59 -0.006839  0.206331  0.336467
    2009-03-07 06:29:59  0.128299  0.174935  0.698275
    2009-03-08 06:29:59  0.471010 -0.137343  0.671049
    2009-03-09 06:29:59 -0.279855 -0.033427  0.848610
    2009-03-10 06:29:59 -0.516032  0.260944  0.373046
    2009-03-11 06:29:59 -0.456213  0.164710  0.910448
    2009-03-12 06:29:59 -0.799156  0.544132  0.862764
    

    I am actually having some other problems with the Panel objects, but I will leave those to another post.
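
    Panel was removed in pandas 0.25, so the steps above do not run on current releases. Below is a rough sketch of the same idea using only a DataFrame: swap the column levels to get a field-major view, compute the rolling mean, and concatenate it back as a new field. The sample data and the names by_field and result are made up for illustration; dftst is assumed to have (ticker, field) columns, as in the printed output above.

```python
import numpy as np
import pandas as pd

# Made-up sample shaped like dftst in the question: (ticker, field) columns.
dates = pd.date_range("2009-03-01 06:29:59", periods=12, freq="D")
columns = pd.MultiIndex.from_product(
    [["AAPL", "GOOG", "GS"], ["close", "rate"]], names=["ticker", "field"]
)
rng = np.random.default_rng(0)
dftst = pd.DataFrame(rng.standard_normal((12, 6)), index=dates, columns=columns)

# Field-major view, playing the role of pn.transpose(2, 0, 1):
# the outer column level is now the field, the inner one the ticker.
by_field = dftst.swaplevel(axis=1).sort_index(axis=1)

# by_field["close"] is a dates x tickers frame, much like pn["close"].
avg_close = by_field["close"].rolling(5).mean()

# Label the result as a new field and append it to the field-major frame.
avg_close.columns = pd.MultiIndex.from_product(
    [["avg_close"], avg_close.columns], names=["field", "ticker"]
)
by_field = pd.concat([by_field, avg_close], axis=1)

# Back to the original (ticker, field) layout.
result = by_field.swaplevel(axis=1).sort_index(axis=1)
print(result["AAPL"].tail())
```

    The concat step does the "add an item" part that the Panel assignment did; everything else is just moving the field level in and out of the outer position.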

  • 2021-02-06 00:13

    This question is a decade old, but I had the exact same problem. Here is a short way to do what you are looking for. pandas 0.18 has since been introduced, so the rolling mean is written a bit differently now, but you get the point.

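    # Rolling mean of every ticker's 'close' column (result columns are the tickers)
    avg_close = dftst.xs('close', axis=1, level=1).rolling(5).mean()
    # list(...) is needed on Python 3, where zip returns an iterator
    dftst[list(zip(avg_close.columns, ['avg_close'] * len(avg_close.columns)))] = avg_close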
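
    A self-contained version of the snippet above, as a sketch only: the sample frame is made up, and it assumes the column levels of dftst are named ticker and field (the names that appear in the printed output elsewhere on this page). The name new_keys is just for illustration.

```python
import numpy as np
import pandas as pd

# Made-up sample with (ticker, field) columns, standing in for dftst.
dates = pd.date_range("2009-03-01 06:29:59", periods=12, freq="D")
columns = pd.MultiIndex.from_product(
    [["AAPL", "GOOG", "GS"], ["close", "rate"]], names=["ticker", "field"]
)
rng = np.random.default_rng(0)
dftst = pd.DataFrame(rng.standard_normal((12, 6)), index=dates, columns=columns)

# Cross-section on the field level: one 'close' column per ticker.
avg_close = dftst.xs("close", axis=1, level="field").rolling(5).mean()

# One (ticker, 'avg_close') key per column of avg_close, in the same order.
new_keys = [(ticker, "avg_close") for ticker in avg_close.columns]
dftst[new_keys] = avg_close

# New columns are appended at the end; sorting groups them under each ticker.
dftst = dftst.sort_index(axis=1)
print(dftst["GOOG"].tail())
```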
    
  • 2021-02-06 00:16

    You could also consider a bit of reshaping-fu if you don't want to use a Panel (as a workaround, since there isn't really an API that does exactly what you want). I wouldn't recommend it on enormous data sets, though: use a Panel for that.

    In [30]: df = dftst.stack(0)
    
    In [31]: df['close_avg'] = df.close.unstack().rolling(5).mean().stack()
    
    In [32]: df
    Out[32]: 
    field                          close      rate  close_avg
                        ticker                               
    2009-03-01 06:29:59 AAPL   -0.223042  0.554996        NaN
                        GOOG    0.060127 -0.333992        NaN
                        GS      0.117626 -1.256790        NaN
    2009-03-02 06:29:59 AAPL   -0.513743 -0.402661        NaN
                        GOOG    0.059828 -0.125288        NaN
                        GS     -0.336196 -0.510595        NaN
    2009-03-03 06:29:59 AAPL    0.142202 -1.038470        NaN
                        GOOG   -1.099251 -0.892581        NaN
                        GS      1.698086  0.885023        NaN
    2009-03-04 06:29:59 AAPL   -1.125821  0.413005        NaN
                        GOOG    0.424290  1.106983        NaN
                        GS      0.047158  0.680714        NaN
    2009-03-05 06:29:59 AAPL    0.470050  1.845354  -0.250071
                        GOOG    0.132956 -0.488800  -0.084410
                        GS      0.129190  0.208077   0.331173
    2009-03-06 06:29:59 AAPL   -0.087360 -2.102512  -0.222934
                        GOOG    0.165100 -0.134886  -0.063415
                        GS      0.167720  0.082480   0.341192
    2009-03-07 06:29:59 AAPL   -0.768542 -0.176076  -0.273894
                        GOOG    0.417694  2.257074   0.008158
                        GS     -1.744730 -1.850185   0.059485
    2009-03-08 06:29:59 AAPL   -0.297363 -0.633828  -0.361807
                        GOOG   -1.096703 -0.572138   0.008667
                        GS      0.890016 -2.621563  -0.102129
    2009-03-09 06:29:59 AAPL    1.038579  0.053330   0.071073
                        GOOG   -0.614050  0.607944  -0.199001
                        GS     -0.882848  0.596801  -0.288130
    2009-03-10 06:29:59 AAPL   -0.255226  0.058178  -0.073982
                        GOOG    1.761861  1.841751   0.126780
                        GS     -0.549998 -1.551281  -0.423968
    2009-03-11 06:29:59 AAPL    0.413522  0.149089   0.026194
                        GOOG   -2.964163  1.825312  -0.499072
                        GS     -0.373303  1.137001  -0.532173
    2009-03-12 06:29:59 AAPL   -0.924776  1.238546  -0.005053
                        GOOG   -0.985956 -0.906590  -0.779802
                        GS     -0.320400  1.239681  -0.247307
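
    For reference, here is the same reshaping round trip as a runnable sketch, including the step back to wide (ticker, field) columns that the session above stops short of. The sample data is made up, and the names stacked and wide are only for illustration.

```python
import numpy as np
import pandas as pd

# Made-up sample with (ticker, field) columns, standing in for dftst.
dates = pd.date_range("2009-03-01 06:29:59", periods=12, freq="D")
columns = pd.MultiIndex.from_product(
    [["AAPL", "GOOG", "GS"], ["close", "rate"]], names=["ticker", "field"]
)
rng = np.random.default_rng(0)
dftst = pd.DataFrame(rng.standard_normal((12, 6)), index=dates, columns=columns)

# Long format: rows are (date, ticker), columns are the fields.
stacked = dftst.stack(level="ticker")

# Unstack 'close' to dates x tickers, roll along the dates, stack back.
# The assignment aligns on the (date, ticker) index, so row order is irrelevant.
stacked["close_avg"] = stacked["close"].unstack().rolling(5).mean().stack()

# Optional: return to wide (ticker, field) columns.
wide = stacked.unstack("ticker").swaplevel(axis=1).sort_index(axis=1)
print(wide["GS"].tail())
```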
    
  • 2021-02-06 00:20

    I don't know how to do the broadcasting you want, but for strict assignment this should do it:

    dftst[('GOOG', 'avg_close')] = 7
    

    More specifically but still without broadcasting:

    # cols_1: the ticker labels (first column level)
    for tic in cols_1:
        dftst[(tic, 'avg_close')] = dftst[(tic, 'close')].rolling(5).mean()
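
    cols_1 is not defined in this answer. The sketch below assumes it is the set of ticker labels from the first column level and derives it from the frame; that derivation and the sample data are my assumptions, not part of the original post.

```python
import numpy as np
import pandas as pd

# Made-up sample with (ticker, field) columns, standing in for dftst.
dates = pd.date_range("2009-03-01 06:29:59", periods=12, freq="D")
columns = pd.MultiIndex.from_product(
    [["AAPL", "GOOG", "GS"], ["close", "rate"]], names=["ticker", "field"]
)
rng = np.random.default_rng(0)
dftst = pd.DataFrame(rng.standard_normal((12, 6)), index=dates, columns=columns)

# Assumed definition of cols_1: the unique ticker labels, taken before
# the loop so that adding columns does not affect the iteration.
cols_1 = dftst.columns.get_level_values("ticker").unique()

for tic in cols_1:
    # A tuple key addresses exactly one (ticker, field) column.
    dftst[(tic, "avg_close")] = dftst[(tic, "close")].rolling(5).mean()

# Group the appended columns under their tickers.
dftst = dftst.sort_index(axis=1)
print(dftst["AAPL"].tail())
```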
    