How to apply functions on NumPy arrays using the pandas groupby function

说谎 2020-12-19 19:37

I'm very new to pandas, so I hope this has an easy answer (I would also appreciate any pointers on the setup of the dataframe).

So let's say I have the foll

2 answers
  • 2020-12-19 19:56

    For me it works:

    D.groupby('gp').apply(lambda x: x.vector.mean().mean())
    

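    I'm taking the mean twice: the first call averages the vectors in each group element-wise, and the second averages that mean vector down to a single number, since you want the group mean of the vector means (don't you?).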

    Out[98]: 
    gp
    0     9.0
    1     8.5
    2     9.5
    dtype: float64
    

    If you want the mean vector, just take the mean once.
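A minimal sketch of the "take the mean once" case, assuming a frame with one NumPy array per row in `vector` and a grouping key in `gp`, as in the example data used in this thread. The names `mean_vectors` and `group_means` are introduced here for illustration only. The arrays of each group are stacked explicitly with `np.stack` rather than relying on how pandas averages an object column.

```python
import numpy as np
import pandas as pd

# Assumed setup: one NumPy array per row, with gp as the grouping key.
D = pd.DataFrame({
    "name": [str(i) for i in range(10)],
    "vector": [np.arange(i, i + 10) for i in range(10)],
    "sq": [i ** 2 for i in range(10)],
    "gp": [i % 3 for i in range(10)],
})

# Stack each group's arrays into a 2-D block and average over the rows (axis=0).
mean_vectors = {
    gp: np.stack(s.tolist()).mean(axis=0)
    for gp, s in D.groupby("gp")["vector"]
}

# Averaging each mean vector once more gives one scalar per group.
group_means = {gp: float(v.mean()) for gp, v in mean_vectors.items()}

for gp, v in mean_vectors.items():
    print(gp, v, group_means[gp])
```

Stacking makes the shape of the computation visible: each group becomes a (rows, 10) array, `axis=0` keeps the vector, and a second mean collapses it to a number.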

  • 2020-12-19 20:11

    Storing arrays in cells is not a good idea; you can expand the vector column into multiple columns instead:

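    import numpy as np
    import pandas as pd

    # Build one row per i; .T turns the dict keys into the row index (every column ends up object dtype).
    D = pd.DataFrame({ i:{ "name":str(i),
                           "vector": np.arange(i,i+10),
                           "sq":i**2,
                           "gp":i%3 } for i in range(10) }).T
    # Expand each array into its own columns and label the two blocks "attrs" and "vector".
    df = pd.concat([D[["gp", "name", "sq"]], pd.DataFrame(D.vector.tolist(), index=D.index)], axis=1, keys=["attrs", "vector"])
    # numeric_only=True skips the object-dtype attrs columns, which cannot be averaged.
    print(df.groupby([("attrs", "gp")]).mean(numeric_only=True))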
    

    Here is the output:

                      vector                                                  
                      0    1    2    3    4     5     6     7     8     9
    (attrs, gp)                                                          
    0               4.5  5.5  6.5  7.5  8.5   9.5  10.5  11.5  12.5  13.5
    1               4.0  5.0  6.0  7.0  8.0   9.0  10.0  11.0  12.0  13.0
    2               5.0  6.0  7.0  8.0  9.0  10.0  11.0  12.0  13.0  14.0
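Since the question also asks about the setup of the dataframe, here is one possible way to build the wide layout directly, sketched under a few assumptions: the column names `v0` to `v9` and the variable names `attrs`, `vectors`, `wide` and `mean_by_group` are hypothetical, chosen only for this example. Building the frame column by column, rather than from a dict of dicts followed by `.T`, keeps `sq` and `gp` as integer columns instead of object dtype.

```python
import numpy as np
import pandas as pd

n, width = 10, 10

# Scalar attributes, built column by column so each keeps its natural dtype.
attrs = pd.DataFrame({
    "name": [str(i) for i in range(n)],
    "sq": [i ** 2 for i in range(n)],
    "gp": [i % 3 for i in range(n)],
})

# One column per vector element (hypothetical names v0..v9).
vectors = pd.DataFrame(
    np.stack([np.arange(i, i + width) for i in range(n)]),
    columns=[f"v{k}" for k in range(width)],
)

wide = pd.concat([attrs, vectors], axis=1)

# Select the vector columns explicitly so the string column "name" is not averaged.
mean_by_group = wide.groupby("gp")[list(vectors.columns)].mean()
print(mean_by_group)
```

Selecting the columns to aggregate before calling `mean()` avoids depending on how a given pandas version treats non-numeric columns.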
    