Efficient function to find the harmonic mean across different pandas dataframes

Question

I have several dataframes with identical shape/types, but slightly different numeric values. I can easily produce a new dataframe with the mean of all input dataframes via:

df = pd.concat(input_dataframes)  # input_dataframes is a list of dataframes
df = df.groupby(df.index).mean()

I want to do the same with harmonic mean (probably the scipy.stats.hmean function). I have attempted to do this using:

df.groupby(df.index).apply(scipy.stats.hmean)

But this alters the structure of the dataframe. Is there a better way to do this, or do I need to use a more lengthy/manual implementation?

To illustrate:

df_input1:
   'a' 'b' 'c'
'x' 1   1   2 
'y' 2   2   4 
'z' 3   3   6

df_input2:
   'a' 'b' 'c'
'x' 2   2   4 
'y' 3   3   6 
'z' 4   4   8

desired output (shown here with the arithmetic mean; I want the same layout with hmean):
   'a'  'b'  'c'
'x' 1.5  1.5  3 
'y' 2.5  2.5  5 
'z' 3.5  3.5  7

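Answer 1:

Create a pandas Panel, and apply the harmonic mean function over the 'items' axis. Note that pd.Panel was deprecated in pandas 0.20 and removed in 0.25, so this approach only runs on older pandas versions.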

Example with your dataframes df1 and df2:

import pandas as pd
from scipy import stats

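# Each dataframe becomes one item of the Panel
d = {'1': df1, '2': df2}
pan = pd.Panel(d)
# Reduce across items, i.e. element-wise across the input dataframes
pan.apply(axis='items', func=stats.hmean)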

yields:

        'a'         'b'         'c'
'x'     1.333333    1.333333    2.666667
'y'     2.400000    2.400000    4.800000
'z'     3.428571    3.428571    6.857143
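
On pandas versions that no longer ship Panel, the same result can be sketched with concat and groupby alone, using the fact that the harmonic mean is the reciprocal of the arithmetic mean of the reciprocals. This is an illustrative sketch, not part of the original answer: the frames df1 and df2 below are hypothetical reconstructions of the question's example inputs, and it assumes every value is strictly positive.

```python
import pandas as pd

# Hypothetical inputs reconstructed from the question's example tables.
index = ["x", "y", "z"]
df1 = pd.DataFrame({"a": [1, 2, 3], "b": [1, 2, 3], "c": [2, 4, 6]}, index=index)
df2 = pd.DataFrame({"a": [2, 3, 4], "b": [2, 3, 4], "c": [4, 6, 8]}, index=index)

# Stack the frames vertically; rows with the same label now repeat.
stacked = pd.concat([df1, df2])

# Harmonic mean = 1 / mean(1 / x), computed per row label and per column.
# Assumes strictly positive values (a zero would produce an infinite reciprocal).
hmean_df = 1 / (1 / stacked).groupby(level=0).mean()

print(hmean_df)
```

The result keeps the original row labels and columns, and for these inputs it reproduces the values shown above (for example 1.333333 for row 'x', column 'a').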

Source: https://stackoverflow.com/questions/39281575/efficient-function-to-find-harmonic-mean-across-different-pandas-dataframes

Tags

python

pandas

scipy
