Implementing pandas function to numpy functions

Submitted by ε祈祈猫儿з on 2021-01-27 19:20:44

Question


Is there a way to compute xy_mean with the pandas library, the same way y_mean is computed below? I found that the pandas expression Y_mean = pd.Series(PC_list).rolling(number).mean().dropna().to_numpy() is much faster than the numpy version ym = (np.convolve(PC_list, np.ones(shape=(number)), mode='valid')/number)[:-1].

For each window of length number, xy_mean is the sum of every value multiplied by its 1-based position within the window, divided by number. With number = 3, the first result for the example below is (457.334015*1 + 424.440002*2 + 394.795990*3)/number, the next is (424.440002*1 + 394.795990*2 + 408.903992*3)/number, and so on. If number = 4, then the first result would be (457.334015*1 + 424.440002*2 + 394.795990*3 + 408.903992*4)/number. The calculation continues window by window until the end of the PC_list array.

Variables:

import numpy as np
import pandas as pd

number = 3
PC_list = np.array([457.334015, 424.440002, 394.795990, 408.903992, 398.821014, 402.152008, 435.790985, 423.204987, 411.574005,
                    404.424988, 399.519989, 377.181000, 375.467010, 386.944000, 383.614990, 375.071991, 359.511993, 328.865997,
                    320.510010, 330.079010, 336.187012, 352.940002, 365.026001, 361.562012, 362.299011, 378.549011, 390.414001,
                    400.869995, 394.773010, 382.556000])

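Vanilla Python version:

starts = range(len(PC_list) - number)  # window start indices; the last full window is skipped, like the [:-1] below
y_mean = [sum(PC_list[i:i+number]) / number for i in starts]
xy_mean = [sum(x * (j + 1) for j, x in enumerate(PC_list[i:i+number])) / number for i in starts]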

Numpy versions:

y_mean = (np.convolve(PC_list, np.ones(shape=(number)), mode='valid')/number)[:-1]
xy_mean = (np.convolve(PC_list, np.arange(number, 0, -1), mode='valid'))[:-1]

Pandas version:

Y_mean = pd.Series(PC_list).rolling(number).mean().dropna().to_numpy()
xy_mean = ? 
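
A note on the NumPy xy_mean line above: np.convolve reverses its second argument before sliding it over the data, which is why the weights are passed in descending order as np.arange(number, 0, -1). The sketch below checks this against an explicit windowed sum; the names values, via_convolve, weights and explicit are illustrative and only the first five values of PC_list are used.

```python
import numpy as np

number = 3
values = np.array([457.334015, 424.440002, 394.795990, 408.903992, 398.821014])

# np.convolve flips its second argument, so the descending weights [3, 2, 1]
# end up applied as [1, 2, 3] across each window.
via_convolve = np.convolve(values, np.arange(number, 0, -1), mode="valid")

# Explicit windowed sum with ascending weights, for comparison.
weights = np.arange(1, number + 1)
explicit = np.array([
    np.dot(values[i:i + number], weights)
    for i in range(len(values) - number + 1)
])

print(np.allclose(via_convolve, explicit))
```

Both arrays hold the weighted sums, not the means; dividing by number gives the mean.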

Answer 1:


You would need to define a custom function for that, and pass it to rolling.apply:

>>> multiplier = np.arange(1, number + 1)  # 1-based position within each window

>>> def xymean(series):
...     return series.mul(multiplier).sum()

>>> pd.Series(PC_list).rolling(number).apply(xymean).dropna().to_numpy()[:-1]

array([2490.601989, 2440.743958, 2409.067016, 2413.002044, 2510.497985,
       2543.348939, 2516.922974, 2459.627961, 2418.983948, 2335.007966,
       2280.283019, 2288.94702 , 2300.19998 , 2279.389953, 2212.294951,
       2080.693968, 1978.774017, 1960.123047, 1989.229066, 2061.27304 ,
       2137.145019, 2167.67804 , 2175.047058, 2221.807067, 2290.639036,
       2361.986998, 2376.473021])

>>> (np.convolve(PC_list, np.arange(number, 0, -1), mode='valid'))[:-1]
 
array([2490.601989, 2440.743958, 2409.067016, 2413.002044, 2510.497985,
       2543.348939, 2516.922974, 2459.627961, 2418.983948, 2335.007966,
       2280.283019, 2288.94702 , 2300.19998 , 2279.389953, 2212.294951,
       2080.693968, 1978.774017, 1960.123047, 1989.229066, 2061.27304 ,
       2137.145019, 2167.67804 , 2175.047058, 2221.807067, 2290.639036,
       2361.986998, 2376.473021])

However, the rolling.apply version will be slower than the built-in rolling.mean, because apply calls the Python function once per window. Furthermore, your numpy version computes xy_sum rather than xy_mean. To get the mean, divide by number:

>>> (np.convolve(PC_list, np.arange(number, 0, -1), mode='valid')/number)[:-1]

array([830.200663  , 813.58131933, 803.02233867, 804.33401467,
       836.83266167, 847.78297967, 838.97432467, 819.875987  ,
       806.32798267, 778.33598867, 760.09433967, 762.98234   ,
       766.73332667, 759.796651  , 737.43165033, 693.564656  ,
       659.591339  , 653.374349  , 663.07635533, 687.09101333,
       712.381673  , 722.55934667, 725.015686  , 740.60235567,
       763.54634533, 787.32899933, 792.15767367])

>>> def xymean(series):
...     return series.mul(multiplier).mean()

>>> pd.Series(PC_list).rolling(number).apply(xymean).dropna().to_numpy()[:-1]

array([830.200663  , 813.58131933, 803.02233867, 804.33401467,
       836.83266167, 847.78297967, 838.97432467, 819.875987  ,
       806.32798267, 778.33598867, 760.09433967, 762.98234   ,
       766.73332667, 759.796651  , 737.43165033, 693.564656  ,
       659.591339  , 653.374349  , 663.07635533, 687.09101333,
       712.381673  , 722.55934667, 725.015686  , 740.60235567,
       763.54634533, 787.32899933, 792.15767367])
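
If the per-window overhead of apply matters, two variations may be worth trying. This is a minimal sketch, not benchmarked here. The function names xy_mean_raw and xy_mean_shifted are hypothetical, and only the first five values of PC_list are used. The first variant passes raw=True so each window arrives as a NumPy array instead of a Series. The second avoids apply entirely by summing weighted, shifted copies of the Series.

```python
import numpy as np
import pandas as pd

def xy_mean_raw(values, number):
    """rolling.apply with raw=True: each window is passed as a NumPy array."""
    weights = np.arange(1, number + 1)
    rolled = pd.Series(values).rolling(number).apply(
        lambda window: np.dot(window, weights) / number, raw=True
    )
    return rolled.dropna().to_numpy()

def xy_mean_shifted(values, number):
    """Weighted rolling mean built from shifted copies of the Series (no apply)."""
    s = pd.Series(values, dtype="float64")
    total = pd.Series(0.0, index=s.index)
    for k in range(number):
        # s.shift(number - 1 - k) lines up the k-th element of each window
        # with the window's last row; that element gets weight k + 1.
        total = total + (k + 1) * s.shift(number - 1 - k)
    # The first number - 1 rows are NaN (incomplete windows) and are dropped.
    return (total / number).dropna().to_numpy()

sample = np.array([457.334015, 424.440002, 394.795990, 408.903992, 398.821014])
raw_result = xy_mean_raw(sample, 3)
shifted_result = xy_mean_shifted(sample, 3)
print(raw_result)
print(shifted_result)
```

Both functions return one value per full window, so append [:-1] to match the truncated arrays shown above.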


Source: https://stackoverflow.com/questions/65866920/implementing-pandas-function-to-numpy-functions
