Question
I have a set of measurement curves (represented as an array of x and an array of y values). I need to calculate the average curve of this set. The curves in the set may vary both in number of sampling points (x-values) and position of the sampling points.
I first interpolate every curve linearly using SciPy's interp1d. I then determine the range of x-values where all curves overlap, so that every interpolated function is defined there. Finally, I need to calculate the mean, and this is where I am stuck.
Answer 1:
I'm afraid your question is more conceptual than coding-related. However, the following example should help:
import numpy as np
from scipy.interpolate import interp1d
import matplotlib.pyplot as plt
# make up three datasets for testing
x1 = np.linspace(0, 10, num=11, endpoint=True)
x2 = np.linspace(0, 10, num=13, endpoint=True)
x3 = np.linspace(0, 10, num=23, endpoint=True)
y1 = np.cos(-x1**2/9.0) + 0.2*np.random.rand((len(x1)))
y2 = np.cos(-x2**2/9.0) + 0.2*np.random.rand((len(x2)))
y3 = np.cos(-x3**2/9.0) + 0.2*np.random.rand((len(x3)))
# interpolate data
f1 = interp1d(x1, y1,'cubic')
f2 = interp1d(x2, y2,'cubic')
f3 = interp1d(x3, y3,'cubic')
# define common carrier for calculation of average curve
x_all = np.linspace(0, 10, num=101, endpoint=True)
# evaluation of fits on common carrier
f1_int = f1(x_all)
f2_int = f2(x_all)
f3_int = f3(x_all)
# put all fits to one matrix for fast mean calculation
data_collection = np.vstack((f1_int,f2_int,f3_int))
# calculating mean value
f_avg = np.average(data_collection, axis=0)
# plot this example
plt.figure()
plt.plot(x1,y1,'ro',label='row1')
plt.plot(x2,y2,'bo',label='row2')
plt.plot(x3,y3,'go',label='row3')
plt.plot(x_all,f1_int,'r-',label='fit1')
plt.plot(x_all,f2_int,'b-',label='fit2')
plt.plot(x_all,f3_int,'g-',label='fit3')
plt.plot(x_all, f_avg,'k--',label='fit average')
plt.legend(loc=3)
plt.show()
The most important lines are those using np.vstack to combine the measurements and np.average to take the average of the measurements. The rest is just to have a working example!
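If the curves do not all cover the same x-range, as described in the question, the common carrier can be restricted to the region where they overlap. The following is only a minimal sketch under some assumptions: the helper name mean_curve and the toy arrays are made up for illustration, every x array is assumed to be sorted in ascending order, and np.interp is used for the linear interpolation instead of interp1d.

```python
import numpy as np

def mean_curve(curves, num=101):
    """Average several (x, y) curves over the x-range they all share.

    `curves` is a list of (x, y) array pairs (hypothetical helper);
    each x is assumed to be sorted in ascending order.
    """
    # the overlap runs from the largest start to the smallest end
    x_min = max(x[0] for x, _ in curves)
    x_max = min(x[-1] for x, _ in curves)
    if x_min >= x_max:
        raise ValueError("curves do not share a common x-range")
    # common carrier inside the overlap
    x_common = np.linspace(x_min, x_max, num=num)
    # linear interpolation of every curve onto the common carrier
    resampled = np.vstack([np.interp(x_common, x, y) for x, y in curves])
    return x_common, resampled.mean(axis=0)

# toy data: two straight lines sampled at different positions
xa = np.array([0.0, 1.0, 2.0, 3.0])
ya = 2.0 * xa
xb = np.array([1.0, 1.5, 4.0])
yb = 2.0 * xb + 2.0
x_common, y_mean = mean_curve([(xa, ya), (xb, yb)], num=5)
```

Here the overlap is the interval from 1 to 3, and because both toy curves are straight lines, the averaged curve is the line halfway between them.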
EDIT: For a non-equidistant carrier (common x grid) for the fits, you can do, for example, the following:
# define common carrier for calculation of average curve
x_all_1 = np.linspace(0, 1, num=101, endpoint=True)
x_all_2 = np.linspace(1, 10, num=21, endpoint=True)
x_all = np.concatenate((x_all_1, x_all_2))
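Note that the two segments both contain x = 1, so the concatenated carrier holds that point twice. This does no harm when evaluating the fits, but if a strictly increasing carrier is preferred, one possible variant (a sketch, not part of the original example) is to pass the result through np.unique:

```python
import numpy as np

# dense sampling on [0, 1], coarse sampling on [1, 10]
x_all_1 = np.linspace(0, 1, num=101, endpoint=True)
x_all_2 = np.linspace(1, 10, num=21, endpoint=True)
# np.unique sorts the values and drops the repeated point at x = 1
x_all = np.unique(np.concatenate((x_all_1, x_all_2)))
```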
Source: https://stackoverflow.com/questions/33625236/calculate-mean-over-discrete-functions-with-different-amount-of-sampling-points