I am working with Python / NumPy. As input data I have a large number of value pairs (x, y). I basically want to plot the binned average of y against x, i.e., the mean value of y in each bin of x.
You are complicating things unnecessarily. All you need to know is, for every bin in x, what are n, sy and sy2: the number of y values in that x bin, the sum of those y values, and the sum of their squares. You can get those as:
>>> n, _ = np.histogram(x, bins=xbins)
>>> sy, _ = np.histogram(x, bins=xbins, weights=y)
>>> sy2, _ = np.histogram(x, bins=xbins, weights=y*y)
From those:
>>> mean = sy / n
>>> std = np.sqrt(sy2/n - mean*mean)
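The standard deviation line relies on the identity Var(y) = E[y^2] - E[y]^2, so it gives the population standard deviation (ddof=0) in each bin. Below is a minimal, self-contained sketch of the same three-histogram approach on synthetic data. The helper name binned_mean_std and the test data are made up for illustration, and the handling of empty bins (NaN) and of slightly negative variances from rounding is an addition, not part of the snippet above.

```python
import numpy as np

def binned_mean_std(x, y, bins):
    """Per-bin mean and population std of y, binned by x (hypothetical helper)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # First call fixes the bin edges; reuse them so all three histograms agree
    n, edges = np.histogram(x, bins=bins)
    sy, _ = np.histogram(x, bins=edges, weights=y)
    sy2, _ = np.histogram(x, bins=edges, weights=y * y)
    # Empty bins give 0/0; silence the warning and let the result be NaN
    with np.errstate(divide="ignore", invalid="ignore"):
        mean = sy / n
        var = sy2 / n - mean * mean
    # Rounding can push a tiny variance slightly below zero; clip before sqrt
    std = np.sqrt(np.clip(var, 0.0, None))
    return edges, n, mean, std

# Synthetic data, assumed purely for the demonstration
rng = np.random.default_rng(0)
x = rng.uniform(0.0, 10.0, 10_000)
y = 2.0 * x + rng.normal(0.0, 1.0, x.size)

edges, n, mean, std = binned_mean_std(x, y, bins=20)
centers = 0.5 * (edges[:-1] + edges[1:])  # x positions for plotting
```

The arrays centers, mean and std can then be handed to a plotting call such as matplotlib's errorbar(centers, mean, yerr=std). Note that the sum-of-squares formula can lose precision when the spread within a bin is tiny compared to the mean, so treat it as a fast approximation in that regime.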
If you can use pandas:
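import numpy as np
import pandas as pd
# Here xbins is the number of bins, so build the bin edges explicitly
xedges = np.linspace(x.min(), x.max(), xbins + 1)
# labels=False returns integer bin codes; include_lowest=True keeps x.min() in the first bin
codes = pd.cut(x, xedges, labels=False, include_lowest=True)
# pd.groupby() and Categorical.labels are gone from current pandas; group the Series directly
g = pd.Series(y).groupby(codes)
mean2 = g.mean()
std2 = g.std(ddof=0)  # ddof=0 matches the population std from the histogram formula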
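One thing to keep in mind when comparing the two answers: np.histogram uses left-closed bins (with the last bin closed on both sides), while pd.cut uses right-closed bins by default, so a value sitting exactly on an interior edge can land in different bins. The sketch below is a rough cross-check on synthetic data, assuming a recent pandas; the names nbins, table and mean_np are illustrative. It also reindexes the result so that empty bins appear as NaN rows, which groupby would otherwise drop.

```python
import numpy as np
import pandas as pd

# Synthetic data, assumed purely for the demonstration
rng = np.random.default_rng(1)
x = rng.uniform(0.0, 5.0, 2_000)
y = np.sin(x) + rng.normal(0.0, 0.1, x.size)
nbins = 10  # assumed number of equal-width bins

edges = np.linspace(x.min(), x.max(), nbins + 1)

# pandas route: integer bin codes, then groupby for the per-bin statistics
codes = pd.cut(x, edges, labels=False, include_lowest=True)
grouped = pd.Series(y).groupby(codes)
table = grouped.agg(["count", "mean"])
table["std"] = grouped.std(ddof=0)
# Bins without data are missing from the groupby result; reindex to get NaN rows
table = table.reindex(range(nbins))

# NumPy route on the same edges, for comparison
n, _ = np.histogram(x, bins=edges)
sy, _ = np.histogram(x, bins=edges, weights=y)
mean_np = sy / n

# Agreement is expected here because no sample lies exactly on an interior edge
print(np.allclose(table["mean"].to_numpy(), mean_np))
```

For large inputs the histogram version is usually the lighter option, while the groupby version makes it easy to add further per-bin statistics such as the median.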