Mean values depending on binning with respect to second variable

北海茫月 2021-02-10 00:15

I am working with Python / NumPy. As input data I have a large number of value pairs (x, y). I basically want to plot ⟨y⟩(x), i.e., the mean value of y for each bin in x.

2 Answers
  • 2021-02-10 00:45

    You are complicating things unnecessarily. For every bin in x you only need three quantities: n, the number of y values in that bin; sy, the sum of those y values; and sy2, the sum of their squares. You can get them with np.histogram:

    >>> n, _ = np.histogram(x, bins=xbins)
    >>> sy, _ = np.histogram(x, bins=xbins, weights=y)
    >>> sy2, _ = np.histogram(x, bins=xbins, weights=y*y)
    

    From those:

    >>> mean = sy / n
    >>> std = np.sqrt(sy2/n - mean*mean)
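
As a minimal, self-contained sketch of the histogram approach above: the helper name `binned_mean_std` and the sample data are made up for illustration, and the handling of empty bins (NaN instead of a division-by-zero warning) is an addition that the answer itself does not cover. Note that the formula yields the population standard deviation (ddof=0).

```python
import numpy as np

def binned_mean_std(x, y, edges):
    """Per-bin count, mean and population std of y, binned by x (hypothetical helper)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Three histograms over the same edges: counts, sum of y, sum of y squared.
    n, _ = np.histogram(x, bins=edges)
    sy, _ = np.histogram(x, bins=edges, weights=y)
    sy2, _ = np.histogram(x, bins=edges, weights=y * y)
    # Leave NaN in empty bins instead of dividing by zero.
    mean = np.full(n.shape, np.nan)
    std = np.full(n.shape, np.nan)
    filled = n > 0
    mean[filled] = sy[filled] / n[filled]
    var = sy2[filled] / n[filled] - mean[filled] ** 2
    # Clip tiny negative variances caused by floating-point rounding.
    std[filled] = np.sqrt(np.clip(var, 0.0, None))
    return n, mean, std

# Made-up sample data; the third bin [2, 3) is deliberately empty.
x = np.array([0.1, 0.2, 0.3, 1.5, 1.7, 3.9])
y = np.array([1.0, 2.0, 3.0, 10.0, 20.0, 5.0])
edges = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
n, mean, std = binned_mean_std(x, y, edges)
```

The resulting `mean` can be plotted against the bin centers, `0.5 * (edges[:-1] + edges[1:])`. Because the variance is computed as a difference of two sums, it can lose precision when the spread of y is small compared with its mean.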
    
  • 2021-02-10 01:02

    If you can use pandas:

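    import numpy as np
    import pandas as pd

    # Here xbins is the number of bins and x, y are NumPy arrays.
    xedges = np.linspace(x.min(), x.max(), xbins + 1)
    # Widen the outer edges slightly so the extreme x values fall inside a bin.
    xedges[0] -= 0.00001
    xedges[-1] += 0.000001
    c = pd.cut(x, xedges)              # Categorical: one interval per x value
    g = pd.Series(y).groupby(c.codes)  # group y by integer bin index
    mean2 = g.mean()
    std2 = g.std(ddof=0)               # population standard deviation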
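
One thing to watch with the groupby approach is that bins containing no points simply do not appear in the result. The sketch below, using made-up sample data and hypothetical variable names, reindexes the output so that every bin has a row. It also uses `include_lowest=True` as an alternative to nudging the outer edges by hand; that is a swapped-in choice, not part of the answer above.

```python
import numpy as np
import pandas as pd

# Made-up sample data; the third bin (2, 3] is deliberately empty.
x = np.array([0.1, 0.2, 0.3, 1.5, 1.7, 3.9])
y = np.array([1.0, 2.0, 3.0, 10.0, 20.0, 5.0])
edges = np.array([0.0, 1.0, 2.0, 3.0, 4.0])

# Integer bin index for every x value; -1 marks values outside the edges.
codes = pd.cut(x, edges, include_lowest=True).codes
g = pd.Series(y).groupby(codes)

stats = pd.DataFrame({
    "n": g.count(),
    "mean": g.mean(),
    "std": g.std(ddof=0),  # population standard deviation
})
# One row per bin: empty bins become NaN, out-of-range points (code -1) are dropped.
stats = stats.reindex(range(len(edges) - 1))
stats["n"] = stats["n"].fillna(0).astype(int)
```

After the reindex, `stats` has one row per bin in edge order, so its columns line up directly with the bin centers for plotting.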
    