Python and lmfit: How to fit multiple datasets with shared parameters?

后端 未结 2 1212
执念已碎
执念已碎 2020-11-29 05:09

I would like to use the lmfit module to fit a function to a variable number of data-sets, with some shared and some individual parameters.

Here is an example generat

相关标签:
2条回答
  • 2020-11-29 05:19

    I think you're most of the way there. You need to put the data sets into an array or structure that can be used in a single, global objective function that you give to minimize() and fits all data sets with a single set of Parameters for all the data sets. You can share this set among data sets as you like. Expanding on your example a bit, the code below does work to do a single fit to the 5 different Gaussian functions. For an example of tying parameters across data sets, I used nearly identical value for sigma the 5 datasets the same value. I created 5 different sigma Parameters ('sig_1', 'sig_2', ..., 'sig_5'), but then forced these to have the same values using a mathematical constraint. Thus there are 11 variables in the problem, not 15.

    import numpy as np
    import matplotlib.pyplot as plt
    from lmfit import minimize, Parameters, report_fit
    
    def gauss(x, amp, cen, sigma):
        "basic gaussian"
        return amp*np.exp(-(x-cen)**2/(2.*sigma**2))
    
    def gauss_dataset(params, i, x):
        """calc gaussian from params for data set i
        using simple, hardwired naming convention"""
        amp = params['amp_%i' % (i+1)].value
        cen = params['cen_%i' % (i+1)].value
        sig = params['sig_%i' % (i+1)].value
        return gauss(x, amp, cen, sig)
    
    def objective(params, x, data):
        """ calculate total residual for fits to several data sets held
        in a 2-D array, and modeled by Gaussian functions"""
        ndata, nx = data.shape
        resid = 0.0*data[:]
        # make residual per data set
        for i in range(ndata):
            resid[i, :] = data[i, :] - gauss_dataset(params, i, x)
        # now flatten this to a 1D array, as minimize() needs
        return resid.flatten()
    
    # create 5 datasets
    x  = np.linspace( -1, 2, 151)
    data = []
    for i in np.arange(5):
        params = Parameters()
        amp   =  0.60 + 9.50*np.random.rand()
        cen   = -0.20 + 1.20*np.random.rand()
        sig   =  0.25 + 0.03*np.random.rand()
        dat   = gauss(x, amp, cen, sig) + np.random.normal(size=len(x), scale=0.1)
        data.append(dat)
    
    # data has shape (5, 151)
    data = np.array(data)
    assert(data.shape) == (5, 151)
    
    # create 5 sets of parameters, one per data set
    fit_params = Parameters()
    for iy, y in enumerate(data):
        fit_params.add( 'amp_%i' % (iy+1), value=0.5, min=0.0,  max=200)
        fit_params.add( 'cen_%i' % (iy+1), value=0.4, min=-2.0,  max=2.0)
        fit_params.add( 'sig_%i' % (iy+1), value=0.3, min=0.01, max=3.0)
    
    # but now constrain all values of sigma to have the same value
    # by assigning sig_2, sig_3, .. sig_5 to be equal to sig_1
    for iy in (2, 3, 4, 5):
        fit_params['sig_%i' % iy].expr='sig_1'
    
    # run the global fit to all the data sets
    result = minimize(objective, fit_params, args=(x, data))
    report_fit(result.fit_params)
    
    # plot the data sets and fits
    plt.figure()
    for i in range(5):
        y_fit = gauss_dataset(result.fit_params, i, x)
        plt.plot(x, data[i, :], 'o', x, y_fit, '-')
    
    plt.show()
    

    For what it's worth, I would consider holding the multiple data sets in a dictionary or list of DataSet class instead of a multi-dimensional array. Anyway, I hope this helps get you going onto what you really need to do.

    0 讨论(0)
  • 2020-11-29 05:36

    I've used simple approach: define a function firs n( = cargsnum) of arguments is common for all data set's other is individual {

    def likelihood_common(var, xlist, ylist, mlist, cargsnum):
        cvars = var[:cargsnum]
        iargnum = [model.func_code.co_argcount - 1 - cargsnum for model in mlist]
        argpos = [cargsnum,] + list(np.cumsum(iargnum[:-1]) + cargsnum)
        args = [list(cvars) + list(var[pos:pos+iarg]) for pos, iarg in zip(argpos, iargnum)]
        res = [likelihood(*arg) for arg in zip(args, xlist, ylist, mlist)]
        return np.sum(res)
    

    } here supposed that each data set have the same weight. The issue I faced in this approach is weary low computation speed and instability in case of large number of fitted parameters and data sets.

    0 讨论(0)
提交回复
热议问题