Python - Clipping out data to fit profiles

喜欢而已 submitted on 2020-01-17 07:53:34

Question


I have several sets of data to which I'm trying to fit different profiles. In the centre of one of the minima there is contamination that prevents me from getting a good fit, as can be seen in the image attached to the original post.

How can I clip out those spikes at the bottom of my data, given that the spike is not always in the same position? Or how would you deal with data like this? I'm using lmfit to fit the profiles, in this case a Lorentzian and a Gaussian. Here is a minimal working example in which I have tuned the initial values to fit the data more closely:

import numpy as np
import matplotlib.pyplot as plt
from lmfit import Model
from lmfit.models import GaussianModel, ConstantModel, LorentzianModel

x = np.array([4085.18084467,  4085.38084374,  4085.5808428 , 4085.78084186, 4085.98084092,  4086.18083999,  4086.38083905,  4086.58083811, 4086.78083717,  4086.98083623,  4087.1808353 ,  4087.38083436, 4087.58083342,  4087.78083248,  4087.98083155,  4088.18083061, 4088.38082967,  4088.58082873,  4088.78082779,  4088.98082686, 4089.18082592,  4089.38082498,  4089.58082404,  4089.78082311, 4089.98082217,  4090.18082123,  4090.38082029,  4090.58081935, 4090.78081842,  4090.98081748,  4091.18081654,  4091.3808156 , 4091.58081466,  4091.78081373,  4091.98081279,  4092.18081185, 4092.38081091,  4092.58080998,  4092.78080904,  4092.9808081 , 4093.18080716,  4093.38080622,  4093.58080529,  4093.78080435, 4093.98080341,  4094.18080247,  4094.38080154,  4094.5808006 , 4094.78079966,  4094.98079872,  4095.18079778,  4095.38079685, 4095.58079591,  4095.78079497,  4095.98079403,  4096.1807931 , 4096.38079216,  4096.58079122,  4096.78079028,  4096.98078934, 4097.18078841,  4097.38078747,  4097.58078653,  4097.78078559,4097.98078466,  4098.18078372,  4098.38078278,  4098.58078184, 4098.7807809 ,  4098.98077997,  4099.18077903,  4099.38077809, 4099.58077715,  4099.78077622,  4099.98077528,  4100.18077434, 4100.3807734 ,  4100.58077246,  4100.78077153,  4100.98077059, 4101.18076965,  4101.38076871,  4101.58076778,  4101.78076684, 4101.9807659 ,  4102.18076496,  4102.38076402,  4102.58076309, 4102.78076215,  4102.98076121,  4103.18076027,  4103.38075934, 4103.5807584 ,  4103.78075746,  4103.98075652,  4104.18075558, 4104.38075465,  4104.58075371,  4104.78075277,  4104.98075183, 4105.1807509 ,  4105.38074996,  4105.58074902,  4105.78074808, 4105.98074714,  4106.18074621,  4106.38074527,  4106.58074433, 4106.78074339,  4106.98074246,  4107.18074152,  4107.38074058, 4107.58073964,  4107.7807387 ,  4107.98073777,  4108.18073683, 4108.38073589,  4108.58073495,  4108.78073401,  4108.98073308, 4109.18073214,  4109.3807312 ,  4109.58073026,  4109.78072933, 4109.98072839,  4110.18072745,  
4110.38072651,  4110.58072557, 4110.78072464,  4110.9807237 ,  4111.18072276,  4111.38072182, 4111.58072089,  4111.78071995,  4111.98071901,  4112.18071807, 4112.38071713,  4112.5807162 ,  4112.78071526,  4112.98071432, 4113.18071338,  4113.38071245,  4113.58071151,  4113.78071057, 4113.98070963,  4114.18070869,  4114.38070776,  4114.58070682, 4114.78070588,  4114.98070494,  4115.18070401,  4115.38070307, 4115.58070213,  4115.78070119,  4115.98070025,  4116.18069932, 4116.38069838,  4116.58069744,  4116.7806965 ,  4116.98069557, 4117.18069463,  4117.38069369,  4117.58069275,  4117.78069181, 4117.98069088,  4118.18068994,  4118.380689  ,  4118.58068806, 4118.78068713,  4118.98068619,  4119.18068525,  4119.38068431, 4119.58068337,  4119.78068244,  4119.9806815 ,  4120.18068056, 4120.38067962,  4120.58067869,  4120.78067775,  4120.98067681, 4121.18067587,  4121.38067493,  4121.580674  ,  4121.78067306, 4121.98067212,  4122.18067118,  4122.38067025,  4122.58066931, 4122.78066837,  4122.98066743,  4123.18066649,  4123.38066556, 4123.58066462,  4123.78066368,  4123.98066274,  4124.1806618 , 4124.38066087,  4124.58065993,  4124.78065899,  4124.98065805, 4125.18065712,  4125.38065618,  4125.58065524,  4125.7806543 , 4125.98065336,  4126.18065243,  4126.38065149,  4126.58065055, 4126.78064961,  4126.98064868,  4127.18064774,  4127.3806468 , 4127.58064586,  4127.78064492,  4127.98064399,  4128.18064305, 4128.38064211,  4128.58064117,  4128.78064024,  4128.9806393 , 4129.18063836,  4129.38063742,  4129.58063648,  4129.78063555, 4129.98063461,  4130.18063367,  4130.38063273,  4130.5806318 , 4130.78063086,  4130.98062992,  4131.18062898,  4131.38062804, 4131.58062711,  4131.78062617,  4131.98062523,  4132.18062429, 4132.38062336,  4132.58062242,  4132.78062148,  4132.98062054, 4133.1806196 ,  4133.38061867,  4133.58061773,  4133.78061679, 4133.98061585,  4134.18061492,  4134.38061398,  4134.58061304, 4134.7806121 ,  4134.98061116])
y = np.array([0.90312759,  1.00923175,  0.94618369,  0.98284045,  0.91510612,        0.96737804,  0.97690214,  0.94363369,  1.00887784,  1.00110387,        0.91647096,  0.97943202,  1.00672907,  1.01552094,  1.01089407,        0.96914584,  0.9908419 ,  1.0176613 ,  0.97032148,  0.96003562,        0.9702355 ,  0.93684173,  0.94652734,  0.94895018,  1.01214356,        0.85777678,  0.89308203,  0.9789272 ,  0.93901884,  0.9684622 ,        0.96969321,  0.86326307,  0.89607392,  0.92459571,  1.00454429,        1.06019733,  0.97291196,  0.95646497,  0.95899707,  1.02830351,        0.94938178,  0.91481128,  0.92606219,  0.97085631,  0.93597434,        0.91316857,  0.90644542,  0.91726926,  0.91686184,  0.96445563,        0.92166362,  0.95831572,  0.93859066,  0.85285273,  0.89944073,        0.91812428,  0.94265677,  0.88281406,  0.9470601 ,  0.94921529,        0.97289222,  0.94632251,  0.96633195,  0.94096512,  0.95324803,        0.90920845,  0.92100257,  0.91181745,  0.95715298,  0.91715382,        0.90219214,  0.87585035,  0.86592191,  0.89335902,  0.85536392,        0.89619274,  0.9450366 ,  0.82780137,  0.81214176,  0.83461329,        0.82858317,  0.80851704,  0.79253546,  0.85440086,  0.81679169,        0.80579976,  0.72312218,  0.75583125,  0.75204599,  0.84519188,        0.68686821,  0.71472154,  0.71706318,  0.72640234,  0.70526356,        0.68295282,  0.66795774,  0.65004383,  0.68096834,  0.72697547,        0.72436393,  0.77128385,  0.79666758,  0.67349101,  0.61479406,        0.57046337,  0.51614312,  0.52945366,  0.53112169,  0.53757761,        0.56680358,  0.63839684,  0.60704329,  0.62377533,  0.67862515,        0.64587581,  0.71316115,  0.76309798,  0.72217569,  0.7477785 ,        0.79731849,  0.76934137,  0.77063868,  0.77871584,  0.77688526,        0.84342722,  0.85382332,  0.88700466,  0.85837992,  0.79589266,        0.83798993,  0.79835529,  0.84612746,  0.83214907,  0.86373676,        0.90729115,  0.82111605,  0.86165685,  0.84090099,  0.90389133,      
  0.89554032,  0.90792356,  0.92798016,  0.95588479,  0.95019718,        0.95447497,  0.89845759,  0.91638311,  0.99263342,  0.97477606,        0.95482538,  0.94489498,  0.94344967,  0.90526465,  0.92538486,        0.96279787,  0.94005143,  0.96842454,  0.92296494,  0.89954172,        0.8684367 ,  0.95039002,  0.95229769,  0.93752274,  0.94741173,        0.96704449,  1.01130839,  0.95499414,  0.99596569,  0.95130622,        1.00014723,  1.00252218,  0.95130331,  1.0022896 ,  0.99851989,        0.94405282,  0.95814021,  0.94851972,  1.01302067,  1.01400272,        0.97960083,  0.97070283,  1.01312797,  0.9842154 ,  1.01147273,       0.97331853,  0.91403182,  0.96813051,  0.92319169,  0.9294103 ,        0.96960715,  0.94811518,  0.97115083,  0.84687543,  0.90725159,        0.88061293,  0.87319615,  0.85331661,  0.89775082,  0.90956716,        0.83174505,  0.89753388,  0.89554364,  0.95329739,  0.87687031,        0.93883127,  0.97433899,  0.99515225,  0.97519981,  0.91956466,        0.97977674,  0.93582089,  1.00662722,  0.90157277,  1.02887754,        0.9777419 ,  0.94257094,  1.02359615,  0.98968414,  1.00075502,        1.03230265,  1.05904074,  1.00488442,  1.05507886,  1.05085518,        1.02561781,  1.05896008,  0.98024381,  1.08005691,  0.94528977,        1.03853637,  1.02064405,  1.0467137 ,  1.05375156,  1.12907949,        0.99295611,  1.06601022,  1.02846374,  0.98006807,  0.96446772,        0.97702428,  0.97788589,  0.93889781,  0.96366778,  0.96645265,        0.95857242,  1.05796304,  0.99441763,  1.00573183,  1.05001927])
e = np.array([0.0647344 ,  0.04583914,  0.05665552,  0.04447208,  0.05644753,        0.03968611,  0.05985188,  0.04252311,  0.03366922,  0.04237672,        0.03765898,  0.03290132,  0.04626836,  0.05106203,  0.03619188,        0.03944098,  0.08115469,  0.05859644,  0.06091101,  0.05170821,        0.0427244 ,  0.06804469,  0.06708318,  0.03369381,  0.04160575,        0.08007032,  0.09292148,  0.04378329,  0.08216214,  0.06087074,        0.05375458,  0.06185891,  0.06385766,  0.08084546,  0.04864063,        0.06400878,  0.04988693,  0.06689165,  0.05989534,  0.08010138,        0.0681177 ,  0.04478208,  0.03876582,  0.05977015,  0.06610619,        0.05020086,  0.07244604,  0.0445143 ,  0.06970626,  0.04423994,        0.0414573 ,  0.06892836,  0.05715395,  0.04014724,  0.07908425,        0.06082051,  0.08380691,  0.08576757,  0.06571406,  0.04842625,        0.05298355,  0.05271857,  0.06340425,  0.10849621,  0.0811072 ,        0.03642638,  0.10614094,  0.09865099,  0.06711037,  0.10244762,        0.11843505,  0.1092357 ,  0.09748241,  0.09657009,  0.09970179,        0.10203563,  0.18494082,  0.14097796,  0.1151294 ,  0.16172895,        0.17611204,  0.16226913,  0.2295418 ,  0.17795924,  0.1253298 ,        0.1771586 ,  0.15139061,  0.14739618,  0.1620105 ,  0.19158538,        0.21431605,  0.19292715,  0.23308884,  0.30519423,  0.31401994,        0.30569885,  0.31216375,  0.35147676,  0.25016472,  0.16232236,        0.09058787,  0.0604483 ,  0.05168302,  0.21432774,  0.38149791,        0.5061975 ,  0.44281541,  0.50646427,  0.43761581,  0.44989111,        0.47778238,  0.39944325,  0.32462726,  0.34560857,  0.3175776 ,        0.30253441,  0.23059451,  0.24516185,  0.20708065,  0.26429751,        0.1830661 ,  0.15155041,  0.16497299,  0.15794139,  0.13626666,        0.17839823,  0.13502886,  0.14148522,  0.10869864,  0.11723602,        0.09074029,  0.06922157,  0.07719777,  0.13181317,  0.11441895,        0.10655855,  0.12073767,  0.0846133 ,  0.07974657,  0.06538693,      
  0.0573741 ,  0.07864047,  0.08351471,  0.08130351,  0.0768824 ,        0.07951992,  0.04478989,  0.0765122 ,  0.04842814,  0.04355571,        0.05138656,  0.07215294,  0.04681987,  0.05790133,  0.06163808,        0.082449  ,  0.06127927,  0.04971221,  0.05107901,  0.04493687,        0.06072161,  0.06094332,  0.03630467,  0.04162285,  0.04058228,        0.04526251,  0.06191432,  0.04901982,  0.0454908 ,  0.06186274,        0.0407017 ,  0.03865571,  0.04353665,  0.03898987,  0.04666321,        0.05856035,  0.04225933,  0.04797901,  0.03523971,  0.04728414,        0.05494382,  0.04773011,  0.03210954,  0.05651663,  0.03625933,        0.03596701,  0.03800191,  0.06267668,  0.06431192,  0.0602614 ,        0.05139896,  0.04571979,  0.04375182,  0.0576867 ,  0.07491418,        0.05339972,  0.07619115,  0.11569378,  0.07087871,  0.09076518,        0.13554717,  0.07811761,  0.07180695,  0.05831886,  0.06042863,        0.08759576,  0.06650081,  0.08420164,  0.08185432,  0.04338836,        0.04970979,  0.04008252,  0.03605485,  0.03456321,  0.05594584,        0.03856822,  0.03576337,  0.03118799,  0.0441686 ,  0.0469118 ,        0.03591666,  0.03562582,  0.04934832,  0.03280972,  0.03201576,        0.04338048,  0.07443531,  0.04121059,  0.03774147,  0.03717577,        0.03354207,  0.03806978,  0.0319364 ,  0.03715712,  0.0379478 ,        0.04867626,  0.0304592 ,  0.03393844,  0.034518  ,  0.04293514,        0.05177898,  0.05332907,  0.0352937 ,  0.03359781,  0.04625272,        0.03733088,  0.03501259,  0.03346308,  0.04333749,  0.05741173])

# constant continuum level
cont = ConstantModel(prefix='cte_')
pars = cont.guess(y, x=x)

# Gaussian component for the shallower dip near 4125
gauss = GaussianModel(prefix='g_')
pars.update(gauss.make_params())
pars['cte_c'].set(1)
pars['g_center'].set(4125, min=4120, max=4130)
pars['g_sigma'].set(1, min=0.5)
pars['g_amplitude'].set(-0.2, min=-0.5)

# Lorentzian component for the deep minimum near 4106
loren = LorentzianModel(prefix='l_')
pars.update(loren.make_params())
pars['l_center'].set(4106, min=4095, max=4115)
pars['l_sigma'].set(4, max=6)
pars['l_amplitude'].set(-6., max=-4.)

model = gauss + loren + cont

# evaluate the initial guess, then fit weighting each point by 1/error
init = model.eval(pars, x=x)
result = model.fit(y, pars, x=x, weights=1/e)

#print(result.fit_report(min_correl=0.5))

fig, ax = plt.subplots(figsize=(8,6))

ax.plot(x, y, 'k-', lw=2) # data in black
ax.plot(x, init, 'g--', lw=2) # initial guess
ax.plot(x, result.best_fit, 'r-', lw=2) # best fit
ax.set(xlim=(4085,4135), ylim=(0.4,1.14))
plt.show()

Answer 1:


If the bad point is always at the same x value, you could remove that point from the data, perhaps with something like:

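import numpy as np

def index_nearest(array, value):
    """Return the index of the element of array nearest to value."""
    return np.abs(array - value).argmin()

# x position of the bad point. 4105.4 is roughly where the spike sits in the
# example data (an assumption; use whatever position applies to your data).
ybad = index_nearest(x, 4105.4)
y[ybad] = np.nan
good = np.isfinite(y)  # True for every point that is kept
x = x[good]
y = y[good]
e = e[good]  # keep the uncertainties aligned with x and y for weights=1/e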

and then fit your model to those data with the bad point removed.
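If the contamination covers several neighbouring points rather than one, the same idea can be applied to a small window. The following is only a sketch on synthetic data: `window_mask`, the spike position 4105.4 and the half-width 0.7 are illustrative assumptions, not values taken from the question.

```python
import numpy as np

def window_mask(x, center, half_width):
    """Boolean mask: False inside center +/- half_width, True elsewhere."""
    return np.abs(x - center) > half_width

# Synthetic stand-in for the question's x, y, e arrays (made-up values).
x = np.linspace(4085.0, 4135.0, 251)
y = 1.0 - 0.4 * np.exp(-0.5 * ((x - 4106.0) / 4.0) ** 2)
e = np.full_like(x, 0.05)
y[np.abs(x - 4105.4) < 0.5] += 0.15  # fake contamination in the line core

# Drop every point within 0.7 of the assumed spike position.
keep = window_mask(x, center=4105.4, half_width=0.7)
x_fit, y_fit, e_fit = x[keep], y[keep], e[keep]

print(x.size, "points before,", x_fit.size, "points after masking")
```

The masked arrays can then be passed to the fit in place of the originals, for example `model.fit(y_fit, pars, x=x_fit, weights=1/e_fit)`, so that the data, the x values and the weights all keep the same length.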

But, also: if there is no obviously errant point and the data are "just" noisy, there is probably no advantage to removing what look like bad points. Your data look noisy to me, but it's hard to see a systematically bad point. If you are going to remove a point, remember that you are asserting that this measurement was not merely affected by normal noise, but was wrong.
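One way to make that judgement less subjective, and to cope with a spike that moves around, might be to compare each point with a running median and flag only those that sit well above the typical scatter. This is a rough sketch on synthetic data, not something from the question: `flag_spikes`, the window size of 15 and the 5-sigma threshold are all assumptions you would need to tune.

```python
import numpy as np
from scipy.ndimage import median_filter

def flag_spikes(y, size=15, n_sigma=5.0):
    """True where y lies more than n_sigma robust scatters above a running median."""
    baseline = median_filter(y, size=size, mode='nearest')
    resid = y - baseline
    # Robust scatter from the median absolute deviation (MAD);
    # 1.4826 converts a MAD to a standard deviation for Gaussian noise.
    mad = np.median(np.abs(resid - np.median(resid)))
    return resid > n_sigma * 1.4826 * mad

# Synthetic absorption line with noise and a narrow upward spike (made-up values).
rng = np.random.default_rng(0)
x = np.linspace(4085.0, 4135.0, 251)
y = 1.0 - 0.4 * np.exp(-0.5 * ((x - 4106.0) / 4.0) ** 2)
y += rng.normal(0.0, 0.01, x.size)
spike = np.abs(x - 4105.4) < 0.5
y[spike] += 0.15

flagged = flag_spikes(y)
x_clean, y_clean = x[~flagged], y[~flagged]
print("flagged x values:", x[flagged])
```

The running-median window has to be wider than about twice the spike for the baseline to ignore it, and only upward deviations are flagged here because the contamination in question fills in the bottom of an absorption feature. If nothing is flagged at a sensible threshold, that supports treating the data as simply noisy.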

Finally: another approach to treating noisy data might be to try to smooth the data, say with a Savitzky-Golay filter. There is always some danger of smoothing out features with such an approach, but a modest S-G filter is often good for cleaning up noisy data enough to detect features. Of course, if fits to filtered data give significantly different results from fits to unfiltered data, you will probably need to understand why that is.
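As an illustration of the smoothing idea, here is a minimal sketch using `scipy.signal.savgol_filter` on synthetic data. The window length of 11 points and the cubic polynomial are arbitrary choices for this example, and the line shape and noise level are made up.

```python
import numpy as np
from scipy.signal import savgol_filter

# Synthetic noisy absorption line (illustrative values only).
rng = np.random.default_rng(1)
x = np.linspace(4085.0, 4135.0, 251)
truth = 1.0 - 0.4 * np.exp(-0.5 * ((x - 4106.0) / 4.0) ** 2)
y = truth + rng.normal(0.0, 0.03, x.size)

# Fit a cubic polynomial in a sliding 11-point window.
# window_length must be odd here and larger than polyorder.
y_smooth = savgol_filter(y, window_length=11, polyorder=3)

print("rms deviation from truth before:", np.std(y - truth))
print("rms deviation from truth after: ", np.std(y_smooth - truth))
```

A window much narrower than the feature you care about keeps the risk of smoothing it away small. Fitting both `y` and `y_smooth` and comparing the best-fit parameters is a simple way to check whether the filter is changing the answer.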



Source: https://stackoverflow.com/questions/46228031/python-clipping-out-data-to-fit-profiles
