How can I remove sharp jumps in data?

别跟我提以往 2021-02-09 06:35

I have some skin temperature data (collected at 1Hz) which I intend to analyse.

However, the sensors were not always in contact with the skin. So I have a challenge of

2 answers
  •  我在风中等你
    2021-02-09 07:20

    Try the code below (I used a tangent function to generate data). I used the second order difference idea from Mad Physicist in the comments.

    import pandas as pd
    import numpy as np
    import matplotlib.pyplot as plt
    
    df = pd.DataFrame()
    df[0] = np.arange(0,10,0.005)
    df[1] = np.tan(df[0])
    
    # Absolute value of half the second-order finite difference,
    # |x[i] - (x[i-1] + x[i+1]) / 2|, which is large wherever the data jumps.
    df[2] = 0.5*(df[1].diff()+df[1].diff(periods=-1)).abs()
    
    df.loc[df[2] < .05, 1].plot() # keep only points with a small difference (drops the jumps)
    df[1].plot()                  # plot original data
    
    plt.show()
    

    A zoomed view of the output shows what got filtered. Wherever data was removed, Matplotlib draws a straight line from the start to the end of the gap.
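
    If the connecting lines across removed data are unwanted, one option is to mask the rejected points instead of dropping them. The sketch below is an assumption about how one might do that, not part of the original answer: it reuses the same synthetic data and the same 0.05 threshold, and uses `Series.where` to replace rejected values with NaN, which Matplotlib leaves as gaps.

```python
import numpy as np
import pandas as pd

# Same synthetic signal as in the answer: tan(x) sampled every 0.005.
df = pd.DataFrame()
df[0] = np.arange(0, 10, 0.005)
df[1] = np.tan(df[0])

# Same second-difference measure as in the answer.
df[2] = 0.5 * (df[1].diff() + df[1].diff(periods=-1)).abs()

# Keep values where the measure is small; everything else becomes NaN.
# The first and last rows have a NaN measure, so they are masked as well.
masked = df[1].where(df[2] < 0.05)

# Number of samples that were masked out.
n_removed = int(masked.isna().sum())

# masked.plot() would now show gaps at the jumps instead of straight lines.
```

    The masked series keeps the original length and index, which also makes it easier to line up with other columns later.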

    I believe your first question is answered by the .loc selection above.

    Your second question will take some experimentation with your dataset. The code above only filters out high-derivative data, so you will also need a threshold on the values themselves to remove zeroes or the like. Experiment with where to set the derivative cutoff; plotting a histogram of the derivative can give you a hint as to what to filter out.
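
    As a rough illustration of the histogram idea, the sketch below bins the difference measure and derives a candidate cutoff from it. The log-spaced bins, the bin count of 30, and the 0.95 quantile are arbitrary assumptions chosen for this synthetic data, not values from the original answer; real sensor data will need its own choices.

```python
import numpy as np
import pandas as pd

# Same synthetic signal and difference measure as in the answer.
df = pd.DataFrame()
df[0] = np.arange(0, 10, 0.005)
df[1] = np.tan(df[0])
df[2] = 0.5 * (df[1].diff() + df[1].diff(periods=-1)).abs()

# Drop the NaN end rows and keep strictly positive values so log bins work.
measure = df[2].dropna()
positive = measure[measure > 0]

# Log-spaced bin edges, because the measure spans many orders of magnitude.
bins = np.logspace(np.log10(positive.min()), np.log10(positive.max()), 30)
counts, edges = np.histogram(positive, bins=bins)

# Hypothetical rule of thumb: treat the top 5% of the measure as jumps.
candidate = measure.quantile(0.95)
kept = df.loc[df[2] < candidate, 1]
```

    Printing `counts` next to `edges`, or passing `positive` and `bins` to `plt.hist` with a log-scaled x axis, shows whether the jumps form a separate cluster at the high end. If they do, the gap between the clusters is a natural place for the cutoff.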

    Higher-order difference formulas can also be used, which may help with smoothing. This should help remove artifacts without having to trim around the cuts.

    Edit:

    A fourth-order finite difference can be applied using this:

    df[2] = (df[1].diff(periods=1)-df[1].diff(periods=-1))*8/12 - \
        (df[1].diff(periods=2)-df[1].diff(periods=-2))*1/12
    df[2] = df[2].abs()
    

    It's reasonable to think that it may help. The coefficients above, and those for higher orders, can be worked out by hand or looked up with the Finite Difference Coefficients Calculator.

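    Note: The second- and fourth-order expressions above are raw finite differences, not proper derivatives. The fourth-order central difference must be divided by the interval length (in this case 0.005) to get the actual first derivative. The first expression is half of a second difference, so it scales with the square of the interval length instead.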
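
    To check the scaling described in the note, the sketch below applies the same fourth-order stencil to a function with a known derivative and divides by the interval length. Using sin(x) in place of the tangent data is an assumption made purely so the result can be compared against cos(x).

```python
import numpy as np
import pandas as pd

h = 0.005  # sample spacing, as in the answer

df = pd.DataFrame()
df[0] = np.arange(0, 10, h)
df[1] = np.sin(df[0])  # assumption: sin(x), so the exact derivative is cos(x)

# Fourth-order central stencil from the answer (no absolute value here).
stencil = (df[1].diff(periods=1) - df[1].diff(periods=-1)) * 8 / 12 - \
    (df[1].diff(periods=2) - df[1].diff(periods=-2)) * 1 / 12

# Dividing by the spacing turns the raw difference into a first derivative.
derivative = stencil / h

# Largest deviation from the exact derivative; NaN end rows are skipped.
max_error = (derivative - np.cos(df[0])).abs().max()

# Two rows at each end lack the neighbours the stencil needs.
n_undefined = int(derivative.isna().sum())
```

    For thresholding alone the division is optional, since it only rescales the cutoff, but it makes the cutoff independent of the sampling rate.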
