Pandas reverse of diff()

小蘑菇 2021-01-11 18:33

I have calculated the differences between consecutive values in a series, but I cannot reverse / undifference them using diffinv():

ds_sqrt =         


        
3 Answers
  • 2021-01-11 18:44

    df.cumsum()

    Example:
    import pandas as pd

    data = {'a': [1, 6, 3, 9, 5], 'b': [13, 1, 2, 5, 23]}
    df = pd.DataFrame(data)
    
    df = 
        a   b
    0   1   13
    1   6   1
    2   3   2
    3   9   5
    4   5   23
    
    df.diff()
    
         a     b
    0  NaN   NaN
    1  5.0 -12.0
    2 -3.0   1.0
    3  6.0   3.0
    4 -4.0  18.0
    
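    # cumsum() is shown here on the original df; to undo diff(), apply it
    # to the differenced frame after restoring its first row
    df.cumsum()
    
        a   b
    0   1   13
    1   7   14
    2   10  16
    3   19  21
    4   24  44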
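
Note that the last call above applies cumsum() to the original frame, which gives running totals rather than the original values. To actually undo diff(), the cumulative sum has to run over the differenced frame with its first row put back. A minimal sketch, assuming the original first row is still available (the names `diffed` and `restored` are only illustrative):

```python
import pandas as pd

data = {'a': [1, 6, 3, 9, 5], 'b': [13, 1, 2, 5, 23]}
df = pd.DataFrame(data)

diffed = df.diff()                 # the first row is all NaN

# Put the original first row back, then accumulate the differences.
restored = diffed.copy()
restored.iloc[0] = df.iloc[0]
restored = restored.cumsum().astype(df.dtypes.to_dict())

print(restored)
```

The final astype() only converts the float result back to the original integer dtypes; for float data it can be dropped.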
    
  • 2021-01-11 18:48

    You can do this via numpy. Algorithm courtesy of @Divakar.

    Of course, you need to know the first item in your series for this to work.

    import numpy as np
    import pandas as pd

    df = pd.DataFrame({'A': np.random.randint(0, 10, 10)})
    df['B'] = df['A'].diff()
    
    x, x_diff = df['A'].iloc[0], df['B'].iloc[1:]
    df['C'] = np.r_[x, x_diff].cumsum().astype(int)
    
    #    A    B  C
    # 0  8  NaN  8
    # 1  5 -3.0  5
    # 2  4 -1.0  4
    # 3  3 -1.0  3
    # 4  9  6.0  9
    # 5  7 -2.0  7
    # 6  4 -3.0  4
    # 7  0 -4.0  0
    # 8  8  8.0  8
    # 9  1 -7.0  1
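
To illustrate why the first item is needed: the differences alone only determine the series up to a constant offset, and that offset is exactly the first value. A small sketch with made-up numbers (the variable names are illustrative):

```python
import pandas as pd

s = pd.Series([8, 5, 4, 3, 9])
d = s.diff()

# Without the first value, cumsum() recovers the series only relative to it.
relative = d.fillna(0).cumsum()

# The gap between the original and the relative series is constant.
offset = s - relative

# Adding the known first value back gives the original series.
restored = relative + s.iloc[0]
```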
    
  • 2021-01-11 18:56

    You can use diff_inv from pmdarima (see its docs).

    import numpy as np
    import pandas as pd

    # Generate a small random table
    np.random.seed(10)
    vals = np.random.randint(1, 10, 6)
    df_t = pd.DataFrame({"a": vals})

    # Create two columns with the lag-1 and lag-2 differences
    df_t['dif_1'] = df_t.a.diff(1)
    df_t['dif_2'] = df_t.a.diff(2)

    df_t
    
       a  dif_1  dif_2
    0  5    NaN    NaN
    1  1   -4.0    NaN
    2  2    1.0   -3.0
    3  1   -1.0    0.0
    4  2    1.0    0.0
    5  9    7.0    8.0
    

    Then create a function that returns an array of the undifferenced values.

    from pmdarima.utils import diff_inv

    def inv_diff(df_orig_column, df_diff_column, periods):
        # Build the input for diff_inv: the first n (n = periods) original
        # values followed by the diff values from position n onward
        value = np.array(df_orig_column[:periods].tolist() + df_diff_column[periods:].tolist())

        # Undo the differencing and drop the first `periods` entries of the
        # result, which are the initial values diff_inv prepends
        inv_diff_vals = diff_inv(value, periods, 1)[periods:]
        return inv_diff_vals
    

    Example of Use:

    # df_orig_column - column with the original values
    # df_diff_column - column with the differenced values
    # periods - the periods argument that was passed to diff()
    inv_diff(df_t.a, df_t.dif_2, 2)
    

    Output:

    array([5., 1., 2., 1., 2., 9.])
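
If pmdarima is not installed, the same lag-n inversion can be sketched in plain pandas. For diff(periods), the positions i, i + periods, i + 2*periods, ... form independent chains, so a cumulative sum within each chain undoes the differencing once the first `periods` original values are restored. The helper name `undo_diff` is hypothetical, not part of pandas or pmdarima:

```python
import numpy as np
import pandas as pd

def undo_diff(first_values, diffed, periods=1):
    """Invert Series.diff(periods), given the first `periods` original values."""
    filled = diffed.copy()
    # Replace the leading NaNs with the known original values.
    filled.iloc[:periods] = np.asarray(first_values, dtype=float)
    # Label each position with the chain it belongs to (position mod periods).
    chain = np.arange(len(filled)) % periods
    # Accumulate the differences separately within each chain.
    return filled.groupby(chain).cumsum()

a = pd.Series([5, 1, 2, 1, 2, 9])
restored = undo_diff(a.iloc[:2], a.diff(2), periods=2)
print(restored.tolist())
```

With periods=1 there is a single chain, so this reduces to the plain cumsum() approach from the other answers.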
    