Pandas reverse of diff()

前端未结


I have calculated the differences between consecutive values in a series, but I cannot reverse / undifference them using diffinv():

ds_sqrt =


                      
3 Answers
                
心在旅途  2021-01-11 18:44
                                                                       
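Use cumsum(). It is the inverse of diff(), provided you first put the original first row back in place of the NaN row that diff() leaves behind.

Example:
import pandas as pd

data = {'a': [1, 6, 3, 9, 5], 'b': [13, 1, 2, 5, 23]}
df = pd.DataFrame(data)

df =
   a   b
0  1  13
1  6   1
2  3   2
3  9   5
4  5  23

d = df.diff()

d =
     a     b
0  NaN   NaN
1  5.0 -12.0
2 -3.0   1.0
3  6.0   3.0
4 -4.0  18.0

# restore the original first row, then accumulate
d.iloc[0] = df.iloc[0]
d.cumsum()

     a     b
0  1.0  13.0
1  6.0   1.0
2  3.0   2.0
3  9.0   5.0
4  5.0  23.0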

                                                                        
                                                        
            
            
              
                
陌清茗  2021-01-11 18:48
                                                                       
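You can do this with NumPy (algorithm courtesy of @Divakar).

You need to know the first item of the original series for this to work.

import numpy as np
import pandas as pd

df = pd.DataFrame({'A': np.random.randint(0, 10, 10)})
df['B'] = df['A'].diff()

# first original value, plus the differences without the leading NaN
x, x_diff = df['A'].iloc[0], df['B'].iloc[1:]
# prepend the first value and accumulate; cast to int because diff() returns floats
df['C'] = np.r_[x, x_diff].cumsum().astype(int)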

#    A    B  C
# 0  8  NaN  8
# 1  5 -3.0  5
# 2  4 -1.0  4
# 3  3 -1.0  3
# 4  9  6.0  9
# 5  7 -2.0  7
# 6  4 -3.0  4
# 7  0 -4.0  0
# 8  8  8.0  8
# 9  1 -7.0  1
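
The same idea can be written with pandas alone. The following is only a minimal sketch: `undiff` is a hypothetical helper name, not a pandas function, and it assumes the differences were taken with the default `periods=1`.

```python
import numpy as np
import pandas as pd


def undiff(diffed: pd.Series, first_value) -> pd.Series:
    """Hypothetical helper: invert Series.diff() (periods=1) given the first original value."""
    restored = diffed.copy()
    restored.iloc[0] = first_value  # replace the leading NaN left by diff()
    return restored.cumsum()


original = pd.Series([8, 5, 4, 3, 9], name="A")
recovered = undiff(original.diff(), original.iloc[0])
print(recovered.tolist())
```

The result is a float series, because diff() produces floats; cast with `.astype(int)` if the original data was integer.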

                                                                        
                                                        
            
            
              
                
北海茫月  2021-01-11 18:56
                                                                       
You can use diff_inv from pmdarima (Docs link).

import numpy as np
import pandas as pd

# generate a random table
np.random.seed(10)
vals = np.random.randint(1, 10, 6)
df_t = pd.DataFrame({"a": vals})

# create two columns with diff 1 and diff 2
df_t['dif_1'] = df_t.a.diff(1)
df_t['dif_2'] = df_t.a.diff(2)

df_t

   a  dif_1  dif_2
0  5    NaN    NaN
1  1   -4.0    NaN
2  2    1.0   -3.0
3  1   -1.0    0.0
4  2    1.0    0.0
5  9    7.0    8.0

Then create a function that returns an array with the diff undone.

from pmdarima.utils import diff_inv

def inv_diff(df_orig_column, df_diff_column, periods):
    # Build the input for diff_inv: the first n values (n = periods) of the
    # original data, followed by the diff values from position n onward
    value = np.array(df_orig_column[:periods].tolist() + df_diff_column[periods:].tolist())

    # Invert the diff; the first `periods` entries of the result are
    # diff_inv's initial values, so drop them
    inv_diff_vals = diff_inv(value, periods, 1)[periods:]
    return inv_diff_vals

Example of use:
# df_orig_column - column with the original values
# df_diff_column - column with the differenced values
# periods - the periods that were passed to diff()
inv_diff(df_t.a, df_t.dif_2, 2)

Output:
array([5., 1., 2., 1., 2., 9.])
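
If installing pmdarima is not an option, the same inversion can be sketched with NumPy only. This is an illustration under stated assumptions, not part of the answer above: `undiff_periods` is a hypothetical helper, and it assumes you still have the first `periods` original values. It works because diff(periods=k) links each element to the one k positions earlier, so every k-th element forms its own chain that a cumulative sum can rebuild.

```python
import numpy as np
import pandas as pd


def undiff_periods(diffed, initial, periods):
    """Hypothetical helper: invert Series.diff(periods) given the first `periods` original values."""
    values = np.asarray(diffed, dtype=float).copy()
    # put the known starting values back in place of the leading NaNs
    values[:periods] = np.asarray(initial, dtype=float)
    # each residue class modulo `periods` is an independent chain of differences
    for start in range(periods):
        values[start::periods] = np.cumsum(values[start::periods])
    return values


a = pd.Series([5, 1, 2, 1, 2, 9])
restored = undiff_periods(a.diff(2), a.iloc[:2], periods=2)
print(restored)
```

With the sample column used above, `restored` matches the original values 5, 1, 2, 1, 2, 9.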

                                                                        
                                                        
            
            
              
                