Adding a column that's the result of the difference between consecutive rows in pandas

旧巷少年郎 2020-11-28 23:07

Let's say I have a dataframe like this:

    A   B
0   a   b
1   c   d
2   e   f 
3   g   h

0, 1, 2, 3 are times; a, c, e, g is one time series a

4 answers
  • 2020-11-28 23:33

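    When the data comes from a CSV file, diff(1) gives each row minus the previous row:

    import pandas as pd

    df = pd.read_csv('sale_data.csv')  # read_csv already returns a DataFrame
    df['New_column'] = df['target_column'].diff(1)  # first row has no predecessor, so it is NaN
    print(df)  # optional, only to inspect the result in the console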
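
To try this without a file on disk, here is a minimal sketch that reads the same kind of data from an in-memory string. The column name target_column follows the answer above; the sample values and the csv_text variable are made up for illustration.

```python
import io

import pandas as pd

# Hypothetical sample data standing in for the contents of sale_data.csv
csv_text = "target_column\n10\n13\n19\n18\n"

df = pd.read_csv(io.StringIO(csv_text))

# diff(1) subtracts the previous row from the current row,
# so the first row has nothing to subtract and becomes NaN.
df["New_column"] = df["target_column"].diff(1)
print(df)
```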
    
  • 2020-11-28 23:49

    Use shift.

    df['dA'] = df['A'] - df['A'].shift(-1)
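
The sign and direction of the result depend on the argument passed to shift. A small sketch on made-up numbers (the values and the column names dA_next and dA_prev are illustrative only) compares the two directions:

```python
import pandas as pd

df = pd.DataFrame({"A": [9, 4, 2, 1]})  # illustrative values

# shift(-1) pulls the next row up: current row minus next row; last row is NaN
df["dA_next"] = df["A"] - df["A"].shift(-1)

# shift(1) pushes the previous row down: current row minus previous row; first row is NaN
df["dA_prev"] = df["A"] - df["A"].shift(1)

print(df)
```

Here dA_next holds the same values as df["A"].diff(-1), and dA_prev the same values as df["A"].diff().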
    
  • 2020-11-28 23:49

    Differences between consecutive rows can also be calculated with an explicit loop:

    import pandas as pd

    df = pd.read_csv('sales_data.csv')
    col = df['Target_column']
    i = 0
    while i < len(col) - 1:
        delta = col.iloc[i + 1] - col.iloc[i]  # the difference between two consecutive values in the column
        print(delta)
        i += 1  # move to the next value in the column
    
  • 2020-11-28 23:50

    You could use diff and pass -1 as the periods argument:

    >>> df = pd.DataFrame({"A": [9, 4, 2, 1], "B": [12, 7, 5, 4]})
    >>> df["dA"] = df["A"].diff(-1)
    >>> df
       A   B  dA
    0  9  12   5
    1  4   7   2
    2  2   5   1
    3  1   4 NaN
    
    [4 rows x 3 columns]
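
If several columns need the same treatment, one possible approach is to call diff on the whole selection and join the result back. This is only a sketch reusing the sample data above; the names deltas and result and the "d" prefix are choices made for illustration.

```python
import pandas as pd

df = pd.DataFrame({"A": [9, 4, 2, 1], "B": [12, 7, 5, 4]})

# DataFrame.diff applies the same row-wise difference to every selected column;
# add_prefix renames the resulting columns to dA and dB.
deltas = df[["A", "B"]].diff(-1).add_prefix("d")

# join aligns on the index and appends dA and dB next to the original columns.
result = df.join(deltas)
print(result)
```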
    