Calculating difference between two rows in Python / Pandas

不知归路 2020-11-30 01:19

In Python, how can I reference the previous row and calculate something against it? Specifically, I am working with dataframes in pandas - I have a da

3 Answers
  • 2020-11-30 01:32

    I think you want to do something like this:

    In [26]: data
    Out[26]: 
               Date   Close  Adj Close
    251  2011-01-03  147.48     143.25
    250  2011-01-04  147.64     143.41
    249  2011-01-05  147.05     142.83
    248  2011-01-06  148.66     144.40
    247  2011-01-07  147.93     143.69
    
    In [27]: data.set_index('Date').diff()
    Out[27]: 
                Close  Adj Close
    Date                        
    2011-01-03    NaN        NaN
    2011-01-04   0.16       0.16
    2011-01-05  -0.59      -0.58
    2011-01-06   1.61       1.57
    2011-01-07  -0.73      -0.71
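
For anyone who wants to reproduce the session above, here is a minimal, self-contained sketch. It assumes the five rows shown are the whole frame and rebuilds them by hand; the variable name `diffs` and the final `round(2)` (used only to hide floating-point noise) are additions for illustration.

```python
import pandas as pd

# Rebuild the frame shown above by hand (values copied from the printed output).
data = pd.DataFrame(
    {
        "Date": ["2011-01-03", "2011-01-04", "2011-01-05", "2011-01-06", "2011-01-07"],
        "Close": [147.48, 147.64, 147.05, 148.66, 147.93],
        "Adj Close": [143.25, 143.41, 142.83, 144.40, 143.69],
    },
    index=[251, 250, 249, 248, 247],
)

# diff() subtracts the previous row from each row; the first row has no
# predecessor, so it becomes NaN. Moving Date into the index keeps it out
# of the subtraction.
diffs = data.set_index("Date").diff().round(2)
print(diffs)
```

Note that `diff()` works on row order, not on date order, so the frame should be sorted by date before calling it.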
    
  • 2020-11-30 01:32

    To calculate the difference within a single column, you can do the following.

    df=
          A      B
    0     10     56
    1     45     48
    2     26     48
    3     32     65
    

    We want to compute the row-to-row difference in A only, and then keep only the rows where that difference is less than 15.

    df['A_dif'] = df['A'].diff()
    df=
              A      B      A_dif
        0     10     56      NaN
        1     45     48      35
        2     26     48      -19
        3     32     65      6
    # NaN < 15 is False, so row 0 is dropped; -19 < 15 is True, so row 2 is kept
    df = df[df['A_dif']<15]
    
    df=
              A      B      A_dif
        2     26     48      -19
        3     32     65      6
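
The steps above can be run end to end with the sketch below. It uses the same four-row frame; the names `below` and `small_change` are hypothetical, and the `.abs()` variant is an assumption about what may be wanted when only the size of the change matters, not something the answer states.

```python
import pandas as pd

df = pd.DataFrame({"A": [10, 45, 26, 32], "B": [56, 48, 48, 65]})

# Row-to-row difference in column A: NaN, 35.0, -19.0, 6.0
df["A_dif"] = df["A"].diff()

# Any comparison against NaN is False, so the first row never passes the mask.
below = df[df["A_dif"] < 15]

# Hypothetical variant: filter on the magnitude of the change instead of its sign.
small_change = df[df["A_dif"].abs() < 15]

print(below)
print(small_change)
```

With the signed comparison, a large drop such as -19 still counts as "less than 15"; the absolute-value variant excludes it.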
    
  • 2020-11-30 01:52

    I don't know pandas, and I'm pretty sure it has something specific for this. However, here is a pure-Python solution that might be of some help even if you need to use pandas:

    import csv
    import urllib.request
    
    # This retrieves the CSV file and loads it into a list, converting
    # all numeric values to floats
    # (this Yahoo endpoint may no longer be available)
    url='http://ichart.finance.yahoo.com/table.csv?s=IBM&a=00&b=1&c=2011&d=11&e=31&f=2011&g=d&ignore=.csv'
    lines = urllib.request.urlopen(url).read().decode('utf-8').splitlines()
    reader = csv.reader(lines, delimiter=',')
    # We sort the output list so the records are ordered by date
    cleaned = sorted([r[0]] + [float(x) for x in r[1:]] for r in list(reader)[1:])
    
    # Pair each row with the one before it. The first row has no predecessor,
    # so it is skipped. (Indexing cleaned[i-1] at i == 0 would not raise an
    # IndexError; it would silently wrap around to the last row.)
    for prev, row in zip(cleaned, cleaned[1:]):
        # This calculates the difference of each numeric field with the same
        # field in the row before this one
        print(row[0], [row[j] - prev[j] for j in range(1, 7)])
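
Since the download step depends on a remote service, here is a small offline sketch of the same pairing idea. It assumes a CSV with a header row and a date in the first column; the sample text reuses the Close and Adj Close figures from the first answer, and `row_differences` is a hypothetical helper name.

```python
import csv
import io

# Sample rows (Date, Close, Adj Close) copied from the first answer's table,
# deliberately listed newest-first to show that sorting matters.
SAMPLE_CSV = """Date,Close,Adj Close
2011-01-07,147.93,143.69
2011-01-06,148.66,144.40
2011-01-05,147.05,142.83
2011-01-04,147.64,143.41
2011-01-03,147.48,143.25
"""

def row_differences(text):
    """Return (date, [field differences]) for every row except the first."""
    reader = csv.reader(io.StringIO(text))
    next(reader)  # skip the header row
    # Convert numeric fields to floats and order the records by date.
    rows = sorted([r[0]] + [float(x) for x in r[1:]] for r in reader)
    # zip(rows, rows[1:]) pairs each row with its predecessor.
    return [
        (cur[0], [round(c - p, 2) for c, p in zip(cur[1:], prev[1:])])
        for prev, cur in zip(rows, rows[1:])
    ]

diffs = row_differences(SAMPLE_CSV)
for date, delta in diffs:
    print(date, delta)
```

Pairing with `zip` avoids any special case for the first row, because that row simply never appears as the "current" element.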
    