calculate datetime-difference in years, months, etc. in a new pandas dataframe column

长发绾君心 2021-02-07 05:34

I have a pandas dataframe looking like this:

Name    start        end
A       2000-01-10   1970-04-29

I want to add a new column providing the

7 answers
  •  失恋的感觉
    2021-02-07 06:14

    You can try the following function to calculate the difference:

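    def yearmonthdiff(row):
        # Difference between 'start' and 'end' as a '<years>y<months>m' string.
        s = row['start']
        e = row['end']
        y = s.year - e.year
        m = s.month - e.month
        d = s.day - e.day
        # Same month number: drop a month if start falls earlier in the month than end.
        if m == 0:
            if d < 0:
                m = m - 1
            elif d == 0:
                # Same day of month: compare the time of day in seconds.
                s1 = s.hour*3600 + s.minute*60 + s.second
                s2 = e.hour*3600 + e.minute*60 + e.second
                if s1 < s2:
                    m = m - 1
        # Borrow a year whenever the month count ends up negative.
        if m < 0:
            y = y - 1
            m = m + 12
        return '{}y{}m'.format(y,m)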
    

    Here row is a single row of the dataframe. I am assuming your start and end columns hold datetime objects. You can then use DataFrame.apply() with axis=1 to apply the function to each row.
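
If the start and end columns are still stored as strings, they need converting before attributes such as .year and .month are available. A minimal sketch, assuming ISO-formatted date strings like the ones in the question (the sample frame below is rebuilt from the question's single row for illustration):

```python
import pandas as pd

# Sample row from the question, with the dates still stored as strings.
df = pd.DataFrame({'Name': ['A'], 'start': ['2000-01-10'], 'end': ['1970-04-29']})

# Convert both columns to datetime64 so that each row yields Timestamp
# objects exposing .year, .month, .day and so on.
for col in ['start', 'end']:
    df[col] = pd.to_datetime(df[col])

print(df.dtypes)
```

After the conversion, df.loc[0, 'start'] is a pandas Timestamp rather than a str.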

    df
    
    Out[92]:
                           start                        end
    0 2000-01-10 00:00:00.000000 1970-04-29 00:00:00.000000
    1 2015-07-18 17:54:59.070381 2014-01-11 17:55:10.053381
    
    df['diff'] = df.apply(yearmonthdiff, axis=1)
    
    In [97]: df
    Out[97]:
                           start                        end   diff
    0 2000-01-10 00:00:00.000000 1970-04-29 00:00:00.000000  29y9m
    1 2015-07-18 17:54:59.070381 2014-01-11 17:55:10.053381   1y6m
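
Note that yearmonthdiff compares the day of month only when both dates share the same month number, so row 0 reads 29y9m even though 29 years, 8 months and 12 days separate the two dates. If that matters, one possible alternative is relativedelta from python-dateutil, which borrows across days, months and years. This is a sketch under the assumption that python-dateutil is installed (it is normally pulled in as a pandas dependency); calendar_diff is a hypothetical helper name, not part of the answer above.

```python
import pandas as pd
from dateutil.relativedelta import relativedelta


def calendar_diff(row):
    """Return start minus end as a '<years>y<months>m' string."""
    # relativedelta normalises the difference into whole years, months,
    # days, etc., borrowing from the larger unit where needed.
    delta = relativedelta(row['start'], row['end'])
    return '{}y{}m'.format(delta.years, delta.months)


# The two rows shown in the answer above.
df = pd.DataFrame({
    'start': [pd.Timestamp('2000-01-10'),
              pd.Timestamp('2015-07-18 17:54:59.070381')],
    'end': [pd.Timestamp('1970-04-29'),
            pd.Timestamp('2014-01-11 17:55:10.053381')],
})

df['diff'] = df.apply(calendar_diff, axis=1)
print(df['diff'].tolist())  # ['29y8m', '1y6m']
```

The remaining days are available as delta.days if the output format should include them.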
    
