Question
I have a data frame like the one shown below:
import numpy as np
import pandas as pd
df1 = pd.DataFrame({'person_id': [11, 21, 31, 41, 51],
                    'date_1': ['12/30/1961', '05/29/1967', '02/03/1957', '7/27/1959', '01/13/1971'],
                    'date_2': ['07/23/2017', '05/29/2017', '02/03/2015', np.nan, np.nan]})
# Reshape to long format: one row per (person_id, date column)
df1 = df1.melt('person_id', value_name='dates')
I would like to get the number of days to the previous and next year.
I am able to get the previous and next year using the code below:
df1['cur_year'] = pd.DatetimeIndex(df1['dates']).year
df1['prev_year'] = (df1['cur_year'] - 1)
df1['next_year'] = (df1['cur_year'] + 1)
As you can see, the year values change from row to row, so I don't have a fixed baseline date. How can I calculate the difference in days to 31/12 of the previous year and to 01/01 of the next year?
Please note that the end date is not included when counting the days.
I have shown a sample output for 2 subjects below.
[Screenshot of the sample output; image not available]
Answer 1:
From what I understand, you can try:
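df1['dates'] = pd.to_datetime(df1['dates'])
out = df1.assign(
    # day of year = days since 31 Dec of the previous year
    prev_yr_days=df1['dates'].dt.dayofyear,
    # days to 31 Dec of the same year, plus 1 to reach 1 Jan of the next year
    next_yr_days=((df1['dates'] + pd.offsets.YearEnd(0)) - df1['dates']).dt.days.add(1))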
   person_id variable      dates  prev_yr_days  next_yr_days
0         11   date_1 1961-12-30         364.0           2.0
5         11   date_2 2017-07-23         204.0         162.0
1         21   date_1 1967-05-29         149.0         217.0
6         21   date_2 2017-05-29         149.0         217.0
2         31   date_1 1957-02-03          34.0         332.0
7         31   date_2 2015-02-03          34.0         332.0
3         41   date_1 1959-07-27         208.0         158.0
8         41   date_2        NaT           NaN           NaN
4         51   date_1 1971-01-13          13.0         353.0
9         51   date_2        NaT           NaN           NaN
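To see how the two expressions above behave at year boundaries, here is a minimal sketch on a few edge-case dates. The dates and the names `edge` and `check` are made up for illustration and are not part of the question's data.

```python
import pandas as pd

# Hypothetical edge cases: last day of a leap year, first day of a year,
# a leap day, and a missing value.
edge = pd.Series(pd.to_datetime(["2016-12-31", "2017-01-01", "2016-02-29", None]))

# Days elapsed since 31 Dec of the previous year.
prev_yr_days = edge.dt.dayofyear

# YearEnd(0) rolls forward to 31 Dec of the same year (or stays put if the
# date is already 31 Dec); adding 1 moves the target to 1 Jan of the next year.
next_yr_days = ((edge + pd.offsets.YearEnd(0)) - edge).dt.days + 1

check = pd.DataFrame({"dates": edge,
                      "prev_yr_days": prev_yr_days,
                      "next_yr_days": next_yr_days})
print(check)
```

Missing dates stay missing in both columns, which matches the NaT rows in the output above.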
Answer 2:
If I understand correctly, we can build the next year's start date and the previous year's end date for each row, then subtract to get the differences.
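df1["dates"] = pd.to_datetime(df1["dates"])  # the .dt accessor needs datetime values

# Rows with a missing date yield an unparseable string; errors="coerce" turns it into NaT.
df1["next_year"] = (
    pd.to_datetime(
        "01-01-" + (df1["dates"].dt.year + 1).fillna(0).astype(int).astype(str),
        format="%d-%m-%Y",
        errors="coerce",
    )
    - df1["dates"]
)
df1["prev_year"] = (
    df1["dates"]
    - pd.to_datetime(
        "31-12-" + (df1["dates"].dt.year - 1).fillna(0).astype(int).astype(str),
        format="%d-%m-%Y",
        errors="coerce",
    )
)
print(df1)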
   person_id variable      dates next_year prev_year
0         11   date_1 1961-12-30    2 days  364 days
1         21   date_1 1967-05-29  217 days  149 days
2         31   date_1 1957-02-03  332 days   34 days
3         41   date_1 1959-07-27  158 days  208 days
4         51   date_1 1971-01-13  353 days   13 days
5         11   date_2 2017-07-23  162 days  204 days
6         21   date_2 2017-05-29  217 days  149 days
7         31   date_2 2015-02-03  332 days   34 days
8         41   date_2        NaT       NaT       NaT
9         51   date_2        NaT       NaT       NaT
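The same boundary dates can probably be obtained without building strings, by using pandas date offsets instead. This is a sketch of that alternative, not the approach from the answer above; the variable names and the three sample values are chosen for illustration.

```python
import pandas as pd

# Two dates from the question plus one missing value.
dates = pd.Series(pd.to_datetime(["1961-12-30", "2017-07-23", None]))

# YearBegin(1) moves every date forward to 1 Jan of the next year.
next_year_start = dates + pd.offsets.YearBegin(1)
# Subtracting YearEnd(1) moves every date back to 31 Dec of the previous year.
prev_year_end = dates - pd.offsets.YearEnd(1)

days_to_next = (next_year_start - dates).dt.days
days_from_prev = (dates - prev_year_end).dt.days

print(pd.DataFrame({"dates": dates,
                    "next_year": days_to_next,
                    "prev_year": days_from_prev}))
```

Because the offsets propagate NaT, no `fillna` or error handling is needed for rows with a missing date.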
Answer 3:
Here's one way to do it:
dates = pd.to_datetime(df1['dates'])
df1['prev_yr_days'] = dates.dt.dayofyear
# year length (365, or 366 in a leap year) minus the day of year, plus 1 to reach 1 Jan
df1['next_yr_days'] = dates.dt.is_leap_year.astype(int).sub(df1['prev_yr_days']).add(366)
Result:
   person_id variable      dates  prev_yr_days  next_yr_days
0         11   date_1 12/30/1961         364.0           2.0
5         11   date_2 07/23/2017         204.0         162.0
1         21   date_1 05/29/1967         149.0         217.0
6         21   date_2 05/29/2017         149.0         217.0
2         31   date_1 02/03/1957          34.0         332.0
7         31   date_2 02/03/2015          34.0         332.0
3         41   date_1  7/27/1959         208.0         158.0
8         41   date_2        NaN           NaN           NaN
4         51   date_1 01/13/1971          13.0         353.0
9         51   date_2        NaN           NaN           NaN
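The arithmetic behind this formula can be cross-checked with the standard library alone: the two counts always add up to the length of the year plus one, which is why 366 plus the leap-year flag minus the day of year gives the days to the next 1 Jan. The helper name `days_to_year_bounds` below is hypothetical and only serves as a sanity check.

```python
import calendar
from datetime import date

def days_to_year_bounds(d):
    """Return (days since 31 Dec of the previous year, days until 1 Jan of the next year)."""
    prev_days = (d - date(d.year - 1, 12, 31)).days
    next_days = (date(d.year + 1, 1, 1) - d).days
    # Invariant used by the formula above: both counts sum to year length + 1.
    year_len = 366 if calendar.isleap(d.year) else 365
    assert prev_days + next_days == year_len + 1
    return prev_days, next_days

print(days_to_year_bounds(date(1961, 12, 30)))  # (364, 2)
print(days_to_year_bounds(date(2016, 12, 31)))  # (366, 1)
```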
Source: https://stackoverflow.com/questions/62432928/no-of-days-to-previous-and-next-year-pandas