Insert missing weekdays in pandas dataframe and fill them with NaN

Posted by 纵然是瞬间 on 2019-12-10 10:49:51

Question


I am trying to insert missing weekdays in a time series dataframe such as:

import pandas as pd
from pandas.tseries.offsets import BDay
# Build the frame with 'date' as a regular column first, so it can be converted below.
df = pd.DataFrame([['2016-09-30', 10, 2020], ['2016-10-03', 20, 2424], ['2016-10-05', 5, 232]], columns=['date', 'price', 'vol'])
# Parse the date strings, then use them as a DatetimeIndex.
df['date'] = pd.to_datetime(df['date'])
df = df.set_index('date')

The data looks like this:

Out[300]: 
            price   vol
date                   
2016-09-30     10  2020
2016-10-03     20  2424
2016-10-05      5   232

I can easily create a series of weekdays with pd.date_range():

pd.date_range('2016-09-30', '2016-10-05', freq=BDay())
Out[301]: DatetimeIndex(['2016-09-30', '2016-10-03', '2016-10-04', '2016-10-05'], dtype='datetime64[ns]', freq='B')

Based on that DatetimeIndex, I would like to add the missing dates to my df and fill their column values with NaN, so that I get:

Out[300]: 
            price   vol
date                   
2016-09-30     10  2020
2016-10-03     20  2424
2016-10-04    NaN   NaN
2016-10-05      5   232

Is there an easy way to do this? Thanks!


Answer 1:


You can use reindex:

df.index = pd.to_datetime(df.index)

df.reindex(pd.date_range('2016-09-30', '2016-10-05', freq=BDay()))
Out: 
            price     vol
2016-09-30   10.0  2020.0
2016-10-03   20.0  2424.0
2016-10-04    NaN     NaN
2016-10-05    5.0   232.0
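
As a minimal, self-contained sketch of the same reindex idea, the start and end dates can be taken from the frame itself instead of being typed in. The variable names `weekdays` and `filled` are only illustrative, and `freq="B"` is the string alias for `BDay()`.

```python
import pandas as pd

# Same sample data as in the question.
df = pd.DataFrame(
    [["2016-09-30", 10, 2020], ["2016-10-03", 20, 2424], ["2016-10-05", 5, 232]],
    columns=["date", "price", "vol"],
)
df["date"] = pd.to_datetime(df["date"])
df = df.set_index("date")

# Business-day range spanning the first to the last date already in the index.
weekdays = pd.date_range(df.index.min(), df.index.max(), freq="B", name="date")

# reindex keeps the rows that exist and creates NaN rows for the missing dates.
filled = df.reindex(weekdays)
print(filled)
```

Because NaN is a float, the integer columns are upcast to float64 in the result, which is why the output shows 10.0 rather than 10.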



Answer 2:


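Alternatively, you can use pandas.DataFrame.resample() with 'B' (business day). There is no need to specify a start or end date, as long as the dataframe has a datetime index. Note that since pandas 0.22, sum() returns 0 rather than NaN for bins with no data, so asfreq() is used here to leave the missing days as NaN:

df = df.resample('B').asfreq()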

#             price     vol
# date                     
# 2016-09-30   10.0  2020.0
# 2016-10-03   20.0  2424.0
# 2016-10-04    NaN     NaN
# 2016-10-05    5.0   232.0
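
The following sketch contrasts two ways of getting NaN rows out of resample on the question's data, assuming a reasonably recent pandas version. The names `as_freq` and `summed` are only illustrative.

```python
import pandas as pd

# Same sample data as in the question, built directly with a DatetimeIndex.
df = pd.DataFrame(
    {"price": [10, 20, 5], "vol": [2020, 2424, 232]},
    index=pd.to_datetime(["2016-09-30", "2016-10-03", "2016-10-05"]),
)
df.index.name = "date"

# asfreq() only conforms the index to business days without aggregating,
# so the missing 2016-10-04 row is filled with NaN.
as_freq = df.resample("B").asfreq()

# sum() aggregates each business-day bin; min_count=1 makes a bin with
# no valid values come out as NaN instead of 0.
summed = df.resample("B").sum(min_count=1)

print(as_freq)
print(summed)
```

asfreq() is probably the closer match when each date has at most one row, while sum(min_count=1) is worth considering if several rows can fall on the same day and should be added up.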


Source: https://stackoverflow.com/questions/39877184/insert-missing-weekdays-in-pandas-dataframe-and-fill-them-with-nan
