Pandas fill in missing dates in DataFrame with multiple columns

瘦欲@ 提交于 2021-02-19 07:41:48

问题


I want to add missing dates for a specific date range, but keep all columns. I found many posts using afreq(), resample(), reindex(), but they seemed to be for Series and I couldn't get them to work for my DataFrame.

Given a sample dataframe:

data = [{'id' : '123', 'product' : 'apple', 'color' : 'red', 'qty' : 10, 'week' : '2019-3-7'}, {'id' : '123', 'product' : 'apple', 'color' : 'blue', 'qty' : 20, 'week' : '2019-3-21'}, {'id' : '123', 'product' : 'orange', 'color' : 'orange', 'qty' : 8, 'week' : '2019-3-21'}]

df = pd.DataFrame(data)


    color   id product  qty       week
0     red  123   apple   10   2019-3-7
1    blue  123   apple   20  2019-3-21
2  orange  123  orange    8  2019-3-21

My goal is to return below; filling in qty as 0, but fill other columns. Of course, I have many other ids. I would like to be able to specify the start/end dates to fill; this example uses 3/7 to 3/21.

    color   id product  qty       week
0     red  123   apple   10   2019-3-7
1    blue  123   apple   20  2019-3-21
2  orange  123  orange    8  2019-3-21
3     red  123   apple    0  2019-3-14
4     red  123   apple    0  2019-3-21 
5    blue  123   apple    0   2019-3-7
6    blue  123   apple    0  2019-3-14
7  orange  123  orange    0   2019-3-7
8  orange  123  orange    0  2019-3-14

How can I keep the remainder of my DataFrame intact?


回答1:


In you case , you just need do with unstack and stack + reindex

df.week=pd.to_datetime(df.week)
s=pd.date_range(df.week.min(),df.week.max(),freq='7 D')

df=df.set_index(['color','id','product','week']).\
      qty.unstack().reindex(columns=s,fill_value=0).stack().reset_index()
df

    color   id product    level_3     0
0    blue  123   apple 2019-03-14   0.0
1    blue  123   apple 2019-03-21  20.0
2  orange  123  orange 2019-03-14   0.0
3  orange  123  orange 2019-03-21   8.0
4     red  123   apple 2019-03-07  10.0
5     red  123   apple 2019-03-14   0.0


来源:https://stackoverflow.com/questions/55035939/pandas-fill-in-missing-dates-in-dataframe-with-multiple-columns

标签
易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!