Pandas: adding missing dates to a DataFrame while keeping column/index values?

谎友^ 2021-01-22 16:23

I have a pandas DataFrame that holds dates, customers, products, and the dollar amount of each purchase.

   date     customer   product   amt  
 1/1/2017   tim         


        
4 answers
  •  一个人的身影
    2021-01-22 16:57

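    If I understand correctly, you can build the full date x customer x product index and reindex against it, filling the missing combinations with 0: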

    In [63]: dates = pd.date_range(df['date'].min(), df['date'].max())
    
    In [64]: idx = pd.MultiIndex.from_product((dates,
                                               df['customer'].unique(), 
                                               df['product'].unique()))
    
    In [72]: (df.set_index(['date','customer','product'])
                .reindex(idx, fill_value=0)
                .reset_index()
                .set_axis(df.columns, axis=1))
    Out[72]:
             date customer product  amt
    0  2017-01-01      tim   apple    3
    1  2017-01-01      tim   melon    0
    2  2017-01-01      tim  orange    0
    3  2017-01-01      jim   apple    0
    4  2017-01-01      jim   melon    2
    5  2017-01-01      jim  orange    0
    6  2017-01-01      tom   apple    5
    7  2017-01-01      tom   melon    4
    8  2017-01-01      tom  orange    0
    9  2017-01-02      tim   apple    0
    ..        ...      ...     ...  ...
    26 2017-01-03      tom  orange    0
    27 2017-01-04      tim   apple    0
    28 2017-01-04      tim   melon    3
    29 2017-01-04      tim  orange    0
    30 2017-01-04      jim   apple    2
    31 2017-01-04      jim   melon    0
    32 2017-01-04      jim  orange    0
    33 2017-01-04      tom   apple    0
    34 2017-01-04      tom   melon    1
    35 2017-01-04      tom  orange    4
    
    [36 rows x 4 columns]
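
For anyone who wants to run this outside an IPython session, here is a minimal self-contained sketch of the same reindex approach. It rests on a few assumptions that are not in the original post: the sample rows are only the non-zero rows visible in the output above (the elided rows are not reproduced), the `date` column is assumed to hold month/day/year strings that need converting to datetimes, and the names `keys` and `filled` are illustrative. Naming the index levels up front is a small variation that makes the final `set_axis` step unnecessary.

```python
import pandas as pd

# Sample rows copied from the non-zero rows visible in the output above;
# the elided rows are unknown and are not reproduced here.
df = pd.DataFrame({
    "date": ["1/1/2017", "1/1/2017", "1/1/2017", "1/1/2017",
             "1/4/2017", "1/4/2017", "1/4/2017", "1/4/2017"],
    "customer": ["tim", "jim", "tom", "tom", "tim", "jim", "tom", "tom"],
    "product": ["apple", "melon", "apple", "melon",
                "melon", "apple", "melon", "orange"],
    "amt": [3, 2, 5, 4, 3, 2, 1, 4],
})

# Assumption: dates arrive as month/day/year strings. Reindexing against a
# DatetimeIndex only matches if the column holds real datetimes.
df["date"] = pd.to_datetime(df["date"], format="%m/%d/%Y")

keys = ["date", "customer", "product"]

# Every calendar day between the first and last purchase date.
dates = pd.date_range(df["date"].min(), df["date"].max(), freq="D")

# Cartesian product of all days, customers, and products. Naming the levels
# lets reset_index() restore the original column names directly.
idx = pd.MultiIndex.from_product(
    [dates, df["customer"].unique(), df["product"].unique()],
    names=keys,
)

filled = (
    df.set_index(keys)             # key combinations must be unique here
      .reindex(idx, fill_value=0)  # missing combinations get amt = 0
      .reset_index()
)

print(filled.head(9))
```

Note that `reindex` raises an error if the same date/customer/product combination appears more than once, so data with repeated purchases would need to be aggregated first (for example by summing `amt` per combination).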
    
