问题
How can I fill in the missing data in this dateframe.
Missing values for days when no sales are made. How can I fill in the missing values for days where 0 of an item were sold at a particular store and date?
Input
Dates Store Item Sales
2017-01-01 Chicago Apple 10
2017-01-02 NewYork Pear 10
2017-01-03 Chicago Apple 10
Output
Dates Store Item Sales
2017-01-01 Chicago Apple 10
2017-01-01 Chicago Pear 0
2017-01-02 Chicago Apple 0
2017-01-02 Chicago Pear 0
2017-01-03 Chicago Apple 10
2017-01-03 Chicago Pear 0
2017-01-01 NewYork Apple 0
2017-01-01 NewYork Pear 0
2017-01-02 NewYork Apple 0
2017-01-02 NewYork Pear 10
2017-01-03 NewYork Apple 0
2017-01-03 NewYork Pear 0
回答1:
Use:
- first set_index for
Multiindex
- create new
Multiindex
from_product - reindex and add
0
for missing values - last sort level
Store
by sort_index and reset_index
df = df.set_index(['Dates','Store','Item'])
mux = pd.MultiIndex.from_product(df.index.levels, names=df.index.names)
df = df.reindex(mux, fill_value=0).sort_index(level='Store').reset_index()
print (df)
Dates Store Item Sales
0 2017-01-01 Chicago Apple 10
1 2017-01-01 Chicago Pear 0
2 2017-01-02 Chicago Apple 0
3 2017-01-02 Chicago Pear 0
4 2017-01-03 Chicago Apple 10
5 2017-01-03 Chicago Pear 0
6 2017-01-01 NewYork Apple 0
7 2017-01-01 NewYork Pear 0
8 2017-01-02 NewYork Apple 0
9 2017-01-02 NewYork Pear 10
10 2017-01-03 NewYork Apple 0
11 2017-01-03 NewYork Pear 0
回答2:
It will be a little bit hard to understand, by using set_index
,stack
and unstack
df.set_index(['Dates','Store','Item']).unstack().stack(dropna=False).\
unstack(1).stack(dropna=False).fillna(0).reset_index()
Out[258]:
Dates Item Store Sales
0 2017-01-01 Apple Chicago 10.0
1 2017-01-01 Apple NewYork 0.0
2 2017-01-01 Pear Chicago 0.0
3 2017-01-01 Pear NewYork 0.0
4 2017-01-02 Apple Chicago 0.0
5 2017-01-02 Apple NewYork 0.0
6 2017-01-02 Pear Chicago 0.0
7 2017-01-02 Pear NewYork 10.0
8 2017-01-03 Apple Chicago 10.0
9 2017-01-03 Apple NewYork 0.0
10 2017-01-03 Pear Chicago 0.0
11 2017-01-03 Pear NewYork 0.0
来源:https://stackoverflow.com/questions/47368536/fill-in-missing-data-pandas