I want to make a pivot table from the following dataframe with columns sales, rep. The pivot table shows sales but no rep.
It seems that the problem comes from the different types of the rep and sales columns (the default aggfunc is mean, which is not defined for strings). If you convert sales to str type and specify aggfunc as sum, it works fine:
df.sales = df.sales.astype(str)
pd.pivot_table(df, index=['country'], columns=['year'], values=['rep', 'sales'], aggfunc='sum')
#          rep                       sales
# year    2013  2014    2015  2016  2013  2014  2015  2016
# country
# fr      None  kyle  claire  None  None    10    20  None
# uk      kyle  None    None  john    12  None  None    10
# usa     None  None    None  john  None  None  None    21
You could use set_index and unstack:
df = pd.DataFrame(data)
df.set_index(['year','country']).unstack('year')
yields
         rep                       sales
year    2013  2014    2015  2016  2013  2014  2015  2016
country
fr      None  kyle  claire  None   NaN  10.0  20.0   NaN
uk      kyle  None    None  john  12.0   NaN   NaN  10.0
usa     None  None    None  john   NaN   NaN   NaN  21.0
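One caveat with set_index/unstack: it only works when every (year, country) pair is unique. The sketch below uses a small hypothetical dataframe (dup, not the data from the question) in which two rows share the same pair, to show that pandas then refuses to reshape.

```python
import pandas as pd

# Hypothetical data (not from the original question): two rows share
# the same (year, country) pair, namely (2014, "fr").
dup = pd.DataFrame({
    "country": ["fr", "fr", "uk"],
    "year": [2014, 2014, 2013],
    "rep": ["kyle", "anna", "kyle"],
    "sales": [10, 99, 12],
})

try:
    dup.set_index(["year", "country"]).unstack("year")
    unstack_failed = False
except ValueError as exc:
    # pandas cannot unstack when index entries are duplicated
    unstack_failed = True
    print("unstack failed:", exc)
```

If your real data can contain such duplicates, an aggregating approach such as pivot_table is the safer choice.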
Or, using pivot_table with aggfunc='first':
df.pivot_table(index='country', columns='year', values=['rep','sales'], aggfunc='first')
yields
         rep                       sales
year    2013  2014    2015  2016  2013  2014  2015  2016
country
fr      None  kyle  claire  None  None    10    20  None
uk      kyle  None    None  john    12  None  None    10
usa     None  None    None  john  None  None  None    21
With aggfunc='first', each (country, year, rep) or (country, year, sales) group is aggregated by taking the first value found. In your case there appear to be no duplicates, so the first value is the same as the only value.
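To make the "first value found" behaviour concrete, here is a minimal sketch. The dataframe is an assumption: it is reconstructed from the output shown above, with one extra hypothetical row (fr, 2014, anna, 99) added to create a duplicate.

```python
import pandas as pd

# Assumed data, reconstructed from the output above. The second
# (fr, 2014) row is hypothetical and exists only to create a duplicate.
df = pd.DataFrame({
    "country": ["fr", "fr", "fr", "uk", "uk", "usa"],
    "year": [2014, 2014, 2015, 2013, 2016, 2016],
    "rep": ["kyle", "anna", "claire", "kyle", "john", "john"],
    "sales": [10, 99, 20, 12, 10, 21],
})

result = df.pivot_table(index="country", columns="year",
                        values=["rep", "sales"], aggfunc="first")

# For the duplicated (fr, 2014) group, the row that appears first wins.
print(result.loc["fr", ("rep", 2014)])
print(result.loc["fr", ("sales", 2014)])
```

Here the (fr, 2014) cell keeps kyle and 10 and silently discards anna and 99, so 'first' is only a safe choice when you know duplicates are absent or do not matter.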