I don't understand the output of pandas' groupby. I started with a DataFrame (df0) with 5 fields/columns (zip, city, location, population, state).
You need the as_index=False parameter in groupby, or a call to reset_index, to convert the MultiIndex into columns:
df6 = df0.groupby(['city','state'], as_index=False)['pop'].sum()
Or:
df6 = df0.groupby(['city','state'])['pop'].sum().reset_index()
Sample:
import pandas as pd
df0 = pd.DataFrame({'city':['a','a','b'],
                    'state':['t','t','n'],
                    'pop':[7,8,9]})
print (df0)
  city  pop state
0    a    7     t
1    a    8     t
2    b    9     n
df6 = df0.groupby(['city','state'], as_index=False)['pop'].sum()
print (df6)
  city state  pop
0    a     t   15
1    b     n    9
df6 = df0.groupby(['city','state'])['pop'].sum().reset_index()
print (df6)
  city state  pop
0    a     t   15
1    b     n    9
Finally, select with loc; to get a scalar, add item():
print (df6.loc[df6.state == 't', 'pop'])
0 15
Name: pop, dtype: int64
print (df6.loc[df6.state == 't', 'pop'].item())
15
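One caveat: item() only works when the selection holds exactly one value, otherwise it raises ValueError. A minimal sketch of that behaviour, reusing the sample data with one extra row added purely for illustration (the row for city 'c' and the names matches, value and single are assumptions, not part of the question):

```python
import pandas as pd

# sample data plus one hypothetical extra city in state 't'
df0 = pd.DataFrame({'city': ['a', 'a', 'b', 'c'],
                    'state': ['t', 't', 'n', 't'],
                    'pop': [7, 8, 9, 4]})
df6 = df0.groupby(['city', 'state'], as_index=False)['pop'].sum()

# two cities are now in state 't', so the selection has two rows
matches = df6.loc[df6.state == 't', 'pop']

try:
    value = matches.item()
except ValueError:
    # item() refuses to choose between several values
    value = None

# narrowing the mask to a single row makes item() usable again
single = df6.loc[(df6.city == 'a') & (df6.state == 't'), 'pop'].item()
```

So if the filter may match more than one row, it is probably safer to check len(matches) first or to filter on both city and state.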
But if you only need a lookup table, you can use a Series with a MultiIndex:
s = df0.groupby(['city','state'])['pop'].sum()
print (s)
city  state
a     t        15
b     n         9
Name: pop, dtype: int64
# select all cities with : and the state with a string such as 't'
# the output is a Series of length 1
print (s.loc[:, 't'])
city
a 15
Name: pop, dtype: int64
# if a scalar is needed, add item()
print (s.loc[:, 't'].item())
15
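If both the city and the state are known, passing the full key as a tuple to loc should return the scalar directly, and xs is another way to select on a single level. A short sketch on the same sample data (the names total and by_state are only illustrative):

```python
import pandas as pd

df0 = pd.DataFrame({'city': ['a', 'a', 'b'],
                    'state': ['t', 't', 'n'],
                    'pop': [7, 8, 9]})
s = df0.groupby(['city', 'state'])['pop'].sum()

# full (city, state) key: the result is already a scalar, no item() needed
total = s.loc[('a', 't')]

# cross-section on the 'state' level: a Series indexed by city
by_state = s.xs('t', level='state')

print(total)
print(by_state)
```

The tuple form suits a lookup table keyed by both columns, while xs keeps the result as a Series when several cities share a state.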