How can I get pandas' groupby command to return a DataFrame instead of a Series?

逝去的感伤 2021-01-19 15:56

I don't understand the output of pandas' groupby. I started with a DataFrame (df0) with 5 fields/columns (zip, city, location, population, state).

2 answers
  •  面向向阳花
    2021-01-19 16:27

    Pass as_index=False to groupby, or call reset_index() on the result, to turn the MultiIndex levels into ordinary columns:

    df6 = df0.groupby(['city','state'], as_index=False)['pop'].sum()
    

    Or:

    df6 = df0.groupby(['city','state'])['pop'].sum().reset_index()
    

    Sample:

    import pandas as pd

    df0 = pd.DataFrame({'city':['a','a','b'],
                       'state':['t','t','n'],
                       'pop':[7,8,9]})
    
    print (df0)
      city state  pop
    0    a     t    7
    1    a     t    8
    2    b     n    9
    
    df6 = df0.groupby(['city','state'], as_index=False)['pop'].sum()
    print (df6)
      city state  pop
    0    a     t   15
    1    b     n    9
    

    df6 = df0.groupby(['city','state'])['pop'].sum().reset_index()
    print (df6)
      city state  pop
    0    a     t   15
    1    b     n    9
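
As a quick check of what each call returns, here is a minimal sketch on the same toy data. The names `s`, `df_a` and `df_b` are introduced only for this illustration. It shows that the default call gives a Series with a MultiIndex, while both variants above give a DataFrame with `city` and `state` as regular columns.

```python
import pandas as pd

# Same toy data as in the sample above.
df0 = pd.DataFrame({'city': ['a', 'a', 'b'],
                    'state': ['t', 't', 'n'],
                    'pop': [7, 8, 9]})

# Default: the group keys become a MultiIndex and the result is a Series.
s = df0.groupby(['city', 'state'])['pop'].sum()

# Either option moves the keys into ordinary columns and yields a DataFrame.
df_a = df0.groupby(['city', 'state'], as_index=False)['pop'].sum()
df_b = s.reset_index()

print(type(s).__name__)      # Series
print(type(df_a).__name__)   # DataFrame
print(type(df_b).__name__)   # DataFrame
print(list(df_a.columns))    # ['city', 'state', 'pop']
```

If the MultiIndex should be kept but a DataFrame is still wanted, `s.to_frame()` is another option.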
    

    Finally, select with loc; to get a scalar instead of a Series, add item():

    print (df6.loc[df6.state == 't', 'pop'])
    0    15
    Name: pop, dtype: int64
    
    print (df6.loc[df6.state == 't', 'pop'].item())
    15
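
One caveat with item(): it only works when the selection contains exactly one element, otherwise pandas raises a ValueError. The sketch below assumes a hypothetical `df6` in which two cities share state 't' (this data is not from the question) and falls back to a plain list in that case.

```python
import pandas as pd

# Hypothetical aggregated frame: two rows share state 't'.
df6 = pd.DataFrame({'city': ['a', 'b', 'c'],
                    'state': ['t', 'n', 't'],
                    'pop': [15, 9, 4]})

matches = df6.loc[df6['state'] == 't', 'pop']

try:
    # Succeeds only if exactly one row matched.
    value = matches.item()
except ValueError:
    # Zero or several rows matched, so there is no single scalar to return.
    value = None

values = matches.tolist()
print(value)    # None
print(values)   # [15, 4]
```

Whether to raise, take the first match, or return all matches depends on what the lookup is meant to do; the fallback here is only one possible choice.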
    

    If you only need a lookup table, you can instead keep the result as a Series with a MultiIndex:

    s = df0.groupby(['city','state'])['pop'].sum()
    print (s)
    city  state
    a     t        15
    b     n         9
    Name: pop, dtype: int64
    
    # select all cities with : and the state by its label, e.g. 't'
    # the output is a Series of length 1
    print (s.loc[:, 't'])
    city
    a    15
    Name: pop, dtype: int64
    
    # to get the output as a scalar, add item()
    print (s.loc[:, 't'].item())
    15
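
When both key levels are known, the Series can also be indexed with the full (city, state) tuple, which returns the scalar directly. The following is a small sketch on the same toy data; the names `one` and `by_state` are introduced only for illustration, and `xs` is shown as an alternative way to select a single level.

```python
import pandas as pd

df0 = pd.DataFrame({'city': ['a', 'a', 'b'],
                    'state': ['t', 't', 'n'],
                    'pop': [7, 8, 9]})

s = df0.groupby(['city', 'state'])['pop'].sum()

# Full key (city, state): returns a scalar, no item() needed.
one = s.loc[('a', 't')]

# Cross-section on the 'state' level: returns a Series indexed by city.
by_state = s.xs('t', level='state')

print(one)                  # 15
print(by_state.to_dict())   # {'a': 15}
```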
    
