Pandas: Group by a column that meets a condition

既然无缘 2021-01-18 18:23

I have a data set with three columns: rating, breed, and dog.

import pandas as pd
dogs = {'breed': ['Chihuahua', 'Chihuahua', 'Dalmatian', 'Sphynx'

2 Answers
  • 2021-01-18 19:05

    An alternative solution is to make dog one of your grouping keys, then filter on dog in a separate step. This is the better fit if you do not want to lose the aggregated data for non-dogs, because the full result is computed once and can be filtered either way.

    res = df.groupby(['dog', 'breed'])['rating'].mean().reset_index()
    
    print(res)
    
         dog      breed  rating
    0  False     Sphynx     7.0
    1   True  Chihuahua     8.5
    2   True  Dalmatian    10.0
    
    print(res[res['dog']])
    
        dog      breed  rating
    1  True  Chihuahua     8.5
    2  True  Dalmatian    10.0
    
  • 2021-01-18 19:08

    Once you call groupby and select a single column, the dog column no longer exists in the object you are working with (and even if it did, you are not accessing it correctly).

    Filter your DataFrame first, then use groupby with mean:

    df[df.dog].groupby('breed')['rating'].mean().reset_index()
    
           breed  rating
    0  Chihuahua     8.5
    1  Dalmatian    10.0
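
For anyone who wants to run both answers side by side, here is a minimal sketch. The question's data is cut off, so the frame below is hypothetical: the breed list follows the question, while the individual ratings and the dog flags are assumptions picked only so that the per-breed means match the outputs shown in the answers.

```python
import pandas as pd

# Hypothetical data: the original frame is truncated in the question, so the
# individual ratings and dog flags are assumed to reproduce the means above.
df = pd.DataFrame({
    "breed": ["Chihuahua", "Chihuahua", "Dalmatian", "Sphynx"],
    "rating": [8, 9, 10, 7],
    "dog": [True, True, True, False],
})

# After selecting "rating", each group is a Series of ratings only;
# the dog column is not available inside it.
chihuahuas = df.groupby("breed")["rating"].get_group("Chihuahua")

# Approach 1: group on both keys, then filter the aggregated result.
by_both = df.groupby(["dog", "breed"])["rating"].mean().reset_index()
dogs_after = by_both[by_both["dog"]].drop(columns="dog").reset_index(drop=True)

# Approach 2: filter the rows first, then group.
dogs_before = df[df["dog"]].groupby("breed")["rating"].mean().reset_index()

print(dogs_after)
print(dogs_before)
```

Both approaches should give the same per-breed means for dogs. The first also leaves the non-dog aggregate available in by_both, while the second never computes it.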
    