Pandas, groupby where column value is greater than x

日久生厌 2021-01-18 08:16

I have a table like this:

    timestamp   avg_hr  hr_quality  avg_rr  rr_quality  activity    sleep_summary_id

    1422404668  66      229             0              


        
2 Answers
  • 2021-01-18 08:34

    The simplest thing to do here is to filter the DataFrame first and then perform the groupby on the filtered result:

    filtered = df2[df2['rr_quality'] > 0]
    filtered.groupby([filtered.index.hour, 'sleep_summary_id'])
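
A minimal sketch of the filter-then-group approach. The sample values are made up (only the column names come from the question), and the mean aggregation is an assumption added for illustration. The hour key is taken from the filtered frame so that it has the same length as the rows being grouped:

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
df2 = pd.DataFrame(
    {
        "avg_hr": [60, 70, 80, 90],
        "rr_quality": [0, 200, 180, 0],
        "sleep_summary_id": [1, 1, 1, 2],
    },
    index=pd.to_datetime(
        ["2015-01-28 00:10", "2015-01-28 00:20",
         "2015-01-28 00:40", "2015-01-28 01:05"]
    ),
)

# Keep only the rows that pass the quality check.
filtered = df2[df2["rr_quality"] > 0]

# Group by the hour of the filtered index and by sleep_summary_id,
# then aggregate avg_hr (mean chosen here purely as an example).
hours = filtered.index.hour.rename("hour")
result = filtered.groupby([hours, "sleep_summary_id"])["avg_hr"].mean()
print(result)
```

Rows with rr_quality equal to 0 never reach the groupby, so they cannot influence any group's aggregate.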
    

    EDIT

    If you're intending to assign this back to your original df:

    mask = df2['rr_quality'] > 0
    df2.loc[mask, 'AVG_HR'] = df2[mask].groupby([df2[mask].index.hour, 'sleep_summary_id'])['avg_hr'].transform('mean')
    

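    The loc call masks the left-hand side, so only the rows that pass the condition are assigned and the result of the transform aligns with them on the index. Rows that fail the condition are left as NaN in the new column.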
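
As an illustration of the assignment described above, here is a small sketch with made-up values (only the column names come from the question). It assumes the same condition is used on both sides of the assignment and that the index has no duplicate timestamps:

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
df2 = pd.DataFrame(
    {
        "avg_hr": [60, 70, 80, 90],
        "rr_quality": [0, 200, 180, 0],
        "sleep_summary_id": [1, 1, 1, 2],
    },
    index=pd.to_datetime(
        ["2015-01-28 00:10", "2015-01-28 00:20",
         "2015-01-28 00:40", "2015-01-28 01:05"]
    ),
)

# One boolean mask, reused for both the selection and the assignment.
mask = df2["rr_quality"] > 0
valid = df2[mask]

# transform returns one value per input row, indexed like `valid`.
group_mean = valid.groupby(
    [valid.index.hour, "sleep_summary_id"]
)["avg_hr"].transform("mean")

# Only the masked rows receive a value; the others stay NaN.
df2.loc[mask, "AVG_HR"] = group_mean
print(df2)
```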

    To filter using multiple conditions, use the element-wise operators &, | and ~ for and, or and not respectively. You also need to wrap each condition in parentheses, because these operators bind more tightly than comparisons such as >= and >:

    df2[(df2['rr_quality'] >= 150) & (df2['hr_quality'] > 200)]
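
A short sketch of the three operators on made-up values (the thresholds are the ones used above, the data is hypothetical):

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
df2 = pd.DataFrame(
    {
        "rr_quality": [0, 150, 200, 180],
        "hr_quality": [229, 210, 190, 250],
    }
)

# and: both conditions must hold for a row to be kept.
both = df2[(df2["rr_quality"] >= 150) & (df2["hr_quality"] > 200)]

# or: at least one condition must hold.
either = df2[(df2["rr_quality"] >= 150) | (df2["hr_quality"] > 200)]

# not: invert a condition.
low_rr = df2[~(df2["rr_quality"] >= 150)]

print(both.index.tolist(), either.index.tolist(), low_rr.index.tolist())
```

Without the parentheses, Python would try to evaluate `150 & df2["hr_quality"]` first, which is not what is intended.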
    
  • 2021-01-18 08:53

    I know this is old, but I wanted to add that pandas has a built-in method for this, GroupBy.filter. Adapting the example from the pandas documentation to your case:

    grouped_df2 = df2.groupby([df2.index.hour, 'sleep_summary_id', 'rr_quality'])
    # filter expects a single boolean per group, so reduce the comparison with .all()
    grouped_df2.filter(lambda x: (x['rr_quality'] > 0.).all())
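
Note that GroupBy.filter keeps or drops whole groups and returns a DataFrame rather than a grouped object. A minimal sketch with made-up values, grouping only by sleep_summary_id to make the difference from row-wise filtering visible (that grouping choice is an assumption for illustration):

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
df2 = pd.DataFrame(
    {
        "sleep_summary_id": [1, 1, 2, 2],
        "rr_quality": [200, 0, 180, 160],
        "avg_hr": [60, 70, 80, 90],
    }
)

# Keep a group only if every row in it has rr_quality > 0.
kept = df2.groupby("sleep_summary_id").filter(
    lambda g: (g["rr_quality"] > 0).all()
)
print(kept)
```

Group 1 is dropped entirely because one of its rows has rr_quality equal to 0, even though its other row passes the check.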
    