How to select rows based on conditions for each id

爱一瞬间的悲伤 2021-01-29 04:11

I have the following data frame:

Hotel_id    Month_Year      Chef_Id  Chef_is_masterchef  Transition
2400188     February-2018   4597566     1                             


        
1 answer
  • 2021-01-29 04:31

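    Use GroupBy.cumcount to number the rows within each Chef_Id group, then subtract the number of 0 values in that group, obtained by comparing Chef_is_masterchef with 0 and summing with GroupBy.transform. This assumes each chef's rows are in chronological order and that all 0 rows come before the 1 rows, so that the first master-chef month gets 0: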

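    # number of 0 rows per chef, repeated on every row of that chef
    s = df['Chef_is_masterchef'].eq(0).groupby(df['Chef_Id']).transform('sum')
    # position within the chef's rows, shifted so the first 1 row becomes 0
    df['var'] = df.groupby('Chef_Id').cumcount().sub(s)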
    

    print(df)
        Hotel_id      Month_Year  Chef_Id  Chef_is_masterchef  Transition  var
    0    2400614        May-2015  2297544                   0           0   -8
    1    2400614       June-2015  2297544                   0           0   -7
    2    2400614       July-2015  2297544                   0           0   -6
    3    2400614     August-2015  2297544                   0           0   -5
    4    2400614  September-2015  2297544                   0           0   -4
    5    2400614    October-2015  2297544                   0           0   -3
    6    2400614   November-2015  2297544                   0           0   -2
    7    2400614   December-2015  2297544                   0           0   -1
    8    2400614    January-2016  2297544                   1           1    0
    9    2400614   February-2016  2297544                   1           0    1
    10   2400614      March-2016  2297544                   1           0    2
    11   3400624        May-2016  2597531                   0           0   -3
    12   3400624       June-2016  2597531                   0           0   -2
    13   3400624       July-2016  2597531                   0           0   -1
    14   3400624     August-2016  2597531                   1           1    0
    15   2400133   February-2016  4597531                   0           0   -6
    16   2400133      March-2016  4597531                   0           0   -5
    17   2400133      April-2016  4597531                   0           0   -4
    18   2400133        May-2016  4597531                   0           0   -3
    19   2400133       June-2016  4597531                   0           0   -2
    20   2400133       July-2016  4597531                   0           0   -1
    21   2400133     August-2016  4597531                   1           1    0
    22   2400133  September-2016  4597531                   1           0    1
    23   2400133    October-2016  4597531                   1           0    2
    24   2400133   November-2016  4597531                   1           0    3
    25   2400133   December-2016  4597531                   1           0    4
    26   2400133    January-2017  4597531                   1           0    5
    27   2400133   February-2017  4597531                   1           0    6
    28   2400133      March-2017  4597531                   1           0    7
    29   2400133      April-2017  4597531                   1           0    8
    30   2400133        May-2017  4597531                   1           0    9
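
To make the mechanics easier to check by hand, here is a minimal, self-contained sketch of the same idea on a made-up frame. The two chef ids and their values are hypothetical and only stand in for the real data; the rows are assumed to be already sorted by month within each chef.

```python
import pandas as pd

# Hypothetical miniature of the question's data: two chefs, rows already in
# chronological order, 0 = not yet a master chef, 1 = master chef.
df = pd.DataFrame({
    "Chef_Id": [1, 1, 1, 1, 2, 2, 2],
    "Chef_is_masterchef": [0, 0, 1, 1, 0, 1, 1],
})

# Number of 0 rows per chef, broadcast back to every row of that chef.
zeros = df["Chef_is_masterchef"].eq(0).groupby(df["Chef_Id"]).transform("sum")

# Row position within each chef (0, 1, 2, ...) minus the number of 0 rows,
# so the first 1 row of each chef lands on 0.
df["var"] = df.groupby("Chef_Id").cumcount().sub(zeros)
print(df)
```

Chef 1 has two 0 rows, so its offsets run from -2 to 1; chef 2 has one 0 row, so its offsets run from -1 to 1.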
    

    Finally, filter with Series.between, which includes both endpoints by default:

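    # from 3 months before the transition up to 2 months after it
    df1 = df[df['var'].between(-3, 2)]
    print(df1)

    # from 6 months before the transition up to 5 months after it
    df2 = df[df['var'].between(-6, 5)]
    print(df2)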
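
If several window sizes are needed, the filter could be wrapped in a small helper. This is only a sketch: the function name `window_around_transition` and its parameters are made up for illustration, and the toy frame below is hypothetical data that already carries a `var` offset column.

```python
import pandas as pd

def window_around_transition(df, before, after, col="var"):
    """Keep rows whose offset lies in [-before, after], both ends inclusive."""
    return df[df[col].between(-before, after)]

# Hypothetical frame with one chef and precomputed offsets.
df = pd.DataFrame({
    "Chef_Id": [1, 1, 1, 1, 1, 1],
    "var": [-3, -2, -1, 0, 1, 2],
})

# One month before the transition, the transition month, one month after.
out = window_around_transition(df, before=1, after=1)
print(out)
```

Because `between` is inclusive on both sides, the transition month itself (offset 0) is always part of the result.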
    