Can I create a new column based on when the value changes in another column?

后端 未结 1 1093
借酒劲吻你
借酒劲吻你 2021-01-14 18:18

Let s say I have this df

print(df)
              DATE_TIME  A  B
0   10/08/2016 12:04:56  1  5
1   10/08/2016 12:04:58  1  6
2   10/08/2016 12:0         


        
1条回答
  •  借酒劲吻你
    2021-01-14 19:05

    You can use the shift-cumsum pattern.

    df['C'] = (df.A != df.A.shift()).cumsum()
    
    >>> df
                  DATE_TIME  A  B  C
    0  10/08/2016  12:04:56  1  5  1
    1  10/08/2016  12:04:58  1  6  1
    2  10/08/2016  12:04:59  2  3  2
    3  10/08/2016  12:05:00  2  2  2
    4  10/08/2016  12:05:01  3  4  3
    5  10/08/2016  12:05:02  3  6  3
    6  10/08/2016  12:05:03  1  3  4
    7  10/08/2016  12:05:04  1  2  4
    8  10/08/2016  12:05:05  2  4  5
    9  10/08/2016  12:05:06  2  6  5
    10 10/08/2016  12:05:07  3  4  6
    11 10/08/2016  12:05:08  3  2  6
    

    As a side note, this is a popular pattern for grouping. For example, to get the average B value of each such group:

    df.groupby((df.A != df.A.shift()).cumsum()).B.mean()
    

    0 讨论(0)
提交回复
热议问题