Duplicate rows in pandas DF

一向 2020-12-13 16:18

I have a pandas DataFrame that looks like this:

Letters Numbers
A       1
A       3
A       2
A       1
B       1
B       2
B       3
C       2
C       2

2 answers
  • 2020-12-13 16:43

    You can groupby these two columns and then calculate the sizes of the groups:

    In [16]: df.groupby(['Letters', 'Numbers']).size()
    Out[16]: 
    Letters  Numbers
    A        1          2
             2          1
             3          1
    B        1          1
             2          1
             3          1
    C        2          2
    dtype: int64
    

    To get a DataFrame like the one in your example output, reset the index of the result with reset_index().
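
A minimal, self-contained sketch of the approach in this answer, assuming the sample frame from the question. The count column name `Events` is an assumption borrowed from the other answer; any name can be passed to `reset_index(name=...)`.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "Letters": list("AAAABBBCC"),
    "Numbers": [1, 3, 2, 1, 1, 2, 3, 2, 2],
})

# Count rows per (Letters, Numbers) pair, then turn the
# MultiIndex back into ordinary columns.
# "Events" is a hypothetical name for the count column.
counts = (
    df.groupby(["Letters", "Numbers"])
      .size()
      .reset_index(name="Events")
)
print(counts)
```

Note that the result is sorted by the group keys, so the rows do not keep the order they had in the original frame.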

  • 2020-12-13 16:59

    You can use a combination of groupby, transform, and drop_duplicates:

    In [84]:
    
    # Count each (Letters, Numbers) pair and broadcast the count back to every row
    df['Events'] = df.groupby(['Letters', 'Numbers'])['Numbers'].transform('size')
    df.drop_duplicates()
    Out[84]:
      Letters  Numbers  Events
    0       A        1       2
    1       A        3       1
    2       A        2       1
    4       B        1       1
    5       B        2       1
    6       B        3       1
    7       C        2       2
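
A self-contained sketch of the groupby / transform / drop_duplicates idea, assuming the sample frame from the question. Here the count is obtained by grouping on both columns and broadcasting each group's size back to its rows with `transform("size")`; that particular choice is an assumption of this sketch, not the only way to compute it.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "Letters": list("AAAABBBCC"),
    "Numbers": [1, 3, 2, 1, 1, 2, 3, 2, 2],
})

# transform returns one value per original row, so the result
# lines up with df and can be assigned as a new column.
df["Events"] = df.groupby(["Letters", "Numbers"])["Numbers"].transform("size")

# Rows that repeat a (Letters, Numbers, Events) triple are now exact
# duplicates, so drop_duplicates keeps the first occurrence of each.
result = df.drop_duplicates()
print(result)
```

Unlike the groupby-and-size approach, this keeps the original row order and index labels of the first occurrence of each pair.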
    