Shuffle rows by a column in pandas

后端 未结 2 1271
礼貌的吻别
礼貌的吻别 2021-01-21 21:40

I have the following example of dataframe.

    c1     c2
0   1       a
1   2       b
2   3       c
3   4       d
4   5       e

Given a template

相关标签:
2条回答
  • 2021-01-21 22:01

    merge

    You can create a dataframe with the column specified in the wanted order then merge.
    One advantage of this approach is that it gracefully handles duplicates in either df.c1 or the list c1. If duplicates not wanted then care must be taken to handle them prior to reordering.

    d1 = pd.DataFrame({'c1': c1})
    
    d1.merge(df)
    
       c1 c2
    0   3  c
    1   2  b
    2   5  e
    3   4  d
    4   1  a
    

    searchsorted

    This is less robust but will work if df.c1 is:

    • already sorted
    • one-to-one mapping

    df.iloc[df.c1.searchsorted(c1)]
    
       c1 c2
    2   3  c
    1   2  b
    4   5  e
    3   4  d
    0   1  a
    
    0 讨论(0)
  • 2021-01-21 22:21

    If values are unique in list and also in c1 column use reindex:

    df = df.set_index('c1').reindex(c1).reset_index()
    print (df)
       c1 c2
    0   3  c
    1   2  b
    2   5  e
    3   4  d
    4   1  a
    

    General solution working with duplicates in list and also in column:

    c1 = [3, 2, 5, 4, 1, 3, 2, 3]
    
    #create df from list 
    list_df = pd.DataFrame({'c1':c1})
    print (list_df)
       c1
    0   3
    1   2
    2   5
    3   4
    4   1
    5   3
    6   2
    7   3
    
    #helper column for count duplicates values
    df['g'] = df.groupby('c1').cumcount()
    list_df['g'] = list_df.groupby('c1').cumcount()
    
    #merge together, create index from column and remove g column
    df = list_df.merge(df).drop('g', axis=1)
    print (df)
       c1 c2
    0   3  c
    1   2  b
    2   5  e
    3   4  d
    4   1  a
    5   3  c
    
    0 讨论(0)
提交回复
热议问题