Python pandas: replace values multiple columns matching multiple columns from another dataframe

清歌不尽 2020-12-31 20:42

I searched a lot for an answer; the closest question was "Compare 2 columns of 2 different pandas dataframes, if the same insert 1 into the other in Python", but the answer to

2 answers
  •  被撕碎了的回忆
    2020-12-31 21:02

    You can use the Series.update method (this requires moving the matching columns into the index). I've modified your sample data to introduce a mismatch.

    # your data
    # =====================
    # pos in the first row of df1 is changed from 10020 to 10010, so that row has no match in df2
    print(df1)
    
       chr      snp  x    pos a1 a2
    0    1  1-10020  0  10010  G  A
    1    1  1-10056  0  10056  C  G
    2    1  1-10108  0  10108  C  G
    3    1  1-10109  0  10109  C  G
    4    1  1-10139  0  10139  C  T
    
    print(df2)
    
                ID  CHR   STOP  OCHR  OSTOP
    0  rs376643643    1  10040     1  10020
    1  rs373328635    1  10066     1  10056
    2   rs62651026    1  10208     1  10108
    3  rs376007522    1  10209     1  10109
    4  rs368469931    3  30247     1  10139
    
    # processing
    # ==========================
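    # move the matching columns into a multi-level index so the two series align on (chromosome, position)
    x1 = df1.set_index(['chr', 'pos'])['snp']
    x2 = df2.set_index(['OCHR', 'OSTOP'])['ID']
    # update works in place: keys found in x2 take its value, all other rows are left unchanged
    x1.update(x2)
    # write the values back to df1 (x1 is still in df1's row order)
    df1['snp'] = x1.values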
    print(df1)
    
       chr          snp  x    pos a1 a2
    0    1      1-10020  0  10010  G  A
    1    1  rs373328635  0  10056  C  G
    2    1   rs62651026  0  10108  C  G
    3    1  rs376007522  0  10109  C  G
    4    1  rs368469931  0  10139  C  T
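
For anyone who wants to run the steps above end to end, here is a minimal self-contained sketch. The DataFrame construction is an assumption: it is typed in by hand from the printed tables in this answer, since the original loading code is not shown. It also assumes the (OCHR, OSTOP) pairs in df2 are unique, as they are in the sample.

```python
import pandas as pd

# Sample data retyped from the printed tables above (assumed, not the original loading code).
df1 = pd.DataFrame({
    "chr": [1, 1, 1, 1, 1],
    "snp": ["1-10020", "1-10056", "1-10108", "1-10109", "1-10139"],
    "x": [0, 0, 0, 0, 0],
    "pos": [10010, 10056, 10108, 10109, 10139],  # first row deliberately mismatched
    "a1": ["G", "C", "C", "C", "C"],
    "a2": ["A", "G", "G", "G", "T"],
})
df2 = pd.DataFrame({
    "ID": ["rs376643643", "rs373328635", "rs62651026", "rs376007522", "rs368469931"],
    "CHR": [1, 1, 1, 1, 3],
    "STOP": [10040, 10066, 10208, 10209, 30247],
    "OCHR": [1, 1, 1, 1, 1],
    "OSTOP": [10020, 10056, 10108, 10109, 10139],
})

# Index both series by the matching key (chromosome, position).
x1 = df1.set_index(["chr", "pos"])["snp"]
x2 = df2.set_index(["OCHR", "OSTOP"])["ID"]

# In-place update: only keys present in x2 are overwritten.
x1.update(x2)

# x1 keeps df1's row order, so the values can be assigned back positionally.
df1["snp"] = x1.to_numpy()
print(df1)
```

Because the first row's key (1, 10010) does not appear in df2, its snp value is left as it was, while the other four rows pick up the matching ID.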
    
