Merging two pandas dataframes on multiple columns

Submitted by ε祈祈猫儿з on 2021-02-16 22:00:16

Question


I have two dataframes:

>>> df1
[Output]: col1   col2   col3   col4
           a     abc     10    str1
           b     abc     20    str2
           c     def     20    str2
           d     abc     30    str2

>>> df2
[Output]: col1   col2   col3   col5   col6
           d     abc     30    str6    47
           b     abc     20    str5    66
           c     def     20    str7    53
           a     abc     10    str5    21

Below is what I want to generate:

>>> df_merged
[Output]: col1   col2   col5
           a     abc    str5
           b     abc    str5 
           c     def    str7
           d     abc    str6

The result should have exactly 4 rows, but my attempts to merge the dataframes usually produce more than that. Thanks for the tips!


Answer 1:


Use .merge after selecting only the columns you need from each dataframe, with col1 and col2 as the key columns:

df1[['col1', 'col2']].merge(df2[['col1', 'col2', 'col5']], on=['col1', 'col2'])

  col1 col2  col5
0    a  abc  str5
1    b  abc  str5
2    c  def  str7
3    d  abc  str6
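
To see where the extra rows come from, here is a minimal sketch that rebuilds the two frames from the question by hand (the literal values are copied from the printed output above; the variable name too_many is only illustrative). It compares a merge on col2 alone with a merge on both keys, and passes validate="one_to_one" so that pandas raises a MergeError instead of silently multiplying rows if a key pair ever repeats.

```python
import pandas as pd

# Frames rebuilt from the output printed in the question.
df1 = pd.DataFrame({
    "col1": ["a", "b", "c", "d"],
    "col2": ["abc", "abc", "def", "abc"],
    "col3": [10, 20, 20, 30],
    "col4": ["str1", "str2", "str2", "str2"],
})
df2 = pd.DataFrame({
    "col1": ["d", "b", "c", "a"],
    "col2": ["abc", "abc", "def", "abc"],
    "col3": [30, 20, 20, 10],
    "col5": ["str6", "str5", "str7", "str5"],
    "col6": [47, 66, 53, 21],
})

# Joining on col2 alone pairs every "abc" row in df1 with every
# "abc" row in df2, which is how the row count grows past 4.
too_many = df1[["col1", "col2"]].merge(df2[["col2", "col5"]], on="col2")
print(len(too_many))

# Joining on both keys leaves one match per row. validate checks
# that each (col1, col2) pair is unique on both sides.
df_merged = df1[["col1", "col2"]].merge(
    df2[["col1", "col2", "col5"]],
    on=["col1", "col2"],
    validate="one_to_one",
)
print(df_merged)
```

In this data col2 is not unique, so it cannot serve as a key on its own; col1 together with col2 identifies each row.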



Answer 2:


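import pandas as pd

# Assumes unique keys in both frames and df1 already sorted by col1.
df2_sorted = df2.sort_values('col1').reset_index(drop=True)
df_merged = pd.DataFrame()
df_merged['col1'] = df1['col1']
df_merged['col2'] = df1['col2']
df_merged['col5'] = df2_sorted['col5']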

Does that help with what you're looking for?
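
One caveat with copying columns across: assigning a Series to a column pairs values by index label, not by key, so the result only matches the desired output when both frames list their keys in the same order. A minimal sketch, assuming trimmed-down copies of the question's frames (the names by_position and by_key_order are only illustrative):

```python
import pandas as pd

# Trimmed-down copies of the frames from the question.
df1 = pd.DataFrame({
    "col1": ["a", "b", "c", "d"],
    "col2": ["abc", "abc", "def", "abc"],
})
df2 = pd.DataFrame({
    "col1": ["d", "b", "c", "a"],
    "col5": ["str6", "str5", "str7", "str5"],
})

# Plain assignment aligns on the index labels 0..3, so row "a"
# receives the col5 value that belongs to df2's first row ("d").
by_position = df1.copy()
by_position["col5"] = df2["col5"]
print(by_position)

# Sorting df2 by the key and resetting its index puts both
# frames in the same order before the column is copied.
df2_sorted = df2.sort_values("col1").reset_index(drop=True)
by_key_order = df1.copy()
by_key_order["col5"] = df2_sorted["col5"]
print(by_key_order)
```

This works only while every key appears exactly once in both frames; a merge on the key columns does not depend on row order at all.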



Source: https://stackoverflow.com/questions/57173240/merging-two-pandas-dataframes-on-multiple-columns
