Pandas merge creates unwanted duplicate entries

天涯浪人 2021-02-08 10:45

I'm new to Pandas and I want to merge two datasets that have similar columns. Each column is going to have some unique values compared to the other column, in addition to

4 answers
  •  谎友^ (original poster)
    2021-02-08 11:30

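    import pandas as pd

    dict1 = {'A': [2, 2, 3, 4, 5]}
    dict2 = {'A': [2, 2, 3, 4, 5]}

    # Give each frame an explicit positional key so repeated values can be told apart.
    df1 = pd.DataFrame(dict1)
    df1['index'] = range(len(df1))
    df2 = pd.DataFrame(dict2)
    df2['index'] = range(len(df2))

    # With no `on`, merge joins on all shared columns ('A' and 'index').
    # drop() returns a new frame, so keep the result instead of using inplace=True.
    merged = df1.merge(df2).drop(columns='index')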
    

    The idea is to merge on the matching positional index as well as on matching 'A' column values.
    Previously, merging on 'A' alone paired every row with every row that shares its value: the first 2 in df1 was matched to both the first and second 2 in df2, and the second 2 in df1 was matched to both of them as well, so the two 2s produced four rows instead of two.
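
A positional index only pairs rows correctly when both frames list their values in the same order. As a hedged sketch of an alternative (not part of the original answer), each repeat of a value can be numbered with `groupby(...).cumcount()` and that counter used as the second merge key. The column name `occurrence` and the reordered `df2` are assumptions made up for illustration.

```python
import pandas as pd

df1 = pd.DataFrame({"A": [2, 2, 3, 4, 5]})
df2 = pd.DataFrame({"A": [5, 4, 3, 2, 2]})  # same values, different row order

# Number each repeat of a value: the first 2 gets 0, the second 2 gets 1.
df1["occurrence"] = df1.groupby("A").cumcount()
df2["occurrence"] = df2.groupby("A").cumcount()

# Merging on both columns pairs the n-th 2 in df1 with the n-th 2 in df2.
merged = df1.merge(df2, on=["A", "occurrence"]).drop(columns="occurrence")
print(merged)
```

The result has one row per original row, because no (`A`, `occurrence`) pair repeats within either frame.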

    Running the plain merge on 'A' alone shows the problem:

    import pandas as pd

    dict1 = {'A': [2, 2, 3, 4, 5]}
    dict2 = {'A': [2, 2, 3, 4, 5]}

    df1 = pd.DataFrame(dict1)
    df1['index'] = range(len(df1))
    df2 = pd.DataFrame(dict2)
    df2['index'] = range(len(df2))

    # Joining on 'A' alone pairs each 2 in df1 with both 2s in df2 (4 rows for that value).
    df1.merge(df2, on='A')
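
To confirm the duplication without reading the output by eye, one option is to count the rows and to ask merge to check key uniqueness through its `validate` parameter. This is a minimal sketch, assuming a pandas version that supports `validate` and exposes `pd.errors.MergeError`; the names `plain`, `rows_for_2`, and `duplicates_detected` are illustrative only.

```python
import pandas as pd

df1 = pd.DataFrame({"A": [2, 2, 3, 4, 5]})
df2 = pd.DataFrame({"A": [2, 2, 3, 4, 5]})

# Plain merge on 'A': every 2 on the left pairs with every 2 on the right.
plain = df1.merge(df2, on="A")
rows_for_2 = int((plain["A"] == 2).sum())  # 2 left rows x 2 right rows
print(len(plain), rows_for_2)

# validate="one_to_one" makes merge raise if the key repeats on either side.
try:
    df1.merge(df2, on="A", validate="one_to_one")
    duplicates_detected = False
except pd.errors.MergeError:
    duplicates_detected = True
print(duplicates_detected)
```

Raising early like this can be easier to notice than a silently larger result, though whether it suits a given pipeline is a judgment call.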
    
