Pandas merge creates unwanted duplicate entries

后端未结

关注

 4  2139

天涯浪人 2021-02-08 10:45

I\'m new to Pandas and I want to merge two datasets that have similar columns. The columns are going to each have some unique values compared to the other column, in addition to

4条回答

谎友^ (楼主)

2021-02-08 11:30
```
dict1 = {'A':[2,2,3,4,5]}
dict2 = {'A':[2,2,3,4,5]}

df1 = pd.DataFrame(dict1)
df1['index'] = [i for i in range(len(df1))]
df2 = pd.DataFrame(dict2)
df2['index'] = [i for i in range(len(df2))]

df1.merge(df2).drop('index', 1, inplace = True)
```
The idea is to merge based on the matching indices as well as matching 'A' column values.
Previously, since the way merge works depends on matches, what happened is that the first 2 in df1 was matched to both the first and second 2 in df2, and the second 2 in df1 was matched to both the first and second 2 in df2 as well.

If you try this, you will see what I am talking about.
```
dict1 = {'A':[2,2,3,4,5]}
dict2 = {'A':[2,2,3,4,5]}

df1 = pd.DataFrame(dict1)
df1['index'] = [i for i in range(len(df1))]
df2 = pd.DataFrame(dict2)
df2['index'] = [i for i in range(len(df2))]

df1.merge(df2, on = 'A')
```
0 讨论(0)

查看其它4个回答
发布评论:

提交评论
- 加载中...