I am trying to join two pandas DataFrames on an id field, which is a string UUID. I get a ValueError:
ValueError: You are trying to merge on object and int64 columns
The on parameter only applies to the calling DataFrame!
on: Column or index level name(s) in the caller to join on the index in other, otherwise joins index-on-index.
Though you specify on='id', it will use the 'id' column in pdf, which has object dtype, and attempt to join it with the index of outputsPdf, which takes integer values.
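A minimal sketch of that mismatch, using hypothetical frames named left and right in place of the ones from the question: the caller's 'id' column holds strings while the other frame's default index holds integers, and the same join call works once the other frame is indexed by id.

```python
import pandas as pd

# Hypothetical stand-ins for the two frames in the question.
left = pd.DataFrame({'id': ['a1', 'b2', 'c3'], 'vals': [10, 11, 12]})
right = pd.DataFrame({'id': ['a1', 'b2', 'c3'], 'vals': [20, 21, 22]})

# join(on='id') compares left['id'] (strings) with right.index (default integers).
key_is_int = pd.api.types.is_integer_dtype(left['id'].dtype)
index_is_int = pd.api.types.is_integer_dtype(right.index.dtype)
print(key_is_int, index_is_int)

# Once the other frame is indexed by id, the same join call lines up.
result = left.join(right.set_index('id'), on='id', how='left', rsuffix='_fs')
print(result)
```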
If you need to join on non-index columns across two DataFrames, you can either set them as the index or use merge, as the on parameter in pd.merge applies to both DataFrames.
import pandas as pd
df1 = pd.DataFrame({'id': ['1', 'True', '4'], 'vals': [10, 11, 12]})
df2 = df1.copy()
df1.join(df2, on='id', how='left', rsuffix='_fs')
ValueError: You are trying to merge on object and int64 columns. If you wish to proceed you should use pd.concat
On the other hand, these work:
df1.set_index('id').join(df2.set_index('id'), how='left', rsuffix='_fs').reset_index()
#      id  vals  vals_fs
# 0     1    10       10
# 1  True    11       11
# 2     4    12       12
df1.merge(df2, on='id', how='left', suffixes=['', '_fs'])
#      id  vals  vals_fs
# 0     1    10       10
# 1  True    11       11
# 2     4    12       12
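As a quick sanity check, the two working approaches can be compared directly. This is only a sketch on the same toy data; the names via_join and via_merge are made up for illustration.

```python
import pandas as pd

df1 = pd.DataFrame({'id': ['1', 'True', '4'], 'vals': [10, 11, 12]})
df2 = df1.copy()

# Approach 1: move id into the index on both sides, join index-on-index, restore the column.
via_join = (
    df1.set_index('id')
       .join(df2.set_index('id'), how='left', rsuffix='_fs')
       .reset_index()
)

# Approach 2: merge, where on='id' refers to the column in both frames.
via_merge = df1.merge(df2, on='id', how='left', suffixes=['', '_fs'])

# Raises AssertionError if the two results differ.
pd.testing.assert_frame_equal(via_join, via_merge)
```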