问题
i am trying to join two data frames but cannot get my head around the possibilities Python has to offer.
First dataframe:
ID MODEL REQUESTS ORDERS
1 Golf 123 4
2 Passat 34 5
3 Model 3 500 8
4 M3 5 0
Second dataframe:
MODEL TYPE MAKE
Golf Sedan Volkswagen
M3 Coupe BMW
Model 3 Sedan Tesla
What I want is to add another column in the first dataframe called "make" so that it looks like this:
ID MODEL MAKE REQUESTS ORDERS
1 Golf Volkswagen 123 4
2 Passat Volkswagen 34 5
3 Model 3 Tesla 500 8
4 M3 BMW 5 0
I already looked at merge, join and map but all examples just appended the required information at the end of the dataframe.
回答1:
I think you can use insert with map by Series
created with df2
(if some value in column MODEL
in df2
is missing get NaN
):
df1.insert(2, 'MAKE', df1['MODEL'].map(df2.set_index('MODEL')['MAKE']))
print (df1)
ID MODEL MAKE REQUESTS ORDERS
0 1 Golf Volkswagen 123 4
1 2 Passat NaN 34 5
2 3 Model 3 Tesla 500 8
3 4 M3 BMW 5 0
回答2:
Although not in this case, but there might be scenarios where df2 has more than two columns and you would just want to add one out of those to df1 based on a specific column as key. Here is a generic code that you may find useful.
df = pd.merge(df1, df2[['MODEL', 'MAKE']], on = 'MODEL', how = 'left')
回答3:
The join
method acts very similarly to a VLOOKUP. It joins a column in the first dataframe with the index of the second dataframe so you must set MODEL
as the index in the second dataframe and only grab the MAKE
column.
df.join(df1.set_index('MODEL')['MAKE'], on='MODEL')
Take a look at the documentation for join as it actually uses the word VLOOKUP.
回答4:
I always found merge to be an easy way to do this:
df1.merge(df2[['MODEL', 'MAKE']], how = 'left')
However, I must admit it would not be as short and nice if you wanted to call the new column something else than 'MAKE'.
来源:https://stackoverflow.com/questions/41511730/python-function-similar-to-vlookup-excel