I have two dataframes. The first one has 1000 rows and looks like:
Date Group Family Bonus
2011-06-09 tri23_1 Laavin
You could also create a dictionary and use apply:
hotel_dict = df2.set_index('Group')['Hotel'].to_dict()
df1['Group'] = df1['Group'].apply(lambda x: hotel_dict[x])
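Just use pandas join; for details see http://pandas.pydata.org/pandas-docs/stable/merging.html. Note that `on` matches the 'Group' column of df1 against the index of the other frame, so index df2 by 'Group' first:
df1.join(df2.set_index('Group'), on='Group')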
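A minimal sketch of the join approach, assuming the lookup frame has 'Group' and 'Hotel' columns as in the examples elsewhere in this thread. The frames below are small stand-ins built from those example values, and `joined` and `result` are names made up for illustration:

```python
import pandas as pd

# Stand-in frames using the sample values shown in this thread.
df1 = pd.DataFrame({
    "Date": ["2011-06-09", "2011-07-09", "2011-09-10", "2011-11-02"],
    "Group": ["tri23_1", "hsgç_T2", "bbbj-1Y_jn", "hsgç_T2"],
    "Family": ["Laavin", "Grendy", "Fantol", "Gondow"],
    "Bonus": [456, 679, 431, 569],
})
df2 = pd.DataFrame({
    "Group": ["tri23_1", "hsgç_T2", "bbbj-1Y_jn", "mlkl_781", "vchs_94"],
    "Hotel": ["Jamel", "Frank", "Luxy", "Grand Hotel", "Vancouver"],
})

# join matches df1's 'Group' column against the index of the other frame,
# so the lookup frame is indexed by 'Group' before joining (left join by default).
joined = df1.join(df2.set_index("Group"), on="Group")

# Overwrite the group codes with the hotel names, then drop the helper column.
result = joined.assign(Group=joined["Hotel"]).drop(columns="Hotel")
print(result)
```

The join only adds a 'Hotel' column; the last step is what actually replaces the values in 'Group'.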
This is an old question, but here is another way to do it. It is not the idiomatic pandas way, but it is fast.
Reproducing dataframe 1 (the one to be updated):
df_1
Date Group Family Bonus
0 2011-06-09 tri23_1 Laavin 456
1 2011-07-09 hsgç_T2 Grendy 679
2 2011-09-10 bbbj-1Y_jn Fantol 431
3 2011-11-02 hsgç_T2 Gondow 569
Reproducing dataframe 2 (the lookup table):
df_2
Group Hotel
0 tri23_1 Jamel
1 hsgç_T2 Frank
2 bbbj-1Y_jn Luxy
3 mlkl_781 Grand Hotel
4 vchs_94 Vancouver
Get all the hotel IDs (the key column) from dataframe 1 as a list:
key_list = list(df_1['Group'])
['tri23_1', 'hsgç_T2', 'bbbj-1Y_jn', 'hsgç_T2']
Create a dictionary from the lookup dataframe, mapping the key column to the value column:
dict_lookup = dict(zip(df_2['Group'], df_2['Hotel']))
{'bbbj-1Y_jn': 'Luxy',
'hsgç_T2': 'Frank',
'mlkl_781': 'Grand Hotel',
'tri23_1': 'Jamel',
'vchs_94': 'Vancouver'}
Replace the values by building a list of looked-up values and assigning it to the dataframe 1 column:
df_1['Group'] = [dict_lookup[item] for item in key_list]
Updated dataframe 1
Date Group Family Bonus
0 2011-06-09 Jamel Laavin 456
1 2011-07-09 Frank Grendy 679
2 2011-09-10 Luxy Fantol 431
3 2011-11-02 Frank Gondow 569
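One caveat with the list comprehension above: `dict_lookup[item]` raises a KeyError if a group code has no entry in the lookup. A hedged variant, assuming you would rather keep the original code when no match exists, uses `dict.get` with a fallback. The key 'zzz_000' is made up to show the unmatched case:

```python
import pandas as pd

# 'zzz_000' is a hypothetical code with no entry in the lookup.
df_1 = pd.DataFrame({"Group": ["tri23_1", "hsgç_T2", "zzz_000"]})
dict_lookup = {"tri23_1": "Jamel", "hsgç_T2": "Frank"}

# dict.get(key, default) returns the default instead of raising KeyError,
# so unmatched codes are left unchanged.
df_1["Group"] = [dict_lookup.get(item, item) for item in df_1["Group"]]
print(df_1["Group"].tolist())
```

Whether to keep the old code, insert a placeholder, or fail loudly depends on your data; this is just one option.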
If you set the index of the lookup df to its 'Group' column, you can replace the values using map on the 'Group' column of your original df (here df is the original frame and df1 is the lookup frame):
In [36]:
df['Group'] = df['Group'].map(df1.set_index('Group')['Hotel'])
df
Out[36]:
Date Group Family Bonus
0 2011-06-09 Jamel Laavin 456
1 2011-07-09 Frank Grendy 679
2 2011-09-10 Luxy Fantol 431
3 2011-11-02 Frank Gondow 569
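When a Series is passed to `map`, any value with no matching index label comes back as NaN. A small sketch, assuming you want to see which group codes failed to match before overwriting the column. The names `lookup`, `hotel_by_group`, `mapped` and `unmatched`, and the code 'zzz_000', are made up for illustration:

```python
import pandas as pd

# 'zzz_000' is a hypothetical code that is missing from the lookup frame.
df = pd.DataFrame({"Group": ["tri23_1", "hsgç_T2", "zzz_000"]})
lookup = pd.DataFrame({"Group": ["tri23_1", "hsgç_T2"],
                       "Hotel": ["Jamel", "Frank"]})

# Series indexed by group code, with hotel names as values.
hotel_by_group = lookup.set_index("Group")["Hotel"]

# Map into a separate Series first so the original codes are still available.
mapped = df["Group"].map(hotel_by_group)

# Codes that found no hotel show up as NaN in the mapped result.
unmatched = df.loc[mapped.isna(), "Group"].unique()
print(unmatched)
```

If `unmatched` is empty, it is safe to assign `mapped` back to `df['Group']`.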
Columns in pandas DataFrames are just Series. Make the DataFrames (or DataFrame and Series, as shown here) share the same index so that assignment can occur from the Series to the DataFrame:
**In:**
import pandas as pd

df = pd.DataFrame(data=
    {'date': ['2011-06-09', '2011-07-09', '2011-09-10', '2011-11-02'],
     'family': ['Laavin', 'Grendy', 'Fantol', 'Gondow'],
     'bonus': ['456', '679', '431', '569']},
    index=pd.Index(name='Group', data=['tri23_1', 'hsgç_T2', 'bbbj-1Y_jn', 'hsgç_T2']))
**Out:**
date family bonus
Group
tri23_1 2011-06-09 Laavin 456
hsgç_T2 2011-07-09 Grendy 679
bbbj-1Y_jn 2011-09-10 Fantol 431
hsgç_T2 2011-11-02 Gondow 569
**In:**
hotel_groups = pd.Series(['Jamel', 'Frank', 'Luxy', 'Grand Hotel', 'Vancouver'],
index=pd.Index(name='Group', data=['tri23_1', 'hsgç_T2', 'bbbj-1Y_jn', 'mlkl_781', 'vchs_94']))
**Out:**
Group
tri23_1 Jamel
hsgç_T2 Frank
bbbj-1Y_jn Luxy
mlkl_781 Grand Hotel
vchs_94 Vancouver
dtype: object
**In:**
df['hotel'] = hotel_groups
**Out:**
date family bonus hotel
Group
tri23_1 2011-06-09 Laavin 456 Jamel
hsgç_T2 2011-07-09 Grendy 679 Frank
bbbj-1Y_jn 2011-09-10 Fantol 431 Luxy
hsgç_T2 2011-11-02 Gondow 569 Frank
Notice that the index of both is 'Group', which is what allows the assignment: when you assign a like-indexed Series to a DataFrame column, pandas aligns the values on the index.
This works even though df contains duplicate group values. It would not work if the hotel_groups Series had duplicate index values with different data values, e.g. two entries for index value hsgç_T2, the first with data value Frank and the second with data value Luxy (not that this would ever occur in your example). In that case there would be no way to know which value to assign to the like-indexed DataFrame column.
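A rough sketch of guarding against that case, assuming it is acceptable to keep the first entry per label (an arbitrary choice made here only for illustration). The duplicated 'hsgç_T2' entry is hypothetical and mirrors the Frank/Luxy conflict described above:

```python
import pandas as pd

df = pd.DataFrame(
    {"family": ["Laavin", "Grendy"]},
    index=pd.Index(["tri23_1", "hsgç_T2"], name="Group"),
)

# Hypothetical lookup with two conflicting entries for 'hsgç_T2'.
hotel_groups = pd.Series(
    ["Jamel", "Frank", "Luxy"],
    index=pd.Index(["tri23_1", "hsgç_T2", "hsgç_T2"], name="Group"),
)

# Check for duplicate labels before assigning.
if not hotel_groups.index.is_unique:
    # Keep only the first entry for each label (illustrative choice).
    hotel_groups = hotel_groups[~hotel_groups.index.duplicated(keep="first")]

# With a unique index on the lookup, the aligned assignment goes through.
df["hotel"] = hotel_groups
print(df)
```

In real data you would probably want to inspect the conflicting rows instead of silently dropping them.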