I have two pandas DataFrames in Python. DataFrame A contains a column of sentence-length strings.
You can iterate through a DataFrame with the iterrows() method. You can try this:
import pandas as pd

# DataFrame definitions
df_1 = pd.DataFrame({"sentence": ["this is from france and spain", "this is from france", "this is from germany"], "other": [15, 12, 33]})
df_2 = pd.DataFrame({"country": ["spain", "france", "germany"], "other_column": [7, 7, 8]})

# Create the new, empty DataFrame that will hold the matches
df_3 = pd.DataFrame(columns=["sentence", "other_column", "country"])
count = 0

# Iterate over the country DataFrame first and, for each country, over the sentence DataFrame.
for index, row in df_2.iterrows():
    country = row.country
    for index_2, row_2 in df_1.iterrows():
        # Plain substring test: keep the pair if the country name appears in the sentence
        if country in row_2.sentence:
            # Note: the "other" value from df_1 is stored under the "other_column" name
            df_3.loc[count] = (row_2.sentence, row_2.other, country)
            count += 1
So the output is:
                        sentence  other_column  country
0  this is from france and spain            15    spain
1  this is from france and spain            15   france
2            this is from france            12   france
3           this is from germany            33  germany
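
Nested iterrows() loops can get slow on larger frames. One possible alternative, offered only as a sketch, is to build every country/sentence pair with a cross join and then filter the pairs. It assumes pandas 1.2 or later (for how="cross") and reuses the example data from above; the names pairs and mask are illustrative and not part of the original answer.

```python
import pandas as pd

# Same example data as in the answer above
df_1 = pd.DataFrame({"sentence": ["this is from france and spain", "this is from france", "this is from germany"], "other": [15, 12, 33]})
df_2 = pd.DataFrame({"country": ["spain", "france", "germany"], "other_column": [7, 7, 8]})

# Pair every country with every sentence (cross join; assumes pandas >= 1.2)
pairs = df_2[["country"]].merge(df_1, how="cross")

# Keep only the pairs where the country name occurs in the sentence (substring test)
mask = [country in sentence for country, sentence in zip(pairs["country"], pairs["sentence"])]

df_3 = (
    pairs.loc[mask]
    .rename(columns={"other": "other_column"})  # mirror the column naming used in the loop version
    [["sentence", "other_column", "country"]]
    .reset_index(drop=True)
)
print(df_3)
```

As in the loop version, the match is a plain substring test, so a country name embedded in a longer word would also count as a match. The cross join materializes len(df_1) * len(df_2) rows before filtering, which is fine for small lookup tables but may use a lot of memory for large ones.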