First of all, thank you for taking the time to read my question.
My question is the following: in Spark with Java, I load the data from two CSV files into two DataFrames.
You can use the join method with a column name to join two DataFrames, e.g.:
Dataset<Row> dfairport = Load.Csv(sqlContext, data_airport);
Dataset<Row> dfairport_city_state = Load.Csv(sqlContext, data_airport_city_state);
// Join on the "City" column, which must exist in both DataFrames
Dataset<Row> joined = dfairport.join(dfairport_city_state, "City");
There is also an overloaded version that lets you specify the join type as a third argument, e.g.:
Dataset
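As a rough sketch of the three-argument overload, the example below uses Dataset.join(right, joinExprs, joinType) with an explicit equality condition on "City" and the "left_outer" join type. The class name, the sample rows, the "Airport" and "State" columns, and the use of SparkSession (Spark 2.x or later) instead of sqlContext are all assumptions made for illustration. Only the "City" column comes from the code above, and the in-memory rows stand in for the two CSV files.

```java
import java.util.Arrays;
import java.util.List;

import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.RowFactory;
import org.apache.spark.sql.SparkSession;
import org.apache.spark.sql.types.DataTypes;
import org.apache.spark.sql.types.StructType;

public class JoinTypeExample {

    // Builds two small DataFrames (hypothetical stand-ins for the CSV data)
    // and joins them with an explicit join type.
    static Dataset<Row> buildJoined(SparkSession spark) {
        StructType airportSchema = new StructType()
                .add("Airport", DataTypes.StringType)
                .add("City", DataTypes.StringType);
        StructType cityStateSchema = new StructType()
                .add("City", DataTypes.StringType)
                .add("State", DataTypes.StringType);

        List<Row> airports = Arrays.asList(
                RowFactory.create("JFK", "New York"),
                RowFactory.create("XXX", "Nowhere")); // no match on the right side
        List<Row> cityStates = Arrays.asList(
                RowFactory.create("New York", "NY"));

        Dataset<Row> dfairport = spark.createDataFrame(airports, airportSchema);
        Dataset<Row> dfairportCityState = spark.createDataFrame(cityStates, cityStateSchema);

        // Second argument: the join condition as a Column expression.
        // Third argument: the join type. "left_outer" keeps every row of
        // dfairport and fills the right-side columns with null when unmatched.
        return dfairport.join(
                dfairportCityState,
                dfairport.col("City").equalTo(dfairportCityState.col("City")),
                "left_outer");
    }

    public static void main(String[] args) {
        SparkSession spark = SparkSession.builder()
                .appName("JoinTypeExample")
                .master("local[*]") // local mode, for trying the sketch out
                .getOrCreate();

        buildJoined(spark).show();
        spark.stop();
    }
}
```

Because the condition is given as an expression rather than a column name, the result keeps both "City" columns. Other join type strings such as "inner" or "right_outer" can be passed in the same position.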
Here's more on joins.