Joining two DataFrames in Spark with Java

长发绾君心 2021-02-09 11:57

First of all, thank you for taking the time to read my question.

My question is the following: in Spark with Java, I load the data from two CSV files into two DataFrames.

2 Answers
  •  星月不相逢
    2021-02-09 12:41

    You can use the join method with a column name to join two DataFrames, e.g.:

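    // Load.Csv is not a Spark API; presumably the asker's own helper returning Dataset<Row>.
    Dataset<Row> dfairport = Load.Csv(sqlContext, data_airport);
    Dataset<Row> dfairport_city_state = Load.Csv(sqlContext, data_airport_city_state);

    // Inner join on the column "City", which must exist in both DataFrames.
    Dataset<Row> joined = dfairport.join(dfairport_city_state, "City");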
    

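    There is also an overloaded version that takes a join condition and lets you specify the join type as a third argument, e.g.:

    // The condition is a Column expression; both "City" columns are kept in the result.
    Dataset<Row> joined = dfairport.join(dfairport_city_state,
            dfairport.col("City").equalTo(dfairport_city_state.col("City")), "left_outer");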

    Here's more on joins.
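To make the effect of the join type in the answer above concrete, here is a minimal plain-Java sketch of what an inner join and a "left_outer" join return. It does not use Spark at all: the class JoinSketch, its methods, and the sample cities are hypothetical, and each side is simplified to a map with one value per City key, whereas real DataFrames may contain duplicate keys.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Plain-Java illustration of join semantics; this is NOT Spark code.
// Assumption: each side has at most one row per key (here, the city name).
class JoinSketch {

    // Inner join: keep only left rows whose key also exists on the right.
    static List<String> innerJoin(Map<String, String> left, Map<String, String> right) {
        List<String> rows = new ArrayList<>();
        for (Map.Entry<String, String> e : left.entrySet()) {
            if (right.containsKey(e.getKey())) {
                rows.add(e.getKey() + "," + e.getValue() + "," + right.get(e.getKey()));
            }
        }
        return rows;
    }

    // Left outer join: keep every left row; an unmatched right side becomes null.
    static List<String> leftOuterJoin(Map<String, String> left, Map<String, String> right) {
        List<String> rows = new ArrayList<>();
        for (Map.Entry<String, String> e : left.entrySet()) {
            rows.add(e.getKey() + "," + e.getValue() + "," + right.get(e.getKey()));
        }
        return rows;
    }

    public static void main(String[] args) {
        // Hypothetical sample data: City -> airport code, and City -> state.
        Map<String, String> airports = new LinkedHashMap<>();
        airports.put("Boston", "BOS");
        airports.put("Denver", "DEN");
        airports.put("Gotham", "GTH"); // no matching row on the right side

        Map<String, String> cityStates = new LinkedHashMap<>();
        cityStates.put("Boston", "MA");
        cityStates.put("Denver", "CO");

        System.out.println(innerJoin(airports, cityStates));
        System.out.println(leftOuterJoin(airports, cityStates));
    }
}
```

With this sample data, the inner join drops the unmatched "Gotham" row, while the left outer join keeps it with null in place of the missing state. This mirrors how the join type passed to Spark decides what happens to unmatched rows on the left side.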
