How to join on multiple columns in Spark SQL using Java for filtering a DataFrame

梦谈多话 2021-02-08 19:07
  • DataFrame a contains columns x, y, z, k
  • DataFrame b contains columns x, y, a

How can I join a and b on their shared columns (x and y)? I got as far as:

    a.join(b,
2 Answers
  •  执笔经年 2021-02-08 19:57

    If you want to join on multiple columns, you can do something like this:

    a.join(b, scalaSeq, joinType)
    

    You can store your column names in a Java List and convert that List to a Scala Seq:

    Seq<String> scalaSeq = JavaConverters.asScalaIteratorConverter(list.iterator()).asScala().toSeq();
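
    Here JavaConverters and Seq come from the Scala standard library that ships with Spark. Assuming a Spark build against Scala 2.12 (on 2.13 the non-deprecated route is scala.jdk.javaapi.CollectionConverters), the imports would be:

    import scala.collection.JavaConverters;
    import scala.collection.Seq;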
    

    Example: a = a.join(b, scalaSeq, "inner");

    Note: this approach also supports a dynamic, runtime-determined list of join columns.
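
    Putting it together, here is a minimal self-contained sketch. The class name JoinHelper and the method joinOn are illustrative, not part of any Spark API, and it assumes both inputs contain every column named in joinColumns:

    import java.util.List;

    import org.apache.spark.sql.Dataset;
    import org.apache.spark.sql.Row;

    import scala.collection.JavaConverters;
    import scala.collection.Seq;

    public final class JoinHelper {

        // Join two DataFrames on a runtime-determined list of column names.
        // Assumes both inputs contain every column named in joinColumns.
        public static Dataset<Row> joinOn(Dataset<Row> left,
                                          Dataset<Row> right,
                                          List<String> joinColumns,
                                          String joinType) {
            // Convert the Java List to the Scala Seq that Dataset.join expects.
            Seq<String> scalaSeq = JavaConverters
                    .asScalaIteratorConverter(joinColumns.iterator())
                    .asScala()
                    .toSeq();
            // Joining on a Seq of column names performs an equi-join and keeps
            // each shared column only once in the result.
            return left.join(right, scalaSeq, joinType);
        }
    }

    Applied to the question's DataFrames, joining on the shared columns x and y:

    Dataset<Row> joined = JoinHelper.joinOn(a, b, java.util.Arrays.asList("x", "y"), "inner");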
