Looking up multiple column values after joining with a lookup dataset

隐瞒了意图╮ 2020-12-02 03:27

I am using spark-sql 2.4.1. How can I do different joins depending on the value of a column? I need to look up multiple values from the map_val column for the given value columns.

1 Answer
  • 2020-12-02 03:48

    Try this:

    Build a lookup map per id before the join, then use it to replace the values after the join.

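        import org.apache.spark.sql.expressions.Window
        import org.apache.spark.sql.functions._

        // Attach to every rate row a map of map_code -> map_val, collected over all rows with the same id
        val newRateDS = rateDs.withColumn("lookUpMap",
          map_from_entries(collect_list(struct(col("map_code"), col("map_val"))).over(Window.partitionBy("id")))
        )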
    
        newRateDS.show(false)
        /**
          * +---+----------+----------+--------+-------+------------------+
          * |id |start_date|end_date  |map_code|map_val|lookUpMap         |
          * +---+----------+----------+--------+-------+------------------+
          * |21 |2018-01-31|2018-06-31|12      |C      |[12 -> C, 13 -> D]|
          * |21 |2018-01-31|2018-06-31|13      |D      |[12 -> C, 13 -> D]|
          * +---+----------+----------+--------+-------+------------------+
          */
    
        // Left join, so rows of df without a matching rate row are kept with nulls on the rate side
        val resultDs = df.filter(col("code").equalTo(lit("rate"))).join(broadcast(newRateDS),
          rateDs("id") === df("id") && df("date").between(rateDs("start_date"), rateDs("end_date"))
            //.and(rateDs.col("mapping_value").equalTo(df.col("mean")))
          , "left"
        )
    
        // Replace each value with its mapped value; coalesce keeps the original when the map has no entry
        resultDs.withColumn("value1", expr("coalesce(lookUpMap[value1], value1)"))
          .withColumn("value2", expr("coalesce(lookUpMap[value2], value2)"))
          .show(false)
    
        /**
          * +---+----+------+----------+------+------+----+----------+----------+--------+-------+------------------+
          * |id |code|entity|date      |value1|value2|id  |start_date|end_date  |map_code|map_val|lookUpMap         |
          * +---+----+------+----------+------+------+----+----------+----------+--------+-------+------------------+
          * |22 |rate|school|2018-03-31|11    |14    |null|null      |null      |null    |null   |null              |
          * |21 |rate|school|2018-03-31|D     |C     |21  |2018-01-31|2018-06-31|13      |D      |[12 -> C, 13 -> D]|
          * |21 |rate|school|2018-03-31|D     |C     |21  |2018-01-31|2018-06-31|12      |C      |[12 -> C, 13 -> D]|
          * +---+----+------+----------+------+------+----+----------+----------+--------+-------+------------------+
          */
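
To make the replace step easier to follow, here is a minimal plain-Scala sketch of the same lookup-with-fallback logic, with no Spark involved. The names `RateRow`, `LookupSketch`, `buildLookup` and `replaceValue` are hypothetical and only for illustration, values are assumed to be strings, and the date-range condition of the join is left out on purpose.

```scala
// Hypothetical row type standing in for the relevant columns of rateDs.
final case class RateRow(id: Int, mapCode: String, mapVal: String)

object LookupSketch {
  // Rough counterpart of collecting (map_code, map_val) pairs per id
  // and turning them into one map per id.
  def buildLookup(rates: Seq[RateRow]): Map[Int, Map[String, String]] =
    rates.groupBy(_.id).map { case (id, rows) =>
      id -> rows.map(r => r.mapCode -> r.mapVal).toMap
    }

  // Rough counterpart of coalesce(lookUpMap[value], value):
  // use the mapped value if one exists for this id, otherwise keep the input.
  def replaceValue(lookup: Map[Int, Map[String, String]], id: Int, value: String): String =
    lookup.get(id).flatMap(_.get(value)).getOrElse(value)
}
```

With the two rate rows shown above for id 21, `replaceValue(lookup, 21, "13")` gives `D`, while a value for id 22, which has no rate rows, comes back unchanged. That matches the null `lookUpMap` case in the joined output.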
    