Spark broadcast variable Map giving null value

轻奢々 2020-11-28 16:46

I am using Java 8 with Spark v2.4.1.

I am trying to use a broadcast variable Map for lookups, as shown below:

Input data:

+-----+-         


        
1 Answer
  • 2020-11-28 17:21

    lit() returns a Column, but map.get requires the int value itself. You can do it this way:

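        import java.util
        import org.apache.spark.broadcast.Broadcast
        import org.apache.spark.sql.DataFrame
        import spark.implicits._ // assumes an existing SparkSession named `spark`

        val df: DataFrame = spark.sparkContext.parallelize(Range(0, 10000), 4).toDF("sentiment")
        val map = new util.HashMap[Int, Int]()
        map.put(1, 1)
        map.put(2, 2)
        map.put(3, 3)
        // Broadcast the lookup map so each executor reads a local copy.
        val bf: Broadcast[util.HashMap[Int, Int]] = spark.sparkContext.broadcast(map)
        df.rdd.map(x => {
          val num = x.getInt(0) // plain Int taken from the row, not a Column
          (num, bf.value.get(num)) // look up using the actual value
        }).toDF("key", "add_key").show(false)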
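
Since the question uses Java 8, here is a minimal plain-Java sketch of why a lookup with the wrong key type compiles but returns null. It assumes the map is keyed by Integer, as in the answer's code. `FakeColumn` is a hypothetical stand-in for Spark's `Column` so the example runs without Spark; the class and method names are illustrative and not taken from the question.

```java
import java.util.HashMap;
import java.util.Map;

public class BroadcastLookupSketch {

    // Hypothetical stand-in for Spark's Column, so the sketch runs without Spark.
    static final class FakeColumn {
        final int literal;
        FakeColumn(int literal) { this.literal = literal; }
    }

    // Builds the same 1->1, 2->2, 3->3 map as the answer's code.
    public static Map<Integer, Integer> buildMap() {
        Map<Integer, Integer> map = new HashMap<>();
        map.put(1, 1);
        map.put(2, 2);
        map.put(3, 3);
        return map;
    }

    // Map.get takes Object, so a wrapper object compiles as a key but never matches.
    public static Integer lookupByWrapper(Map<Integer, Integer> map, FakeColumn key) {
        return map.get(key);
    }

    // Looking up with the plain int (autoboxed to Integer) finds the entry.
    public static Integer lookupByValue(Map<Integer, Integer> map, int key) {
        return map.get(key);
    }

    public static void main(String[] args) {
        Map<Integer, Integer> map = buildMap();
        System.out.println(lookupByWrapper(map, new FakeColumn(1)));
        System.out.println(lookupByValue(map, 1));
        System.out.println(lookupByValue(map, 42));
    }
}
```

The wrapper lookup returns null even though it wraps the value 1, because `Map.get` compares the key object itself. A key that is absent from the map also returns null, so a null check is still needed after switching to the plain value.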
    