How to remove / dispose a broadcast variable from heap in Spark?

走了就别回头了 2021-02-05 07:29

To broadcast a variable so that it occurs exactly once in memory per node on a cluster, one can do: val myVarBroadcasted = sc.broadcast(myVar) then retriev

2 answers
  • 2021-02-05 07:48

    To remove the broadcast variable from both the executors and the driver, you have to use destroy. Using unpersist only removes it from the executors:

    myVarBroadcasted.destroy()
    

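    This method is blocking in Spark 2.x. Since Spark 3.0, destroy() is non-blocking by default. Once destroyed, the broadcast variable cannot be used again.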

  • 2021-02-05 07:59

    You are looking for unpersist, which has been available since Spark 1.0.0:

    myVarBroadcasted.unpersist(blocking = true)
    

    Broadcast variables are stored as ArrayBuffers of deserialized Java objects or as serialized ByteBuffers. (Storage-wise they appear to be treated similarly to RDDs, but this needs confirmation.)

    The unpersist method removes them from both memory and disk on each executor node. The value stays on the driver node, however, so it can be re-broadcast if it is used again.
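
To make the difference between the two answers concrete, here is a minimal sketch that uses a broadcast variable, unpersists it, uses it again, and finally destroys it. The local master, the app name, the object name BroadcastCleanupSketch, and the small Map payload are assumptions chosen for illustration; they do not come from the question.

```scala
import org.apache.spark.{SparkConf, SparkContext, SparkException}

object BroadcastCleanupSketch {
  def main(args: Array[String]): Unit = {
    // Local master and app name are assumptions for illustration only.
    val conf = new SparkConf()
      .setMaster("local[2]")
      .setAppName("broadcast-cleanup-sketch")
    val sc = new SparkContext(conf)
    try {
      val myVar = Map("a" -> 1, "b" -> 2) // hypothetical payload
      val myVarBroadcasted = sc.broadcast(myVar)
      val keys = sc.parallelize(Seq("a", "b", "a"))

      // First use: the tasks read the broadcast value on the executors.
      val before = keys.map(k => myVarBroadcasted.value(k)).reduce(_ + _)

      // Drop the cached copies on the executors; the driver keeps the value.
      myVarBroadcasted.unpersist(blocking = true)

      // Still usable: the value is sent to the executors again on demand.
      val after = keys.map(k => myVarBroadcasted.value(k)).reduce(_ + _)
      println(s"before=$before after=$after")

      // Remove all data and metadata, on the driver as well.
      myVarBroadcasted.destroy()

      // Any further access is expected to fail with a SparkException.
      try {
        myVarBroadcasted.value
      } catch {
        case e: SparkException => println(s"destroyed: ${e.getMessage}")
      }
    } finally {
      sc.stop()
    }
  }
}
```

In short, unpersist fits a broadcast variable that may be needed again later, while destroy fits one that is finished for good. In local mode the driver and executor share a JVM, so the memory effect of unpersist is only really observable on a multi-node cluster.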
