How to remove / dispose a broadcast variable from heap in Spark?

走了就别回头了 2021-02-05 07:29

To broadcast a variable so that it occurs exactly once in memory per node on a cluster, one can do: val myVarBroadcasted = sc.broadcast(myVar) then retriev

2 answers
  •  渐次进展
    2021-02-05 07:59

    You are looking for unpersist, available since Spark 1.0.0:

    myVarBroadcasted.unpersist(blocking = true)
    

    Broadcast variables are stored as ArrayBuffers of deserialized Java objects or as serialized ByteBuffers. (Storage-wise they appear to be treated similarly to RDDs; confirmation needed.)

    The unpersist method removes the cached copies from both memory and disk on each executor node. The value stays on the driver node, so it can be re-broadcast if the variable is used again.
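
A minimal sketch of that lifecycle, assuming Spark 2.x or later and a local master so it can be run without a cluster. The object name, the helper sumsBeforeAndAfterUnpersist, and the sample data are hypothetical and chosen only for illustration. It uses the broadcast in one job, calls unpersist, then uses it again in a second job, at which point Spark has to re-send the value to the executors. The final destroy() call is not part of the answer above: it also releases the driver-side copy, after which the broadcast can no longer be used.

```scala
import org.apache.spark.{SparkConf, SparkContext}

object BroadcastCleanupSketch {

  // Hypothetical helper: runs the same lookup job before and after unpersist
  // and returns both sums, to show the broadcast stays usable in between.
  def sumsBeforeAndAfterUnpersist(sc: SparkContext): (Int, Int) = {
    val myVar = Map("a" -> 1, "b" -> 2) // illustrative data
    val myVarBroadcasted = sc.broadcast(myVar)
    val keys = sc.parallelize(Seq("a", "b", "a"))

    // First job: executors fetch the broadcast value and cache it locally.
    val before = keys.map(k => myVarBroadcasted.value(k)).reduce(_ + _)

    // Drop the cached copies on the executors; wait until that is done.
    myVarBroadcasted.unpersist(blocking = true)

    // Second job: the value is still held by the driver, so it is
    // re-sent to the executors when they ask for it again.
    val after = keys.map(k => myVarBroadcasted.value(k)).reduce(_ + _)

    // Release everything, including the driver-side copy.
    // The broadcast must not be used after this call.
    myVarBroadcasted.destroy()

    (before, after)
  }

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setAppName("broadcast-cleanup-sketch")
      .setMaster("local[2]") // assumption: local mode, for a self-contained run
    val sc = new SparkContext(conf)
    try {
      val (before, after) = sumsBeforeAndAfterUnpersist(sc)
      println(s"before unpersist: $before, after unpersist: $after")
    } finally {
      sc.stop()
    }
  }
}
```

In local mode the driver and the executors share one JVM, so the memory effect of unpersist is not really observable there. The sketch only shows the call order and that the variable remains usable until destroy() is called.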
