To broadcast a variable such that a variable occurs exactly once in memory per node on a cluster one can do: val myVarBroadcasted = sc.broadcast(myVar)
then retriev
You are looking for unpersist available from Spark 1.0.0
myVarBroadcasted.unpersist(blocking = true)
Broadcast variables are stored as ArrayBuffers of deserialized Java objects or serialized ByteBuffers. (Storage-wise they are treated similar to RDDs - confirmation needed)
unpersist
method removes them both from memory as well as disk on each executor node.
But it stays on the driver node, so it can be re-broadcast.