The first version of the code throws a NullPointerException:
object TryBroadcast extends App {
val conf = new SparkConf().setAppName("o_o")
val sc = new SparkContext(c
It is not very well documented, but it is recommended to use def main(args: Array[String]): Unit = ??? instead of extends App.
See https://issues.apache.org/jira/browse/SPARK-4170 and https://github.com/apache/spark/pull/3497
bro is quite different in the two cases. In the first one it is a field on a singleton class instance (TryBroadcast). In the second one it is a local variable.
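A local variable gets captured, serialized and sent over to the executors. In the first case the reference is to a field, so the closure refers to the singleton rather than to the value itself. I'm not sure exactly how a Scala singleton is built and how it is captured, but in this case the field ends up uninitialized when it is accessed on the executor. That fits the extends App issue linked above: in Scala 2, App defers the object's body, including its field initializers, until main is invoked, and main is never invoked on the executors.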
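The uninitialized-field behaviour can be reproduced without Spark. The following is a minimal sketch, assuming Scala 2 (where App is built on DelayedInit); the names Demo, greeting and Check are made up for illustration and are not part of the original program.

```scala
// Hypothetical names (Demo, greeting, Check), used only for illustration.
object Demo extends App {
  // Under Scala 2 this initializer is deferred until Demo.main is invoked.
  val greeting = "hello"
}

object Check {
  def main(args: Array[String]): Unit = {
    // Demo.main has not run yet: under Scala 2 this is expected to print null.
    println(Demo.greeting)
    // Invoking main executes the deferred body of Demo.
    Demo.main(Array.empty[String])
    println(Demo.greeting) // hello
  }
}
```

On an executor the situation resembles the first println: the object is loaded, but its main has never been called, so a field such as bro has not been assigned yet.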
You could make bro a local variable like this:
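import org.apache.spark.{SparkConf, SparkContext}

object TryBroadcast extends App {
  val conf = new SparkConf().setAppName("o_o")
  val sc = new SparkContext(conf)
  val sample = sc.parallelize(1 to 1024)
  val broSample = {
    // bro is local to this block, so the closure captures the broadcast
    // handle itself rather than a field of TryBroadcast.
    val bro = sc.broadcast(6666)
    sample.map(x => x.toString + bro.value)
  }
  broSample.collect().foreach(println)
}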
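Following the recommendation above, the same program can also be written with an explicit main method, so that nothing lives in fields of an App object. This is a sketch rather than a tested fix: the run helper is a name introduced here for illustration, and the program is assumed to be launched through spark-submit, which supplies the master URL.

```scala
import org.apache.spark.{SparkConf, SparkContext}

object TryBroadcast {
  // Hypothetical helper: runs the job on an existing SparkContext.
  def run(sc: SparkContext): Array[String] = {
    val sample = sc.parallelize(1 to 1024)
    // Local variable: the closure below captures the broadcast handle directly.
    val bro = sc.broadcast(6666)
    sample.map(x => x.toString + bro.value).collect()
  }

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("o_o")
    val sc = new SparkContext(conf)
    // Stop the context even if the job fails.
    try run(sc).foreach(println)
    finally sc.stop()
  }
}
```

Keeping the job in a method that takes the SparkContext as a parameter also makes it possible to run it against a local context when experimenting.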