Spark asynchronous job fails with error

Posted by 半腔热情 on 2019-12-01 13:32:08


I'm writing Spark code in Java. When I use foreachAsync, Spark fails with java.lang.IllegalStateException: Cannot call methods on a stopped SparkContext.

In this code:

JavaSparkContext sparkContext = new JavaSparkContext("local", "MyAppName");
JavaPairRDD<String, String> wholeTextFiles = sparkContext.wholeTextFiles("somePath");
wholeTextFiles.foreach(new VoidFunction<Tuple2<String, String>>() {
    public void call(Tuple2<String, String> stringStringTuple2) throws Exception {
        // do something
    }
});

It works fine. But in this code:

JavaSparkContext sparkContext = new JavaSparkContext("local", "MyAppName");
JavaPairRDD<String, String> wholeTextFiles = sparkContext.wholeTextFiles("somePath");
wholeTextFiles.foreachAsync(new VoidFunction<Tuple2<String, String>>() {
    public void call(Tuple2<String, String> stringStringTuple2) throws Exception {
        // do something
    }
});

It fails with the error above. What am I doing wrong?


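It's because foreachAsync returns a Future right away, without waiting for the job to finish. When you leave the function, the Spark context is stopped (because it was created locally), so the job that is still running ends up calling into a stopped context.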

If you call get() on the Future returned by foreachAsync(), the main thread will wait for the job to complete before it moves on.
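
To illustrate the timing issue without needing a Spark installation, here is a minimal sketch that uses only java.util.concurrent. FakeContext, use(), stop() and run() are hypothetical stand-ins invented for this example, not Spark APIs; the latch only forces the "job still pending" ordering so the result is deterministic. In real Spark code, the equivalent fix would be along the lines of wholeTextFiles.foreachAsync(...).get(), since the JavaFutureAction it returns implements java.util.concurrent.Future.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class AsyncShutdownSketch {

    // Hypothetical stand-in for a context that rejects calls once stopped.
    static final class FakeContext {
        private volatile boolean stopped = false;

        void stop() {
            stopped = true;
        }

        void use() {
            if (stopped) {
                throw new IllegalStateException("Cannot call methods on a stopped context");
            }
        }
    }

    // Submits an async job, then stops the context either before or after
    // waiting for the job, and reports what happened to the job.
    static String run(boolean waitBeforeStop) {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        FakeContext ctx = new FakeContext();
        CountDownLatch gate = new CountDownLatch(1);

        // submit() returns immediately; the job itself has not run yet.
        Future<String> job = pool.submit(() -> {
            gate.await(); // work is still pending when submit() returns
            ctx.use();    // throws if the context was stopped in the meantime
            return "done";
        });

        try {
            if (waitBeforeStop) {
                gate.countDown();
                job.get();  // block until the job has finished
                ctx.stop(); // only now is it safe to stop the context
            } else {
                ctx.stop(); // context stopped while the job is still pending
                gate.countDown();
            }
            return job.get();
        } catch (ExecutionException e) {
            return "failed: " + e.getCause().getMessage();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return "interrupted";
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(run(false)); // failed: Cannot call methods on a stopped context
        System.out.println(run(true));  // done
    }
}
```

The only difference between the two runs is whether get() is called before the context is stopped, which is the same distinction as between the failing foreachAsync snippet in the question and the suggested fix.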

