I'm writing code for Spark in Java. When I use foreachAsync, Spark fails and gives me java.lang.IllegalStateException: Cannot call methods on a stopped SparkContext.
In this code:
JavaSparkContext sparkContext = new JavaSparkContext("local", "MyAppName");
JavaPairRDD<String, String> wholeTextFiles = sparkContext.wholeTextFiles("somePath");
wholeTextFiles.foreach(new VoidFunction<Tuple2<String, String>>() {
    public void call(Tuple2<String, String> stringStringTuple2) throws Exception {
        // do something
    }
});
It works fine. But in this code:
JavaSparkContext sparkContext = new JavaSparkContext("local", "MyAppName");
JavaPairRDD<String, String> wholeTextFiles = sparkContext.wholeTextFiles("somePath");
wholeTextFiles.foreachAsync(new VoidFunction<Tuple2<String, String>>() {
    public void call(Tuple2<String, String> stringStringTuple2) throws Exception {
        // do something
    }
});
It returns the error above. Where am I wrong?
It's because foreachAsync returns a Future object and comes back immediately; when you leave the function, the SparkContext is closed (because it's created locally) while the asynchronous job may still be running. If you call get() on the future returned by foreachAsync(), the main thread will wait for the job to complete before the context is stopped.
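A minimal sketch of the fix, keeping the question's local setup (the class name ForeachAsyncExample and the path "somePath" are placeholders): block on the JavaFutureAction that foreachAsync returns so the context is not stopped while the job is running.

import org.apache.spark.api.java.JavaFutureAction;
import org.apache.spark.api.java.JavaPairRDD;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.api.java.function.VoidFunction;
import scala.Tuple2;

public class ForeachAsyncExample {
    public static void main(String[] args) throws Exception {
        JavaSparkContext sparkContext = new JavaSparkContext("local", "MyAppName");
        JavaPairRDD<String, String> wholeTextFiles = sparkContext.wholeTextFiles("somePath");

        // foreachAsync submits the job and returns immediately with a JavaFutureAction
        JavaFutureAction<Void> future = wholeTextFiles.foreachAsync(new VoidFunction<Tuple2<String, String>>() {
            public void call(Tuple2<String, String> tuple) throws Exception {
                // do something with tuple._1() (file path) and tuple._2() (file contents)
            }
        });

        // Block until the asynchronous job finishes, so the SparkContext
        // is not stopped while the job is still running
        future.get();

        sparkContext.stop();
    }
}

Alternatively, keep the SparkContext alive until the asynchronous job has completed some other way (for example, by not stopping it until all pending futures are done).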
Source: https://stackoverflow.com/questions/46326959/spark-asynchronous-job-fails-with-error