I'm running this snippet to sort an RDD of points and take the K nearest points to a given point:
def getKNN(sparkContext:SparkContext, k:
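The snippet above is cut off, so for context here is a minimal sketch of what such a method might look like; everything past the k parameter (the RDD element type, the query point, the distance function) is my assumption, not the original code:

import org.apache.spark.SparkContext
import org.apache.spark.rdd.RDD

// Sketch only: sort the points by distance to the query point and
// pull the k closest back to the driver.
def getKNN(sparkContext: SparkContext,
           k: Int,
           points: RDD[(Double, Double)],
           query: (Double, Double)): Array[(Double, Double)] = {
  def dist(p: (Double, Double)): Double =
    math.sqrt(math.pow(p._1 - query._1, 2) + math.pow(p._2 - query._2, 2))
  points.sortBy(dist).take(k)   // sortBy shuffles; take(k) collects k rows to the driver
}

For large RDDs, points.takeOrdered(k)(Ordering.by(dist)) would avoid the full sort, but the sketch keeps the shape implied by the question.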
Related to the above answers, I encountered this issue when I inadvertently serialized a DataStax connector (i.e. Cassandra connection driver) query to a Spark worker. The query then spun up its own SparkContext, and within 4 seconds the entire application had crashed.
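This is not the DataStax API itself, just a hedged sketch of the general shape of that mistake: anything referenced inside an RDD closure is serialized to the executors, and if it ends up creating its own SparkContext there, the job dies.

import org.apache.spark.{SparkConf, SparkContext}

object ClosureTrap {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setMaster("local[*]").setAppName("closure-trap"))

    // Anti-pattern: this lambda drags context-creating code into the closure,
    // so every executor (a separate JVM) tries to stand up its own SparkContext.
    val broken = sc.parallelize(1 to 10).map { i =>
      val innerSc = new SparkContext(new SparkConf())   // fails: contexts belong on the driver
      innerSc.parallelize(Seq(i)).count()
    }
    broken.count()   // the failure only surfaces when an action runs
  }
}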
Just discovered why I was getting this exception: for some reason my SparkContext object was being started/stopped several times between ScalaTest methods. Fixing that behaviour got Spark working the way I would expect.
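A minimal sketch of that fix, assuming ScalaTest 3.1+ (class and suite names here are illustrative): create the context once in beforeAll and stop it once in afterAll, rather than per test method.

import org.apache.spark.{SparkConf, SparkContext}
import org.scalatest.BeforeAndAfterAll
import org.scalatest.funsuite.AnyFunSuite

class MySparkSpec extends AnyFunSuite with BeforeAndAfterAll {
  @transient private var sc: SparkContext = _

  override def beforeAll(): Unit = {
    // One context for the whole suite
    sc = new SparkContext(new SparkConf().setMaster("local[2]").setAppName("test"))
  }

  override def afterAll(): Unit = {
    if (sc != null) sc.stop()
  }

  test("counts elements") {
    assert(sc.parallelize(1 to 5).count() === 5)
  }
}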
In my case this helped, because a SparkContext had already been created:
val sc = SparkContext.getOrCreate()
Before that I had tried this:
val conf = new SparkConf().setAppName("Testing").setMaster("local").set("spark.driver.allowMultipleContexts", "true")
val sc = new SparkContext(conf)
But it broke when I ran
spark.createDataFrame(rdd, schema)
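Putting it together, a sketch of the getOrCreate-based setup (the builder options are illustrative): let Spark reuse whatever context already exists instead of forcing a second one with spark.driver.allowMultipleContexts.

import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .appName("Testing")
  .master("local")
  .getOrCreate()            // reuses an existing session/context if there is one

val sc = spark.sparkContext // the single underlying SparkContext
// spark.createDataFrame(rdd, schema) now runs against that one context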
I was also facing the same issue. After a lot of googling I found that I had made a singleton class for SparkContext initialization, which is only valid for a single JVM instance. In Spark, however, that singleton gets invoked from each worker node running in its own JVM, and hence leads to multiple SparkContext objects.
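A hedged sketch of the shape of that mistake (names are hypothetical): a per-JVM lazy singleton is fine while it is only touched on the driver, but referencing it inside a transformation re-initialises it on every executor JVM.

import org.apache.spark.{SparkConf, SparkContext}

// Per-JVM singleton holding the context
object SparkHolder {
  lazy val sc: SparkContext =
    new SparkContext(new SparkConf().setMaster("local[*]").setAppName("singleton"))
}

object SingletonDemo {
  def main(args: Array[String]): Unit = {
    val data = SparkHolder.sc.parallelize(1 to 10)        // ok: runs on the driver
    // data.map(i => SparkHolder.sc.broadcast(i))         // not ok: each executor JVM
    //                                                    // would build its own context
    println(data.count())
  }
}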
I was getting this error as well. I haven't really seen any concrete coding examples, so I will share my solution. This cleared the error for me, though I suspect there is more than one solution to this problem. Still, it is worth a go, as it keeps everything within the code.
It looks as though the SparkContext was shutting down, thus throwing the error. I think the issue is that the SparkContext was created in a class that was then extended by other classes; the extension causes it to shut down, which is a bit annoying. Below is the implementation I used to clear the error.
Spark Initialisation Class:
import org.apache.spark.{SparkConf, SparkContext}

// Shared initialiser: any class that extends Spark gets a local context on demand
class Spark extends Serializable {
  def getContext: SparkContext = {
    @transient lazy val conf: SparkConf =
      new SparkConf()
        .setMaster("local")
        .setAppName("test")
    @transient lazy val sc: SparkContext = new SparkContext(conf)
    sc.setLogLevel("OFF")
    sc
  }
}
Main Class:
import org.apache.spark.rdd.RDD

object Test extends Spark {
  def main(args: Array[String]): Unit = {
    val sc = getContext
    val irisRDD: RDD[String] = sc.textFile("...")
    ...
  }
}
Then just extend your other classes with the Spark class and it should all work out.
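For example, another (hypothetical) class reusing the same initialiser might look like this:

class IrisPipeline extends Spark {
  def run(path: String): Long = {
    val sc = getContext
    sc.textFile(path).count()
  }
}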
I was getting the error while running LogisticRegression models, so I would assume this fix applies to other machine learning libraries as well.