I have a query regarding creating multiple Spark sessions in one JVM. I have read that creating multiple contexts is not recommended in earlier versions of Spark. Is it true with newer versions of Spark as well?
The documentation of getOrCreate states:
This method first checks whether there is a valid thread-local SparkSession, and if yes, return that one. It then checks whether there is a valid global default SparkSession, and if yes, return that one. If no valid global default SparkSession exists, the method creates a new SparkSession and assigns the newly created SparkSession as the global default.
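For illustration, here is a minimal sketch of that lookup order; the app name and master are placeholders for a local test:

import org.apache.spark.sql.SparkSession

// First call: no default session exists yet, so one is created
// and registered as the global default.
val spark = SparkSession.builder()
  .appName("example")   // placeholder app name
  .master("local[*]")   // placeholder master for a local test
  .getOrCreate()

// Second call: the existing global default is found and returned.
val again = SparkSession.builder().getOrCreate()
assert(spark eq again)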
There is also the method SparkSession.newSession, which indicates:
Start a new session with isolated SQL configurations, temporary tables, registered functions are isolated, but sharing the underlying SparkContext and cached data.
So I guess the answer to your question is that you can have multiple sessions, but there is still a single SparkContext per JVM that will be used by all your sessions.
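A small sketch of that relationship, assuming spark is an existing SparkSession:

// Two distinct sessions, one underlying SparkContext.
val session2 = spark.newSession()
assert(spark ne session2)
assert(spark.sparkContext eq session2.sparkContext)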
I could imagine that a possible scenario for your web application would be to create one SparkSession per request or, e.g., per HTTP session, and to use it to isolate Spark executions per request or user session. <-- Since I'm pretty new to Spark, can someone confirm this?
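A rough sketch of that idea; handleRequest is a hypothetical handler, not part of any web framework:

import org.apache.spark.sql.SparkSession

// Hypothetical per-request handler: each call derives its own session with
// isolated SQL configuration and temp views, on the shared SparkContext.
def handleRequest(base: SparkSession): Unit = {
  val session = base.newSession()
  session.sql("SELECT 1 AS ok").show()
}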
It is not supported and won't be. SPARK-2243 is resolved as Won't Fix.
If you need multiple contexts, there are different projects that can help you (Mist, Livy).
If you have an existing Spark session and want to create a new one, use the newSession method on the existing SparkSession.
import org.apache.spark.sql.SparkSession

// `spark` is the existing SparkSession
val newSparkSession = spark.newSession()
The newSession method creates a new Spark session with isolated SQL configurations and temporary tables. The new session will share the underlying SparkContext and cached data.
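A quick way to see the isolation, continuing from newSparkSession above; the view name nums is arbitrary:

// A temp view registered in one session is not visible in the other.
spark.range(5).createOrReplaceTempView("nums")
assert(spark.catalog.tableExists("nums"))
assert(!newSparkSession.catalog.tableExists("nums"))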
You can call getOrCreate multiple times.
This function may be used to get or instantiate a SparkContext and register it as a singleton object. Because we can only have one active SparkContext per JVM, this is useful when applications may wish to share a SparkContext.
getOrCreate creates a SparkContext in the JVM if there is no SparkContext available. If a SparkContext is already available in the JVM, it doesn't create a new one but returns the existing one.
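A minimal sketch of that singleton behavior; the conf values are placeholders:

import org.apache.spark.{SparkConf, SparkContext}

val conf = new SparkConf().setAppName("example").setMaster("local[*]")

// First call creates and registers the singleton SparkContext.
val sc1 = SparkContext.getOrCreate(conf)

// Subsequent calls return the already-running context.
val sc2 = SparkContext.getOrCreate()
assert(sc1 eq sc2)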