I am using the isolated mode of Zeppelin's Spark interpreter; in this mode it starts a new job in the Spark cluster for each notebook. I want to kill the job via Zeppelin when the
You can restart the interpreter for the notebook in the interpreter bindings (gear in upper right hand corner) by clicking on the restart icon to the left of the interpreter in question (in this case it would be the spark interpreter).
It's a bit counterintuitive, but you need to use the interpreter menu tab instead of stopping the SparkContext directly:
Go to the interpreter list.
Find the Spark interpreter and click restart in the upper right corner.
I investigated why sc stops in Spark in yarn-client mode. I found that the problem lies in Spark itself (Spark version >= 1.6). In Spark client mode, the AM connects to the driver via RPC, and there are two connections: it sets up a NettyRpcEndPointRef to connect to the driver's service 'YarnSchedulerBackEnd' on the server 'SparkDriver', and the other connection is the EndPoint 'YarnAM'.
There are no heartbeats on these RPC connections between the AM and the driver, so the only way the AM knows whether the driver is still connected is the OnDisconnected method in the EndPoint 'YarnAM'. The disconnect message for the driver-AM connection through NettyRpcEndPointRef is passed via 'postToAll' through the RPCHandler to the EndPoint 'YarnAM'. When the TCP connection between them is dropped, or a keep-alive message finds that the TCP connection is no longer alive (maybe after 2 hours on a Linux system), the application is marked SUCCESS.
So when the driver monitor process finds that the YARN application state has changed to SUCCESS, it stops the sc.
So the root cause is that, in Spark client mode, there is no retry to reconnect to the driver and check whether it is still alive; the YARN application is simply marked as finished as quickly as possible. Maybe Spark can fix this issue.
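One way to observe this from the outside is to look at the final state YARN records for the application. The following is only a minimal sketch, assuming the `yarn` CLI is on the PATH and that `yarn application -status` prints a `Final-State : ...` line; the helper name `final_state` and the application id in the usage comment are hypothetical.

```shell
# Hypothetical helper: extract the final state from a YARN application report
# read on stdin. Assumes the report contains a line like "Final-State : <STATE>".
final_state() {
  awk -F' : ' '/Final-State/ { print $2 }'
}

# Usage (the application id is a placeholder):
#   yarn application -status application_0000000000000_0001 | final_state
```

If the reported final state is already a success state while the Zeppelin notebook is still open, that would be consistent with the behaviour described above.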
While working with Zeppelin and Spark I stumbled upon the same problem and did some investigating. After some time, my first conclusion was that the SparkContext can be stopped with sc.stop() in a paragraph, but restarting it only seemed possible through the UI (the restart button). However, since the UI allows restarting the Spark interpreter via a button press, why not just reverse engineer the API call behind the restart button? The result was that restarting the Spark interpreter sends the following HTTP request:
PUT http://localhost:8080/api/interpreter/setting/restart/spark
Fortunately, Zeppelin can work with multiple interpreters, one of which is a shell interpreter. Therefore, I created two paragraphs:
The first paragraph was for stopping the SparkContext whenever needed:
%spark
// stop SparkContext
sc.stop()
The second paragraph was for restarting the SparkContext programmatically:
%sh
# restart SparkContext
curl -X PUT http://localhost:8080/api/interpreter/setting/restart/spark
After stopping and restarting the SparkContext with the two paragraphs, I ran another paragraph to check whether the restart worked... and it did! So while this is not an official solution and more of a workaround, it is still legitimate, since we do nothing other than "press" the restart button from within a paragraph.
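For repeated use, the restart call could be wrapped in a small shell helper that fails visibly on HTTP errors. This is a sketch rather than an official client: the `ZEPPELIN_URL` variable and the function names `restart_url` and `restart_interpreter` are hypothetical, and the optional `noteId` body (meant to restart only one note's interpreter, which would matter in isolated mode) is an assumption that should be checked against the REST API documentation for your Zeppelin version.

```shell
# Base URL of the Zeppelin server; localhost:8080 matches the request above.
# The ZEPPELIN_URL variable itself is an assumption of this sketch.
ZEPPELIN_URL="${ZEPPELIN_URL:-http://localhost:8080}"

# Hypothetical helper: build the restart URL for an interpreter setting id.
restart_url() {
  printf '%s/api/interpreter/setting/restart/%s\n' "$ZEPPELIN_URL" "$1"
}

# Hypothetical helper: send the PUT request and return non-zero on HTTP errors.
# $1: interpreter setting id (e.g. spark)
# $2: optional note id, sent as {"noteId": "..."} (assumption: the server
#     accepts this body to restart only that note's interpreter)
restart_interpreter() {
  if [ -n "${2:-}" ]; then
    curl --silent --show-error --fail -X PUT \
      -H 'Content-Type: application/json' \
      -d "{\"noteId\": \"$2\"}" \
      "$(restart_url "$1")"
  else
    curl --silent --show-error --fail -X PUT "$(restart_url "$1")"
  fi
}

# Usage: restart_interpreter spark
```

Defining functions instead of calling curl directly keeps the paragraph free of side effects until `restart_interpreter` is actually invoked.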
Zeppelin version: 0.8.1