Zeppelin: How to restart sparkContext in zeppelin

轻奢々 2021-02-07 18:27

I am using the Isolated mode of Zeppelin's Spark interpreter; in this mode it starts a new job for each notebook on the Spark cluster. I want to kill the job via Zeppelin when the …

4 Answers
  •  终归单人心
    2021-02-07 18:53

    I investigated why sc stops in Spark's yarn-client mode, and found that it is a problem in Spark itself (Spark version >= 1.6). In yarn-client mode the AM connects to the driver over two RPC connections: it sets up a NettyRpcEndpointRef to the driver's 'YarnSchedulerBackend' endpoint on the server 'SparkDriver', and the other connection is the endpoint 'YarnAM'.

    There are no heartbeats on these RPC connections between the AM and the driver, so the only way the AM can tell whether the driver is still connected is the onDisconnected method of the endpoint 'YarnAM'. When the connection through the NettyRpcEndpointRef drops, the disconnect message is 'postToAll'-ed through the RpcHandler to the endpoint 'YarnAM'. So when the TCP connection between them breaks, or a keep-alive probe finds the peer dead (which can take around two hours on Linux with default settings), the AM marks the application SUCCESS.
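
    The mechanism the AM relies on can be sketched with Spark's internal RPC abstraction. This is a minimal, illustrative sketch only: org.apache.spark.rpc is private[spark], and the endpoint name and the finishApplication helper are assumptions mirroring the description above, not the actual ApplicationMaster source.

    // Illustrative sketch; org.apache.spark.rpc is private[spark], so this
    // would not compile outside Spark itself. It shows why a channel close
    // is the only disconnect signal when no heartbeat exists on the link.
    import org.apache.spark.rpc.{RpcAddress, RpcEndpoint, RpcEnv}

    class YarnAMEndpoint(override val rpcEnv: RpcEnv,
                         driverAddress: RpcAddress) extends RpcEndpoint {

      // Fired when the underlying Netty channel to a remote peer closes.
      // With no heartbeat, this callback is the AM's only disconnect signal.
      override def onDisconnected(remoteAddress: RpcAddress): Unit = {
        if (remoteAddress == driverAddress) {
          // The AM assumes the driver exited normally and reports SUCCESS
          // to YARN instead of retrying the connection.
          finishApplication()
        }
      }

      private def finishApplication(): Unit = {
        // hypothetical helper: the real AM unregisters with YARN using
        // FinalApplicationStatus.SUCCEEDED
      }
    }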

    So when the driver's monitor process sees the YARN application state change to SUCCESS, it stops the sc.
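
    A rough sketch of that driver-side monitor, loosely modeled on Spark's YarnClientSchedulerBackend monitor thread and using the public Hadoop YarnClient API; the AppMonitor class and its polling interval are illustrative assumptions, not Spark's actual code.

    import org.apache.hadoop.yarn.api.records.{ApplicationId, FinalApplicationStatus}
    import org.apache.hadoop.yarn.client.api.YarnClient
    import org.apache.spark.SparkContext

    class AppMonitor(yarnClient: YarnClient,
                     appId: ApplicationId,
                     sc: SparkContext) extends Thread {

      override def run(): Unit = {
        while (true) {
          val report = yarnClient.getApplicationReport(appId)
          // Once the AM has (possibly wrongly, per the answer above)
          // reported SUCCEEDED, the driver tears down the SparkContext.
          if (report.getFinalApplicationStatus == FinalApplicationStatus.SUCCEEDED) {
            sc.stop()
            return
          }
          Thread.sleep(1000) // assumed polling interval
        }
      }
    }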

    So the root cause is that in yarn-client mode the AM never retries the connection to check whether the driver is actually alive; it just marks the YARN application finished as quickly as possible. Perhaps Spark could fix this.
