Question
This question might seem rather broad as is, but I have two specific situations which are better described together than separately. To start with, I'm reading data from Kafka into a DStream
using the spark-streaming-kafka API. Assume I have one of the following two situations:
// something goes wrong on the driver
dstream.transform { rdd =>
  throw new Exception
}

// something goes wrong on the executors
dstream.transform { rdd =>
  rdd.foreachPartition { partition =>
    throw new Exception
  }
  rdd
}
This describes a typical situation in which I need to stop the application: an exception is thrown either on the driver or on one of the executors (e.g. failing to reach some external service which is crucial for the processing). If you try this locally, the app fails immediately. A bit more code:
dstream.foreachRDD { rdd =>
  // write rdd data to some output
  // update the kafka offsets
}
This is the last thing that happens in my app: push the data into Kafka and then make sure to move the offsets in Kafka to avoid re-processing.
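To make that concrete, assuming the spark-streaming-kafka-0-10 direct stream integration and ConsumerRecord values (neither is stated above, so treat this as a sketch rather than my exact code), the body of that foreachRDD looks roughly like this, with the sink write as a placeholder:

import org.apache.spark.streaming.kafka010.{CanCommitOffsets, HasOffsetRanges}

dstream.foreachRDD { rdd =>
  // offset ranges are only available on the RDD produced directly by the Kafka
  // stream, so they have to be captured before any transformations
  val offsetRanges = rdd.asInstanceOf[HasOffsetRanges].offsetRanges

  // write rdd data to some output (placeholder for the real sink)
  rdd.foreachPartition { partition =>
    partition.foreach(record => println(record.value()))
  }

  // only once the write has succeeded, move the offsets in Kafka to avoid re-processing
  dstream.asInstanceOf[CanCommitOffsets].commitAsync(offsetRanges)
}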
Other notes:
- I'm running Spark 2.0.1 on top of Mesos with Marathon
- checkpointing and write ahead logs are disabled
I'm expecting the application to shut down if an exception is thrown (just as it does when I run it locally) because I need fail-fast behavior. What actually happens at times is that, after an exception occurs, the app still appears as running in Marathon; even worse, in some situations the Spark UI can still be accessed, although nothing is being processed anymore.
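For illustration only, this is not in my actual code but a sketch of the fail-fast behavior I have in mind: if a batch fails, kill the driver so Marathon sees the task as failed.

dstream.foreachRDD { rdd =>
  try {
    // write rdd data to some output, then update the kafka offsets
  } catch {
    case e: Exception =>
      // a failed batch should bring down the whole driver so that Marathon
      // marks the task as failed instead of showing it as still running
      e.printStackTrace()
      sys.exit(1)
  }
}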
What might be the reason for this?
Answer 1:
Your examples only show transformations. In Spark, only actions throw exceptions, because transformations are evaluated lazily and only run when an action forces them. I would guess that any attempt to actually write your results somewhere will end up failing fast.
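As a minimal sketch of that point (the count action and names are illustrative, not taken from the question): the exception below is only raised once an action forces the batch to be computed.

dstream.transform { rdd =>
  rdd.map { record =>
    // map only records the work to be done; nothing is thrown at this point
    throw new Exception("simulated failure")
    record // unreachable, kept so the element type is preserved
  }
}.foreachRDD { rdd =>
  // count is an action: the tasks run here, the simulated failure is thrown on an
  // executor, and the resulting job failure propagates back to the driver
  rdd.count()
}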
Source: https://stackoverflow.com/questions/43121855/how-does-spark-handle-exceptions-for-a-spark-streaming-job