How to restart Spark Streaming job from checkpoint on Dataproc?

花落未央 2021-01-20 22:30

This is a follow-up to "Spark streaming on dataproc throws FileNotFoundException".

Over the past few weeks (not sure since exactly when), restart of a spark streaming j

1 answer
  •  时光说笑
    2021-01-20 23:10

    We've recently added auto-restart capabilities to Dataproc jobs (available in the gcloud beta track and in the v1 API).

    To take advantage of auto-restart, a job must be able to recover and clean up after a failed attempt, so it will not work for most jobs without modification. However, it does work out of the box for Spark Streaming jobs that use checkpoint files.

    The restart-dataproc-agent trick should no longer be necessary. Auto-restart is resilient against Job crashes, Dataproc Agent failures, and VM restart-on-migration events.

    Example: gcloud beta dataproc jobs submit spark ... --max-failures-per-hour 1

    See: https://cloud.google.com/dataproc/docs/concepts/restartable-jobs

    If you want to test recovery, you can simulate a VM migration by resetting the master VM [1]. Afterwards, you should be able to describe the job [2] and see an ATTEMPT_FAILURE entry in statusHistory.

    [1] gcloud compute instances reset <cluster-name>-m

    [2] gcloud dataproc jobs describe <job-id>
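
To check for restarts without reading the describe output by eye, the statusHistory entries can be filtered in a few lines of Python. This is only a sketch: it assumes the job was described with gcloud's `--format=json` flag, and the sample JSON below is a hypothetical, heavily trimmed stand-in for real output. The function name `attempt_failures` is made up for illustration.

```python
import json

# Hypothetical, trimmed stand-in for the JSON printed by
# `gcloud dataproc jobs describe ... --format=json`; real output has many more fields.
SAMPLE_JOB_JSON = """
{
  "status": {"state": "RUNNING", "stateStartTime": "2021-01-20T15:12:00Z"},
  "statusHistory": [
    {"state": "PENDING", "stateStartTime": "2021-01-20T15:00:00Z"},
    {"state": "RUNNING", "stateStartTime": "2021-01-20T15:01:00Z"},
    {"state": "ATTEMPT_FAILURE", "stateStartTime": "2021-01-20T15:10:00Z",
     "details": "hypothetical failure detail"},
    {"state": "PENDING", "stateStartTime": "2021-01-20T15:11:00Z"}
  ]
}
"""


def attempt_failures(job_json):
    """Return the statusHistory entries whose state is ATTEMPT_FAILURE."""
    job = json.loads(job_json)
    # statusHistory may be absent if the job has not changed state yet.
    return [
        entry
        for entry in job.get("statusHistory", [])
        if entry.get("state") == "ATTEMPT_FAILURE"
    ]


if __name__ == "__main__":
    # Print when each failed attempt was recorded, plus any detail text.
    for entry in attempt_failures(SAMPLE_JOB_JSON):
        print(entry["stateStartTime"], entry.get("details", ""))
```

With real data, the describe output could be saved to a file and its contents passed to `attempt_failures`. An empty result after resetting the master VM would suggest the job was not restarted.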
