When a failed driver is restarted, the following occurs:
- Recover computation – The checkpointed information is used to
restart the driver, reconstruct the contexts and restart all the
receivers.
- Recover block metadata – The metadata of all the blocks that will be
necessary to continue the processing will be recovered.
- Re-generate incomplete jobs – For the batches with processing that
has not completed due to the failure, the RDDs and corresponding
jobs are regenerated using the recovered block metadata.
- Read the block saved in the logs – When those jobs are executed, the
block data is read directly from the write ahead logs. This recovers
all the necessary data that were reliably saved to the logs.
- Resend unacknowledged data – The buffered data that was not saved to
the log at the time of failure will be sent again by the source, as
it had not been acknowledged by the receiver.
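
To make the order of these steps concrete, here is a minimal toy model in plain Python. It is not Spark code and does not use any Spark API: `Source`, `DurableState`, `receive` and `recover` are hypothetical names invented for this sketch, and it assumes records are unique so they can be tracked in a set. It only illustrates which data is replayed from the log and which data the source has to send again.

```python
from dataclasses import dataclass, field


@dataclass
class Source:
    """Hypothetical reliable source: keeps records until they are acknowledged."""
    records: list
    acked: set = field(default_factory=set)

    def unacknowledged(self):
        # Records the receiver never confirmed, in their original order.
        return [r for r in self.records if r not in self.acked]


@dataclass
class DurableState:
    """Stands in for fault-tolerant storage that survives a driver failure."""
    wal: dict = field(default_factory=dict)       # block id -> records (the write ahead log)
    metadata: list = field(default_factory=list)  # checkpointed block ids
    completed: set = field(default_factory=set)   # block ids whose batch finished


def receive(source, state, block_id, records, crash_before_log=False):
    """Log a block first, then acknowledge it to the source."""
    if crash_before_log:
        return  # data was only buffered in memory and is lost with the failure
    state.wal[block_id] = list(records)
    state.metadata.append(block_id)
    source.acked.update(records)  # acknowledge only after the log write


def recover(source, state):
    """Toy version of the restart sequence described above."""
    # Recover block metadata and regenerate the jobs that had not completed.
    pending = [b for b in state.metadata if b not in state.completed]
    # Read the block data for those jobs directly from the log.
    replayed = [r for b in pending for r in state.wal[b]]
    # Anything never acknowledged is sent again by the source.
    resent = source.unacknowledged()
    return pending, replayed, resent


source = Source(records=["a", "b", "c", "d", "e"])
state = DurableState()

receive(source, state, "block-0", ["a", "b"])
state.completed.add("block-0")                 # this batch finished before the failure
receive(source, state, "block-1", ["c", "d"])  # logged, but its batch never completed
receive(source, state, "block-2", ["e"], crash_before_log=True)  # lost before logging

pending, replayed, resent = recover(source, state)
print(pending, replayed, resent)
```

In this sketch `block-1` is rebuilt from the log because its batch never finished, while `e` never reached the log and so has to come from the source again. All of that work happens before the first new batch can be processed.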

Since all of these steps are performed on the driver, your batch of 0 events takes that long. This should happen only with the first batch after the restart; after that, processing times should return to normal.
Reference here.