Spring Batch resume after server's failure

既然无缘 2021-02-04 06:53

I am using Spring Batch to parse files, and I have the following scenario:

I am running a job that has to parse a given file. For an unexpected reason (let's say for pow

4 Answers
  •  囚心锁ツ
    2021-02-04 07:48

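    An updated workaround for Spring Batch 4. It takes the JVM start-up time into account when detecting broken jobs: an execution that is still marked STARTED but was created before the current JVM started must have been interrupted. Please note that this will not work in a clustered environment where multiple servers start jobs, because an execution that is legitimately running on another server would also be treated as broken.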

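    import java.lang.management.ManagementFactory;
    import java.util.Date;

    import org.springframework.batch.core.BatchStatus;
    import org.springframework.batch.core.ExitStatus;
    import org.springframework.batch.core.JobExecution;
    import org.springframework.batch.core.JobExecutionException;
    import org.springframework.batch.core.JobInstance;
    import org.springframework.batch.core.StepExecution;
    import org.springframework.batch.core.explore.JobExplorer;
    import org.springframework.batch.core.launch.JobOperator;
    import org.springframework.batch.core.repository.JobRepository;
    import org.springframework.context.ApplicationListener;
    import org.springframework.context.annotation.Bean;
    import org.springframework.context.event.ContextRefreshedEvent;

    // Declared inside a @Configuration class; LOG is assumed to be an SLF4J Logger field of that class.
    @Bean
    public ApplicationListener<ContextRefreshedEvent> resumeJobsListener(JobOperator jobOperator, JobRepository jobRepository,
            JobExplorer jobExplorer) {
        // restart job executions that were still running when the previous JVM went down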
        return event -> {
            Date jvmStartTime = new Date(ManagementFactory.getRuntimeMXBean().getStartTime());
    
            // for each job
            for (String jobName : jobExplorer.getJobNames()) {
                // get latest job instance
                for (JobInstance instance : jobExplorer.getJobInstances(jobName, 0, 1)) {
                    // for each of the executions
                    for (JobExecution execution : jobExplorer.getJobExecutions(instance)) {
                        if (execution.getStatus().equals(BatchStatus.STARTED) && execution.getCreateTime().before(jvmStartTime)) {
                            // this job is broken and must be restarted
                            execution.setEndTime(new Date());
                            execution.setStatus(BatchStatus.FAILED);
                            execution.setExitStatus(ExitStatus.FAILED);
    
                            for (StepExecution se : execution.getStepExecutions()) {
                                if (se.getStatus().equals(BatchStatus.STARTED)) {
                                    se.setEndTime(new Date());
                                    se.setStatus(BatchStatus.FAILED);
                                    se.setExitStatus(ExitStatus.FAILED);
                                    jobRepository.update(se);
                                }
                            }
    
                            jobRepository.update(execution);
                            try {
                                jobOperator.restart(execution.getId());
                            }
                            catch (JobExecutionException e) {
                                LOG.warn("Couldn't resume job execution {}", execution, e);
                            }
                        }
                    }
                }
            }
        };
    }
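
The detection rule used above can be looked at in isolation. The following is only a minimal sketch using the JDK alone, with no Spring Batch types: the class name `StaleExecutionCheck` and the helper `createdBeforeThisJvm` are hypothetical names introduced for illustration, and the two sample timestamps are made up relative to the real JVM start time.

```java
import java.lang.management.ManagementFactory;
import java.util.Date;

public class StaleExecutionCheck {

    /**
     * Hypothetical helper (not part of Spring Batch): returns true when
     * something created at createTime cannot have been created by the
     * JVM that started at jvmStartTime.
     */
    public static boolean createdBeforeThisJvm(Date createTime, Date jvmStartTime) {
        return createTime != null && createTime.before(jvmStartTime);
    }

    public static void main(String[] args) {
        // Same JVM start time lookup as in the listener above
        Date jvmStartTime = new Date(ManagementFactory.getRuntimeMXBean().getStartTime());

        // Illustrative timestamps: one minute before and one second after JVM start
        Date fromPreviousRun = new Date(jvmStartTime.getTime() - 60_000L);
        Date fromThisRun = new Date(jvmStartTime.getTime() + 1_000L);

        System.out.println(createdBeforeThisJvm(fromPreviousRun, jvmStartTime)); // prints "true"
        System.out.println(createdBeforeThisJvm(fromThisRun, jvmStartTime));     // prints "false"
    }
}
```

In the listener, the same comparison is applied to `execution.getCreateTime()` together with the `STARTED` status check, so executions launched by the current JVM are never touched.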
    
