Context
We have a batch job that replicates localized country names (i.e. translations of country names to different languages) to our DB from the e
The easiest way would be to simply partition your step by country. That way each country would get its own step, and you would also be able to thread across countries for increased performance.
If it needs to be a single reader, you can wrap a delegate PeekableItemReader and extend SimpleCompletionPolicy to accomplish your goal.
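import org.springframework.batch.item.ItemReader;
import org.springframework.batch.item.PeekableItemReader;
import org.springframework.batch.repeat.RepeatContext;
import org.springframework.batch.repeat.policy.SimpleCompletionPolicy;

public class CountryPeekingCompletionPolicyReader extends SimpleCompletionPolicy implements ItemReader<CountrySpecificItem> {

    private PeekableItemReader<? extends CountrySpecificItem> delegate;
    private CountrySpecificItem currentReadItem = null;

    // Setter required for the <property name="delegate" .../> injection
    public void setDelegate(final PeekableItemReader<? extends CountrySpecificItem> delegate) {
        this.delegate = delegate;
    }

    @Override
    public CountrySpecificItem read() throws Exception {
        currentReadItem = delegate.read();
        return currentReadItem;
    }

    @Override
    public RepeatContext start(final RepeatContext context) {
        return new ComparisonPolicyTerminationContext(context);
    }

    protected class ComparisonPolicyTerminationContext extends SimpleTerminationContext {

        public ComparisonPolicyTerminationContext(final RepeatContext context) {
            super(context);
        }

        @Override
        public boolean isComplete() {
            final CountrySpecificItem nextReadItem;
            try {
                nextReadItem = delegate.peek();
            } catch (Exception e) {
                // peek() declares checked exceptions, isComplete() does not
                throw new IllegalStateException("Unable to peek at the next item", e);
            }
            // End of input: nothing left to group, so close the chunk
            if (currentReadItem == null || nextReadItem == null) {
                return true;
            }
            // Keep the chunk open while the next item belongs to the same country
            return !currentReadItem.isSameCountry(nextReadItem);
        }
    }
}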
Then in your context you would define:
<batch:tasklet>
    <batch:chunk chunk-completion-policy="countrySpecificCompletionPolicy" reader="countrySpecificCompletionPolicy" writer="someWriter" />
</batch:tasklet>

<bean id="countrySpecificCompletionPolicy" class="CountryPeekingCompletionPolicyReader">
    <property name="delegate" ref="peekableReader" />
</bean>

<bean id="peekableReader" class="YourPeekableItemReader" />
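To see the chunk-boundary rule on its own, here is a minimal, dependency-free sketch of the same read-then-peek logic. It is not Spring Batch code: PeekChunker and chunkByKey are hypothetical names, the list index stands in for the PeekableItemReader, and it assumes the input is already ordered so that all items for one country are adjacent.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Hypothetical stand-in for the peek-driven completion policy (no Spring types).
final class PeekChunker {

    // Splits items into chunks, closing a chunk whenever the next item
    // has a different key than the item that was just read.
    // Assumes items contains no nulls and is grouped by key.
    static <T, K> List<List<T>> chunkByKey(List<T> items, Function<T, K> keyOf) {
        List<List<T>> chunks = new ArrayList<>();
        List<T> current = new ArrayList<>();
        for (int i = 0; i < items.size(); i++) {
            T item = items.get(i);                                      // plays the role of read()
            current.add(item);
            T next = (i + 1 < items.size()) ? items.get(i + 1) : null;  // plays the role of peek()
            boolean complete = next == null
                    || !keyOf.apply(item).equals(keyOf.apply(next));
            if (complete) {
                chunks.add(current);
                current = new ArrayList<>();
            }
        }
        return chunks;
    }
}
```

Fed the items DE:Deutschland, DE:Germany, FR:France with the two-letter prefix as the key, this yields one chunk holding both DE items and a second chunk holding the FR item, which is the grouping the completion policy above is meant to produce per commit.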
Edit: Thinking back over your issue, partitioning strikes me as the cleanest approach. Using a partitioned step, each ItemReader (make sure it has scope="step") will be passed a single countryName from the step execution context. Yes, you'll need a custom Partitioner class to build up your map of execution contexts (one entry per country) and a hard-coded commit interval large enough to accommodate your largest unit of work, but after that everything is very boilerplate. Since each slave step will only be a single chunk, restart should be relatively painless for any countries that hit issues.
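As a rough sketch of what the partitioner has to build, the snippet below produces one context per country using plain maps. CountryPartitions, the "partition-" key prefix, and the way the country list is supplied are all assumptions for illustration. In a real job the class would implement Spring Batch's Partitioner interface, whose partition(int gridSize) method returns a Map<String, ExecutionContext>, and each entry would be filled with ExecutionContext.putString("countryName", ...).

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical, dependency-free stand-in for a custom Partitioner.
final class CountryPartitions {

    // Builds one context per country; the inner map stands in for an ExecutionContext.
    static Map<String, Map<String, Object>> partition(List<String> countryNames) {
        Map<String, Map<String, Object>> partitions = new LinkedHashMap<>();
        for (String countryName : countryNames) {
            Map<String, Object> context = new LinkedHashMap<>();
            context.put("countryName", countryName);              // read later by the step-scoped reader
            partitions.put("partition-" + countryName, context);  // partition names must be unique
        }
        return partitions;
    }
}
```

The step-scoped reader can then pick up its country through late binding, for example with an expression such as #{stepExecutionContext['countryName']} on one of its properties.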