I have a custom item reader that transforms lines from a text file into my entity:
public class EntityItemReader extends AbstractItemStreamItemReader<Entity>
I do not think you can have your cake and eat it too in this case (and after reading all the comments). My best suggestion would be (as others said) to throw a custom exception and skip on it. You can maybe optimize your entity creation or other processing elsewhere so you don't lose so much performance. Good luck.
For skipping lines you can throw an exception for the lines you want to skip, like below.
My Spring Batch step
@Bean
Step processStep() throws Exception {
    return stepBuilderFactory.get("job step")
            .<String, String>chunk(1000)
            .reader(fileItemReader())        // the reader bean defined below
            .writer(dataWriter)              // your ItemWriter bean
            .faultTolerant()                 // allow Spring Batch to skip items
            .skipLimit(1000)                 // maximum number of skips before the step fails
            .skip(CustomException.class)     // skip an item when this exception is thrown
            .build();
}
My item reader
@Bean(name = "reader")
public FlatFileItemReader<String> fileItemReader() throws Exception {
    FlatFileItemReader<String> reader = new FlatFileItemReader<>();
    reader.setResource(resourceLoader.getResource("c://file_location/file.txt"));
    CustomLineMapper lineMapper = new CustomLineMapper();
    reader.setLineMapper(lineMapper);
    return reader;
}
My custom line mapper
public class CustomLineMapper implements LineMapper<String> {

    @Override
    public String mapLine(String line, int lineNumber) throws Exception {
        if (condition) { // put your condition here, for the lines you want to skip
            throw new CustomException();
        }
        return line;
    }
}
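The step configuration and line mapper above reference a `CustomException` that is not shown. A minimal version could look like this (the class name matches the snippets above; the two constructors are an assumption, any exception type registered via `.skip(...)` works):

```java
// Minimal checked exception used to signal "skip this line".
// Any exception class passed to .skip(...) on the step builder would do.
public class CustomException extends Exception {

    public CustomException() {
        super();
    }

    public CustomException(String message) {
        super(message);
    }
}
```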
I think the good practice for filtering some lines is to use not the reader but a processor (in which you can return null when you want to filter out a line).
Please see http://docs.spring.io/spring-batch/trunk/reference/html/readersAndWriters.html :
6.3.2 Filtering Records
One typical use for an item processor is to filter out records before they are passed to the ItemWriter. Filtering is an action distinct from skipping; skipping indicates that a record is invalid whereas filtering simply indicates that a record should not be written.
For example, consider a batch job that reads a file containing three different types of records: records to insert, records to update, and records to delete. If record deletion is not supported by the system, then we would not want to send any "delete" records to the ItemWriter. But, since these records are not actually bad records, we would want to filter them out, rather than skip. As a result, the ItemWriter would receive only "insert" and "update" records.
To filter a record, one simply returns "null" from the ItemProcessor. The framework will detect that the result is "null" and avoid adding that item to the list of records delivered to the ItemWriter. As usual, an exception thrown from the ItemProcessor will result in a skip.
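To make the quoted passage concrete, a filtering processor might look like the sketch below. The class name `DeleteAwareProcessor` and the record format (lines starting with `"DELETE"`) are assumptions for illustration; in a real job the class would implement Spring Batch's `org.springframework.batch.item.ItemProcessor<String, String>`, the plain method here just keeps the example self-contained:

```java
// Sketch of a filtering processor: returning null drops the item so it
// never reaches the ItemWriter (filtered, not skipped).
public class DeleteAwareProcessor {

    public String process(String record) {
        if (record.startsWith("DELETE")) {
            return null; // filter: the framework silently drops null results
        }
        return record;   // pass "insert"/"update" records through unchanged
    }
}
```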
I've had a similar problem in a more general case, with a custom reader backed by an iterator over one object type that returns a new item (of a different type) for each object read. The problem is that some of those objects don't map to anything, so I'd like to return something that marks that.
Eventually I decided to define an INVALID_ITEM and return that. Another approach could be to advance the iterator inside the read() method until the next valid item, returning null if .hasNext() becomes false, but that is more cumbersome.
Initially I also tried to throw a custom exception and tell Spring to skip the item upon it, but it seemed to be ignored, so I gave up (and if there are too many invalid items it isn't performant anyway).
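The sentinel approach described above might be sketched as follows. All names here are assumptions, and the two static methods stand in for the reader's mapping logic and a downstream filtering processor (which in a real job would implement Spring Batch's `ItemReader` and `ItemProcessor` interfaces):

```java
// Sketch of the INVALID_ITEM sentinel: the reader returns one shared marker
// instance for unmappable input, and a downstream processor filters it out
// by identity comparison, returning null so the item is dropped.
public class SentinelExample {

    // a dedicated instance, deliberately distinct from any interned literal
    static final String INVALID_ITEM = new String("INVALID");

    // stand-in for the reader's mapping logic: "unmappable" lines
    // become the sentinel instead of throwing
    static String mapOrInvalid(String raw) {
        return raw.isEmpty() ? INVALID_ITEM : raw.toUpperCase();
    }

    // processor step: null means "filter this item out"
    static String filterSentinel(String item) {
        return item == INVALID_ITEM ? null : item;
    }
}
```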
We can handle it via a custom dummy object.
private static MyClass DUMMY_MY_CLASS_OBJECT;

private MyClass() {
    // create a blank object
}

public static MyClass getDummyMyClassObject() {
    if (DUMMY_MY_CLASS_OBJECT == null) {
        DUMMY_MY_CLASS_OBJECT = new MyClass();
    }
    return DUMMY_MY_CLASS_OBJECT;
}
And just return the following whenever you need to skip a record in the reader:
return MyClass.getDummyMyClassObject();
The same object can then be ignored in the processor, by checking whether the item is the dummy instance (or according to whatever logic the private default constructor sets up).
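A processor-side check for the dummy object could look like the sketch below. This is a self-contained illustration, not the code from the answer above: the singleton here mirrors the `MyClass` pattern so the snippet compiles on its own, and the static `filter` method stands in for an `ItemProcessor` returning null to drop the dummy:

```java
// Self-contained sketch: a lazily created dummy singleton plus a
// processor-style check that drops the dummy by identity comparison.
public class DummyFilterExample {

    private static DummyFilterExample dummy;

    private DummyFilterExample() {
        // blank object, created only through getDummy()
    }

    public static DummyFilterExample getDummy() {
        if (dummy == null) {
            dummy = new DummyFilterExample();
        }
        return dummy;
    }

    // processor step: return null for the dummy so it is filtered out,
    // pass every real item through unchanged
    public static DummyFilterExample filter(DummyFilterExample item) {
        return item == getDummy() ? null : item;
    }
}
```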