I'm trying to write a large number of rows (~2 million) from a database to a CSV file using SuperCSV. I need to perform validation on each cell as it is written, and the built-in CellProcessors do very nicely. I want to capture all the exceptions that are thrown by the CellProcessors so that I can go back to the source data and make changes.
The problem is that when there are multiple errors in a single row (e.g. the first value is out of range, the second value is null but shouldn't be), the write fails at the first CellProcessor that throws, so I only see one of the errors. I want to process the whole file in a single pass, and have a complete set of exceptions at the end of it.
This is the kind of approach I'm trying:
for (Row row : rows) {
    try {
        csvBeanWriter.write(row, HEADER_MAPPINGS, CELL_PROCESSORS);
    } catch (SuperCsvCellProcessorException e) {
        log(e);
    }
}
How can I achieve this? Thanks!
EDIT: Here is the code I wrote that's similar to Hound Dog's, in case it helps anyone:
import java.util.List;

import org.supercsv.cellprocessor.CellProcessorAdaptor;
import org.supercsv.cellprocessor.ift.CellProcessor;
import org.supercsv.exception.SuperCsvCellProcessorException;
import org.supercsv.util.CsvContext;

public class ExceptionCapturingCellProcessor extends CellProcessorAdaptor {

    private final List<Exception> exceptions;
    private final CellProcessor current;

    public ExceptionCapturingCellProcessor(CellProcessor current, CellProcessor next, List<Exception> exceptions) {
        super(next);
        this.exceptions = exceptions;
        this.current = current;
    }

    @Override
    public Object execute(Object value, CsvContext context) {
        // Check input is not null
        try {
            validateInputNotNull(value, context);
        } catch (SuperCsvCellProcessorException e) {
            exceptions.add(e);
        }
        // Execute wrapped CellProcessor
        try {
            current.execute(value, context);
        } catch (SuperCsvCellProcessorException e) {
            exceptions.add(e);
        }
        return next.execute(value, context);
    }
}
I'd recommend writing a custom CellProcessor to achieve this. The following processor can be placed at the start of each CellProcessor chain - it will simply delegate to the processor chained after it, and will suppress any cell processing exceptions.
package example;

import java.util.ArrayList;
import java.util.List;

import org.supercsv.cellprocessor.CellProcessorAdaptor;
import org.supercsv.cellprocessor.ift.CellProcessor;
import org.supercsv.exception.SuperCsvCellProcessorException;
import org.supercsv.util.CsvContext;

public class SuppressException extends CellProcessorAdaptor {

    public static List<SuperCsvCellProcessorException> SUPPRESSED_EXCEPTIONS =
            new ArrayList<SuperCsvCellProcessorException>();

    public SuppressException(CellProcessor next) {
        super(next);
    }

    public Object execute(Object value, CsvContext context) {
        try {
            // attempt to execute the next processor
            return next.execute(value, context);
        } catch (SuperCsvCellProcessorException e) {
            // save the exception
            SUPPRESSED_EXCEPTIONS.add(e);
            // and suppress it (null is written as "")
            return null;
        }
    }
}
And here it is in action:
package example;

import java.io.StringWriter;
import java.util.Arrays;
import java.util.List;

import org.supercsv.cellprocessor.constraint.NotNull;
import org.supercsv.cellprocessor.constraint.StrMinMax;
import org.supercsv.cellprocessor.ift.CellProcessor;
import org.supercsv.exception.SuperCsvCellProcessorException;
import org.supercsv.io.CsvBeanWriter;
import org.supercsv.io.ICsvBeanWriter;
import org.supercsv.prefs.CsvPreference;

public class TestSuppressExceptions {

    private static final CellProcessor[] PROCESSORS = {
            new SuppressException(new StrMinMax(0, 4)),
            new SuppressException(new NotNull()) };

    private static final String[] HEADER = { "name", "age" };

    public static void main(String[] args) throws Exception {
        final StringWriter stringWriter = new StringWriter();
        ICsvBeanWriter beanWriter = null;
        try {
            beanWriter = new CsvBeanWriter(stringWriter,
                    CsvPreference.STANDARD_PREFERENCE);
            beanWriter.writeHeader(HEADER);

            // set up the data
            Person valid = new Person("Rick", 43);
            Person nullAge = new Person("Lori", null);
            Person totallyInvalid = new Person("Shane", null);
            Person valid2 = new Person("Carl", 12);
            List<Person> people = Arrays.asList(valid, nullAge, totallyInvalid,
                    valid2);

            for (Person person : people) {
                beanWriter.write(person, HEADER, PROCESSORS);
                if (!SuppressException.SUPPRESSED_EXCEPTIONS.isEmpty()) {
                    System.out.println("Suppressed exceptions for row "
                            + beanWriter.getRowNumber() + ":");
                    for (SuperCsvCellProcessorException e :
                            SuppressException.SUPPRESSED_EXCEPTIONS) {
                        System.out.println(e);
                    }
                    // clear ready for next row
                    SuppressException.SUPPRESSED_EXCEPTIONS.clear();
                }
            }
        } finally {
            // guard against a failure before the writer was created
            if (beanWriter != null) {
                beanWriter.close();
            }
        }

        // CSV will have empty columns for invalid data
        System.out.println(stringWriter);
    }
}
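The example above refers to a Person bean that isn't shown. A minimal sketch of what it might look like follows; the field types are an assumption inferred from the constructor calls (age has to be a nullable Integer for the null cases to compile), and the getter names follow the "name" and "age" entries in HEADER, which CsvBeanWriter resolves to getName() and getAge().

```java
// Hypothetical bean for the example above; the original answer does not show it.
// To use it with TestSuppressExceptions, add "package example;" at the top.
public class Person {

    private final String name;
    private final Integer age; // Integer rather than int, so that age can be null

    public Person(String name, Integer age) {
        this.name = name;
        this.age = age;
    }

    // read by CsvBeanWriter for the "name" column
    public String getName() {
        return name;
    }

    // read by CsvBeanWriter for the "age" column
    public Integer getAge() {
        return age;
    }
}
```

The class is public because the bean writer calls the getters reflectively.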
Here's the suppressed exceptions output (row 4 has two exceptions, one for each column):
Suppressed exceptions for row 3:
org.supercsv.exception.SuperCsvConstraintViolationException: null value
encountered processor=org.supercsv.cellprocessor.constraint.NotNull
context={lineNo=3, rowNo=3, columnNo=2, rowSource=[Lori, null]}
Suppressed exceptions for row 4:
org.supercsv.exception.SuperCsvConstraintViolationException: the length (5)
of value 'Shane' does not lie between the min (0) and max (4) values (inclusive)
processor=org.supercsv.cellprocessor.constraint.StrMinMax
context={lineNo=4, rowNo=4, columnNo=2, rowSource=[Shane, null]}
org.supercsv.exception.SuperCsvConstraintViolationException: null value
encountered processor=org.supercsv.cellprocessor.constraint.NotNull
context={lineNo=4, rowNo=4, columnNo=2, rowSource=[Shane, null]}
And the CSV output:
name,age
Rick,43
Lori,
,
Carl,12
Notice how the invalid values were written as "" because the SuppressException processor returned null for those values (not that you'd use the CSV output anyway, as it's not valid!).
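Since the question asks for a complete set of errors at the end of the run, one option is to copy each row's suppressed exceptions into a per-row report before clearing the list, rather than only printing them. The following is a library-independent sketch of such a report; RowErrorReport and its methods are hypothetical names, not part of Super CSV.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper (not part of Super CSV): groups error messages by row number.
final class RowErrorReport {

    // LinkedHashMap keeps rows in the order they were first recorded
    private final Map<Integer, List<String>> errorsByRow =
            new LinkedHashMap<Integer, List<String>>();

    // Record one error message against a row.
    void record(int rowNumber, String message) {
        List<String> errors = errorsByRow.get(rowNumber);
        if (errors == null) {
            errors = new ArrayList<String>();
            errorsByRow.put(rowNumber, errors);
        }
        errors.add(message);
    }

    boolean isEmpty() {
        return errorsByRow.isEmpty();
    }

    // Total number of recorded errors across all rows.
    int errorCount() {
        int count = 0;
        for (List<String> errors : errorsByRow.values()) {
            count += errors.size();
        }
        return count;
    }

    // Read-only view of the errors, keyed by row number.
    Map<Integer, List<String>> byRow() {
        return Collections.unmodifiableMap(errorsByRow);
    }
}
```

Inside the write loop you would call something like report.record(beanWriter.getRowNumber(), e.getMessage()) for each suppressed exception before clearing the list. With around 2 million rows, storing messages instead of whole exception objects should keep memory use lower if many rows fail.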
Source: https://stackoverflow.com/questions/13646982/validate-every-field-in-a-single-pass-with-supercsv