Only partially parse a CSV file with OpenCSV

Question

I have a CSV file that I want to parse in Java with OpenCSV's CSVReader.

To do so, I created a bean class to which the information is mapped. Mine is a bit long, so here is an example I got from a tutorial:

package net.viralpatel.java;

public class Country {
    private String countryName;
    private String capital;

    public String getCountryName() {
        return countryName;
    }

    public void setCountryName(String countryName) {
        this.countryName = countryName;
    }

    public String getCapital() {
        return capital;
    }

    public void setCapital(String capital) {
        this.capital = capital;
    }
}

The code I use to parse my CSV file and map the information to the bean resembles this:

ColumnPositionMappingStrategy strat = new ColumnPositionMappingStrategy();
strat.setType(Country.class);
String[] columns = new String[] {"countryName", "capital"};
strat.setColumnMapping(columns);

CsvToBean csv = new CsvToBean();

String csvFilename = "C:\\sample.csv";
CSVReader csvReader = new CSVReader(new FileReader(csvFilename));

List list = csv.parse(strat, csvReader);

The problem is that my CSV contains not only raw data but also column titles and other data. For the column titles, I solved the issue by starting to read the file at a given line:

CSVReader csvReader = new CSVReader(new FileReader(csvFilename), ';', '\'', 1);

(1 being the number of lines skipped before reading starts)

The other data is mostly strings in (for example) integer columns at the end of the file.

For example, I have a "Max Speed" column with integer values, right next to a "Distance" column that also holds integers. But at the end of the "Distance" column there is the total distance, so the String "total:" sits in the "Max Speed" column right next to it.

What can I do to ensure that the reader ignores these last lines and only reads the raw information above?

PS: the CSV files I read have different lengths, so saying "stop reading after line X" won't do the trick. On the other hand, the "appendix" lines are always the same, so saying "stop reading two lines before the end of the file" should work.

Thank you very much for your help.

Answer 1:

You can always drop down to a lower level and check the raw string array before mapping it to a bean, like this:

ColumnPositionMappingStrategy<Country> strat = new ColumnPositionMappingStrategy<Country>();
strat.setType(Country.class);
String[] columns = new String[] {"countryName", "capital"};
strat.setColumnMapping(columns);

PublicProcessLineCsvToBean<Country> csv = new PublicProcessLineCsvToBean<Country>();

String csvFilename = "C:\\sample.csv";
CSVReader csvReader = new CSVReader(new FileReader(csvFilename));
List<Country> list = new ArrayList<Country>();

String[] row = csvReader.readNext(); // skip the header line
if (row == null) throw new RuntimeException("File is empty");
row = csvReader.readNext();              // first data row
String[] nextRow = csvReader.readNext(); // one-row lookahead
while (row != null) {
    if (nextRow == null) break; // 'row' is the last line of the file, so skip it
    // stop at the summary row; the length check avoids an exception on short rows
    if (row.length > 1 && "Total:".equalsIgnoreCase(row[1])) break;

    list.add(csv.processLine(strat, row));

    row = nextRow;
    nextRow = csvReader.readNext();
}

To make processLine public, subclass CsvToBean:

public static class PublicProcessLineCsvToBean<T> extends CsvToBean<T> {

    @Override
    public T processLine(MappingStrategy<T> mapper, String[] line) throws IllegalAccessException, InvocationTargetException, InstantiationException, IntrospectionException {
        return super.processLine(mapper, line);
    }
}
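
The one-row lookahead above can be generalized to "stop N lines before the end of the file", which is what the question's PS asks for. The following is only a sketch in plain Java and does not use opencsv at all: TrailingRowSkipper and dropLast are hypothetical names, and the rows are assumed to arrive as an Iterator<String[]> (for example, collected from repeated readNext() calls).

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.Iterator;
import java.util.List;

/** Hypothetical helper (not part of opencsv): keeps every row except the last n. */
final class TrailingRowSkipper {

    private TrailingRowSkipper() {}

    /**
     * Buffers up to n rows. A row is released to the result only once n newer
     * rows have been seen, so the final n rows of the input are never released.
     */
    static List<String[]> dropLast(Iterator<String[]> rows, int n) {
        if (n < 0) {
            throw new IllegalArgumentException("n must be >= 0");
        }
        Deque<String[]> pending = new ArrayDeque<>();
        List<String[]> kept = new ArrayList<>();
        while (rows.hasNext()) {
            pending.addLast(rows.next());
            if (pending.size() > n) {
                // The oldest buffered row now has n rows after it: safe to keep.
                kept.add(pending.removeFirst());
            }
        }
        return kept; // whatever is still in 'pending' is the trailing appendix
    }
}
```

Each kept row could then be handed to processLine as in the loop above. With n = 2 the two "appendix" lines are dropped regardless of file length, and only n rows are held in the buffer at any time.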

Answer 2:

If you are using a newer version of opencsv, inject a CsvToBeanFilter into your CsvToBean. The opencsv javadoc gives an excellent example of how to create a filter. For your case, you would just create a filter whose allowLine method returns false if the Max Speed value is null, empty or "total:".
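
As a rough illustration of the check such a filter would perform, here is a sketch in plain Java. SummaryRowCheck is a hypothetical helper and the column index is an assumption about the file layout. In a real project, the body of allowLine would live in a class implementing opencsv's CsvToBeanFilter, whose allowLine(String[] line) method takes only the line. How the filter is attached to CsvToBean depends on the opencsv version, so check the javadoc for the one you use.

```java
/**
 * Hypothetical helper (not part of opencsv): decides whether a raw CSV line
 * should be mapped to a bean, based on the value in one column.
 */
final class SummaryRowCheck {

    private SummaryRowCheck() {}

    /**
     * Returns false when the given column is missing, null, blank,
     * or contains the marker "total:" (case-insensitive).
     */
    static boolean allowLine(String[] line, int columnIndex) {
        if (line == null || columnIndex < 0 || columnIndex >= line.length) {
            return false;
        }
        String value = line[columnIndex];
        if (value == null) {
            return false;
        }
        String trimmed = value.trim();
        return !trimmed.isEmpty() && !trimmed.equalsIgnoreCase("total:");
    }
}
```

Unlike the "stop N lines before the end" approach, this check does not depend on where the summary rows sit in the file, only on what the chosen column contains.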

Source: https://stackoverflow.com/questions/35695348/only-partialially-parse-a-csv-file-with-opencsv

Tags: java, parsing, csv, partial, opencsv