My problem is as follows:
A CSV file is downloaded at given intervals from a server.
The file has to be parsed (there are unnecessary spaces that have to be trimmed out
I think the best way in your case is to use statement batching. Here is an example:
sqlConnection.setAutoCommit(false); //<<-- disable auto-commit so the whole batch is one transaction
try {
    preparedStatement = sqlConnection.prepareStatement(sqlString);
    while ((csvLine = bufferedReader.readLine()) != null) {
        String[] splitLine = csvLine.split(";");
        preparedStatement.setBigDecimal(1, new BigDecimal(splitLine[4].trim()).setScale(2, RoundingMode.CEILING));
        preparedStatement.setBigDecimal(2, new BigDecimal(splitLine[5].trim()).setScale(2, RoundingMode.CEILING));
        preparedStatement.setInt(3, Integer.parseInt(splitLine[6].trim()));
        preparedStatement.setString(4, splitLine[2].trim());
        preparedStatement.setString(5, splitLine[8].trim());
        preparedStatement.addBatch(); //<<-- add a batch
    }
    // execute your multiple statements as one
    preparedStatement.executeBatch(); //<<--
    sqlConnection.commit(); //<<--
} catch (SQLException e) {
    sqlConnection.rollback(); // undo the partial batch on failure
    throw e;
}
EDIT
As @Mick Mnemonic mentioned in a comment:
You could try whether splitting into smaller batches of, say, 500 rows makes any difference.
So instead of executing your batch in one shot, you can split it into smaller batches, for example:
sqlConnection.setAutoCommit(false);
try {
    int batchSize = 500;
    int count = 0;
    preparedStatement = sqlConnection.prepareStatement(sqlString);
    while ((csvLine = bufferedReader.readLine()) != null) {
        // set your parameters here, as above
        preparedStatement.addBatch();
        count++;
        if (count % batchSize == 0) {
            preparedStatement.executeBatch(); // flush every 500 rows
        }
    }
    // execute the rest of the batch
    preparedStatement.executeBatch();
    sqlConnection.commit();
} catch (SQLException e) {
    sqlConnection.rollback();
    throw e;
}
There is a much faster and much better way to import your data without a single line of Java code: LOAD DATA INFILE.
Note that LOAD DATA can do some amount of preprocessing of your CSV. However, it is not adequate in some situations. In that case you would have to get help from Java, but Java shouldn't be doing the inserts.
Your Java code should preprocess the CSV and produce another CSV which can be passed to LOAD DATA. 70,000 records is a breeze.
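A minimal sketch of that preprocessing step, assuming the same semicolon-separated format as in the first answer (the file names `raw.csv` and `clean.csv` are hypothetical):

```java
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.FileWriter;
import java.io.IOException;
import java.io.PrintWriter;

public class CsvPreprocessor {

    // Trim the unnecessary spaces around each semicolon-separated field.
    public static String cleanLine(String csvLine) {
        String[] fields = csvLine.split(";", -1); // -1 keeps trailing empty fields
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < fields.length; i++) {
            if (i > 0) sb.append(';');
            sb.append(fields[i].trim());
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        // Read the downloaded file, write a cleaned copy for LOAD DATA INFILE.
        try (BufferedReader in = new BufferedReader(new FileReader("raw.csv"));
             PrintWriter out = new PrintWriter(new FileWriter("clean.csv"))) {
            String line;
            while ((line = in.readLine()) != null) {
                out.println(cleanLine(line));
            }
        }
    }
}
```

The cleaned file can then be loaded in one shot with LOAD DATA INFILE, so the database does the bulk insert instead of Java.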
LOAD DATA does not support the ON DUPLICATE KEY UPDATE syntax. But it does have the REPLACE syntax (which is slightly slower than a duplicate-key update, but it would still be at least an order of magnitude faster than what Java or any other programming language can give you).
If doing an actual update is important, LOAD DATA into a temporary table and then run
INSERT INTO mytable SELECT * FROM temp_table ON DUPLICATE KEY ....
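The full temporary-table flow could look like the sketch below (table and column names here are hypothetical; adjust the field terminator and column list to your schema):

```sql
-- Stage the cleaned CSV in a temp table with the same structure as the target.
CREATE TEMPORARY TABLE temp_table LIKE mytable;

LOAD DATA INFILE '/path/to/clean.csv'
INTO TABLE temp_table
FIELDS TERMINATED BY ';'
LINES TERMINATED BY '\n';

-- Upsert from the staging table into the real table.
INSERT INTO mytable
SELECT * FROM temp_table
ON DUPLICATE KEY UPDATE
    price = VALUES(price),  -- repeat for each column you want refreshed
    qty   = VALUES(qty);

DROP TEMPORARY TABLE temp_table;
```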