问题
There is a MySQL table, Users on a Server. It has 28 rows and 1 million records (It may increase as well). I want to fetch all rows from this table, do some manipulation on them and then want to add them to MongoDB. I know that it will take lots of time to retrieve these records through simple 'Select * from Users' operation. I have been doing this in Java, JDBC. So, the options I got from my research is:
Option 1. Do batch processing : My plan was to get the total number of rows from the table, ie. select count(*) from users. Then, set a fetch size of say 1000 (setFetchSize(1000)). After that I was stuck. I did not know if I can write something like this:
Connection conn = DriverManager.getConnection(connectionUrl, userName,passWord);
Statement stmt =conn.createStatement(java.sql.ResultSet.TYPE_FORWARD_ONLY,java.sql.ResultSet.CONCUR_READ_ONLY);
String query="select * from users";
ResultSet resultSet=stmt.executeQuery(query);
My doubt was whether resultSet will have 1000 entries once I execute the query and should I repeatedly do the operation till all records are retrieved.
I dropped the plan because, I understand that for MySQL, ResultSet is fully populated at once and batching might not work. This stackoverflow discussion and MySQL documentation helped out.
Option 2. Do pagination: My idea is that I will set a Limit which will tell starting index for fetching and offset for fetching. May be, set the offset as 1000 and iterate over the index.
I read a suggested article link, but did not find any loop holes in approaching this problem using Limit.
Anybody who is kind enough and patient enough to read this long post, could you please share your valuable opinions on my thought process and correct me if there is something wrong or missing.
回答1:
Answering my own question based on the research I did:
Batching is not really effective for select queries, especially if you want to use the resultset of each query operation.
Pagination - Good if you want to improve the memory efficiency, not for improving speed of execution. Speed comes down as you fire multiple queries with Limit, as every time JDBC has to connect to MySQL.
来源:https://stackoverflow.com/questions/25790613/fetching-large-number-of-records-from-mysql-through-java