问题
I have a lot of rows in a database and it must be processed, but I can't retrieve all the data to the memory due to memory limitations.
At the moment, I using LIMIT and OFFSET to retrieve the data to get the data in some especified interval.
I want to know if the is the faster way or have another method to getting all the data from a table in database. None filter will be aplied, all the rows will be processed.
回答1:
SELECT * FROM table ORDER BY column
There's no reason to be sucking the entire table in to RAM. Simply open a cursor and start reading. You can play games with fetch sizes and what not, but the DB will happily keep its place while you process your rows.
Addenda:
Ok, if you're using Java then I have a good idea what your problem is.
First, just by using Java, you're using a cursor. That's basically what a ResultSet is in Java. Some ResultSets are more flexible than others, but 99% of them are simple, forward only ResultSets that you call 'next' upon to get each row.
Now as to your problem.
The problem is specifically with the Postgres JDBC driver. I don't know why they do this, perhaps it's spec, perhaps it's something else, but regardless, Postgres has the curious characteristic that if your Connection has autoCommit set to true, then Postgres decides to suck in the entire result set on either the execute method or the first next method. Not really important as to where, only that if you have a gazillion rows, you get a nice OOM exception. Not helpful.
This can easily be exactly what you're seeing, and I appreciate how it can be quite frustrating and confusing.
Most Connection default to autoCommit = true. Instead, simply set autoCommit to false.
Connection con = ...get Connection...
con.setAutoCommit(false);
PreparedStatement ps = con.prepareStatement("SELECT * FROM table ORDER BY columm");
ResultSet rs = ps.executeQuery();
while(rs.next()) {
String col1 = rs.getString(1);
...and away you go here...
}
rs.close();
ps.close();
con.close();
Note the distinct lack of exception handling, left as an exercise for the reader.
If you want more control over how many rows are fetched at a time into memory, you can use:
ps.setFetchSize(numberOfRowsToFetch);
Playing around with that might improve your performance.
Make sure you have an appropriate index on the column you use in the ORDER BY if you care about sequencing at all.
回答2:
Since its clear your using Java based on your comments:
If you are using JDBC you will want to use: http://download.oracle.com/javase/1.5.0/docs/api/java/sql/ResultSet.html
If you are using Hibernate it gets trickier: http://docs.jboss.org/hibernate/core/3.3/reference/en/html/batch.html
来源:https://stackoverflow.com/questions/6751690/what-is-the-fastest-way-to-retrieve-sequential-data-from-database