JPA: what is the proper pattern for iterating over large result sets?


Let's say I have a table with millions of rows. Using JPA, what's the proper way to iterate over a query against that table, such that I don't have all an in-memo


15 answers

忘了有多久

2020-11-27 10:22

I tried the answers presented here, but JBoss 5.1 + MySQL Connector/J 5.1.15 + Hibernate 3.3.2 didn't work with those. We've just migrated from JBoss 4.x to JBoss 5.1, so we've stuck with it for now, and thus the latest Hibernate we can use is 3.3.2.

Adding a couple of extra parameters did the job, and code like this runs without OOMEs:

    StatelessSession session = ((Session) entityManager.getDelegate())
            .getSessionFactory().openStatelessSession();

    Query query = session
            .createQuery("SELECT a FROM Address a WHERE .... ORDER BY a.id");
    query.setFetchSize(Integer.valueOf(1000));
    query.setReadOnly(true);
    query.setLockMode("a", LockMode.NONE);
    ScrollableResults results = query.scroll(ScrollMode.FORWARD_ONLY);
    while (results.next()) {
        Address addr = (Address) results.get(0);
        // Do stuff
    }
    results.close();
    session.close();

The crucial lines are the query parameters between createQuery and scroll. Without them the "scroll" call tries to load everything into memory and either never finishes or runs to OutOfMemoryError.
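The snippet above closes the results and the session only if processing finishes normally. A minimal sketch of the same forward-only loop with guaranteed cleanup follows. The `Cursor` interface and `ListCursor` class are hypothetical in-memory stand-ins, not Hibernate types; with Hibernate the `finally` block would hold the `results.close()` and `session.close()` calls.

```java
import java.util.Iterator;
import java.util.List;
import java.util.function.Consumer;

class CursorCleanupSketch {

    // Hypothetical stand-in for a forward-only scrollable result; not a Hibernate type.
    interface Cursor<T> {
        boolean next();
        T get();
        void close();
    }

    // Visits each row once and always closes the cursor, even if process throws.
    static <T> int drain(Cursor<T> cursor, Consumer<T> process) {
        int rows = 0;
        try {
            while (cursor.next()) {
                process.accept(cursor.get());
                rows++;
            }
            return rows;
        } finally {
            cursor.close();
        }
    }

    // In-memory cursor used only to exercise drain().
    static final class ListCursor<T> implements Cursor<T> {
        private final Iterator<T> it;
        private T current;
        boolean closed;

        ListCursor(List<T> rows) {
            this.it = rows.iterator();
        }

        public boolean next() {
            if (!it.hasNext()) return false;
            current = it.next();
            return true;
        }

        public T get() {
            return current;
        }

        public void close() {
            closed = true;
        }
    }
}
```

Only one row is referenced at a time inside `drain`, which is the property the scrolling approach relies on.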


Happy的楠姐

2020-11-27 10:24

You can't really do this in straight JPA; however, Hibernate supports stateless sessions and scrollable result sets.

We routinely process billions of rows with its help.

Here is a link to documentation: http://docs.jboss.org/hibernate/core/3.3/reference/en/html/batch.html#batch-statelesssession


不知归路

2020-11-27 10:29

Here's a simple, straight JPA example (in Kotlin) that shows how you can paginate over an arbitrarily large result set, reading chunks of 100 items at a time, without using a cursor (each cursor consumes resources on the database). It uses keyset pagination.

See https://use-the-index-luke.com/no-offset for the concept of keyset pagination, and https://www.citusdata.com/blog/2016/03/30/five-ways-to-paginate/ for a comparison of different ways to paginate along with their drawbacks.

    /*
    create table my_table(
      id int primary key, -- index will be created
      my_column varchar
    )
    */
    fun keysetPaginationExample() {
        var lastId = Integer.MIN_VALUE
        do {
            val someItems = myRepository.findTop100ByMyTableIdAfterOrderByMyTableId(lastId)
            if (someItems.isEmpty()) break
            lastId = someItems.last().myTableId
            for (item in someItems) {
                process(item)
            }
        } while (true)
    }
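For readers on Java, the same keyset loop can be sketched without any repository. This is only an illustration: a `TreeMap` stands in for the indexed table, and `findIdsAfter` is a hypothetical helper playing the role of a query shaped like `WHERE id > :lastId ORDER BY id` with a row limit.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableMap;
import java.util.function.Consumer;

class KeysetPaginationSketch {

    // Hypothetical equivalent of "WHERE id > :lastId ORDER BY id" limited to `limit` rows.
    static List<Integer> findIdsAfter(NavigableMap<Integer, String> table, int lastId, int limit) {
        List<Integer> page = new ArrayList<>();
        // tailMap(lastId, false) is the sorted view of keys strictly greater than lastId.
        for (Integer id : table.tailMap(lastId, false).keySet()) {
            if (page.size() == limit) break;
            page.add(id);
        }
        return page;
    }

    // Walks the whole table one page at a time; returns the number of non-empty pages read.
    static int forEachId(NavigableMap<Integer, String> table, int pageSize, Consumer<Integer> process) {
        // As in the Kotlin version, a row whose id equals Integer.MIN_VALUE would be skipped.
        int lastId = Integer.MIN_VALUE;
        int pages = 0;
        while (true) {
            List<Integer> page = findIdsAfter(table, lastId, pageSize);
            if (page.isEmpty()) return pages;
            pages++;
            page.forEach(process);
            // The last id of this page becomes the lower bound of the next one.
            lastId = page.get(page.size() - 1);
        }
    }
}
```

Each page starts from the last key seen rather than from a row offset, so the cost of a page does not depend on how deep into the table it is.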


情深已故

2020-11-27 10:31

You can use another "trick": load only the identifiers of the entities you're interested in. If the identifier is a long (8 bytes), then a list of 10^6 such identifiers takes around 8 MB. If it is a batch process (one instance at a time), that's bearable. Then just iterate and do the job.

One more remark: you should do this in chunks anyway, especially if you modify records; otherwise the rollback segment in the database will grow.

As for the firstResult/maxRows strategy, it will be VERY VERY slow for results far from the top.

Also take into consideration that the database is probably operating at read committed isolation, so to avoid phantom reads, load the identifiers first and then load the entities one by one (or 10 by 10, or whatever).
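A minimal sketch of the chunking step, assuming the identifiers are already in memory as a `List<Long>`. The class and method names are illustrative only. In a real job each chunk would presumably be loaded with a query along the lines of `WHERE id IN :ids`, processed, and committed before moving on to the next one.

```java
import java.util.ArrayList;
import java.util.List;

class IdChunkingSketch {

    // Splits the id list into consecutive chunks of at most chunkSize elements.
    static List<List<Long>> chunk(List<Long> ids, int chunkSize) {
        if (chunkSize <= 0) throw new IllegalArgumentException("chunkSize must be positive");
        List<List<Long>> chunks = new ArrayList<>();
        for (int from = 0; from < ids.size(); from += chunkSize) {
            int to = Math.min(from + chunkSize, ids.size());
            // Copy the view so each chunk is independent of the source list.
            chunks.add(new ArrayList<>(ids.subList(from, to)));
        }
        return chunks;
    }

    // Raw size in bytes of n 64-bit identifiers; boxed Long objects in a List cost more.
    static long rawBytes(long n) {
        return n * Long.BYTES;
    }
}
```

`rawBytes(1_000_000)` is the 8 MB estimate from the answer, counting only the 8-byte values themselves.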


抹茶落季

2020-11-27 10:31

Use pagination to retrieve the results.


谎友^

2020-11-27 10:34

An example with JPA and a native query that fetches fetchSize elements at a time using offsets:

    public List<X> getXByFetching(int fetchSize) {
        int totalX = getTotalRows(Entity);
        List<X> result = new ArrayList<>();
        for (int offset = 0; offset < totalX; offset = offset + fetchSize) {
            EntityManager entityManager = getEntityManager();
            String sql = getSqlSelect(Entity) + " OFFSET " + offset + " ROWS";
            Query query = entityManager.createNativeQuery(sql, X.class);
            query.setMaxResults(fetchSize);
            result.addAll(query.getResultList());
            // Detach the entities just loaded so the persistence context does not grow.
            entityManager.flush();
            entityManager.clear();
        }
        // Return only after every page has been fetched; note that result still holds every row.
        return result;
    }
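Since collecting every page into one list keeps the whole result in memory, a variation is to hand each page to a callback and let it go. The sketch below is one way to do that, not the answer's own method: `fetchPage` is a hypothetical function taking an offset and a limit, which in JPA might wrap `setFirstResult` and `setMaxResults` on an ordered query.

```java
import java.util.List;
import java.util.function.BiFunction;
import java.util.function.Consumer;

class OffsetPagingSketch {

    // Calls fetchPage(offset, pageSize) repeatedly and passes every row to process.
    // Returns the number of rows processed. Only one page is referenced at a time.
    static <T> int forEachRow(BiFunction<Integer, Integer, List<T>> fetchPage,
                              int pageSize,
                              Consumer<T> process) {
        if (pageSize <= 0) throw new IllegalArgumentException("pageSize must be positive");
        int processed = 0;
        for (int offset = 0; ; offset += pageSize) {
            List<T> page = fetchPage.apply(offset, pageSize);
            page.forEach(process);
            processed += page.size();
            // A short page means the end was reached, so no separate count query is needed.
            if (page.size() < pageSize) return processed;
        }
    }
}
```

The cost of deep offsets mentioned in another answer still applies here; this only addresses memory use on the client side.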
