Question
I need to receive and save a huge amount of data using Spring Data over Hibernate. Our server does not have enough RAM to hold all the entities at the same time, so we would definitely get an OutOfMemoryError.
So we obviously need to save the data in batches. We also need to use @Transactional to be sure that either all of the data is persisted or none of it is, even if only a single error occurs.
So, the question: during a @Transactional method, does Spring Data keep the entities in RAM, or are entities that have already been flushed eligible for garbage collection?
And what is the best approach to processing a huge amount of data with Spring Data? Maybe Spring Data isn't the right tool for problems like this.
Answer 1:
Does Spring Data keep the entities in RAM during a @Transactional method, or are entities that have already been flushed eligible for garbage collection?
The entities stay in RAM (i.e. in the entityManager's persistence context) until the transaction commits or rolls back, or until the entityManager is cleared. That means the entities become eligible for GC only once the transaction commits or rolls back, or once entityManager.clear() is called. Flushing alone does not release them.
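To make the difference between flushing and clearing concrete, here is a toy model in plain Java. It is only a sketch of the behaviour described above and not how Hibernate is implemented: ToyPersistenceContext and all of its members are names made up for this illustration, and the "database" is just a row counter.

```java
import java.util.ArrayList;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;

// Toy stand-in for a persistence context. Hypothetical class, NOT Hibernate code.
class ToyPersistenceContext {
    // Strong references to every managed entity (plays the role of the first-level cache).
    private final Map<Object, Boolean> managed = new IdentityHashMap<>();
    // Changes that are queued but have not been sent to the "database" yet.
    private final List<Object> pending = new ArrayList<>();
    // Stands in for rows written by SQL statements.
    private int rowsWritten = 0;

    void persist(Object entity) {
        managed.put(entity, Boolean.TRUE);
        pending.add(entity);
    }

    // Sends the queued changes to the "database" but keeps every entity managed.
    void flush() {
        rowsWritten += pending.size();
        pending.clear();
    }

    // Drops all references held by the context; anything not flushed yet is lost.
    void clear() {
        managed.clear();
        pending.clear();
    }

    int managedCount() {
        return managed.size();
    }

    int rowsWritten() {
        return rowsWritten;
    }
}
```

In this model, managedCount() stays the same after flush() and drops to zero only after clear(), and calling clear() without a preceding flush() discards the queued writes. Only after clear() does the context stop holding references, so only then can the garbage collector reclaim the entities (provided nothing else references them).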
So, what is the best approach to processing a huge amount of data with Spring Data?
The general strategy for preventing OOM is to load and process the data batch by batch. At the end of each batch, flush and clear the entityManager so that it can release its managed entities for GC. The general code flow should be something like this:
import java.util.List;

import javax.persistence.EntityManager;
import javax.persistence.PersistenceContext;

import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.stereotype.Component;
import org.springframework.transaction.annotation.Transactional;

@Component
public class BatchProcessor {

    // Because of @Transactional, Spring ensures this entityManager is the
    // same one that is bound to the running transaction.
    @PersistenceContext
    private EntityManager em;

    @Autowired
    private FooRepository fooRepository;

    @Transactional
    public void startProcess() {
        processBatch(1, 100);
        processBatch(101, 200);
        processBatch(201, 300);
        // ... and so on for the remaining ID ranges
    }

    private void processBatch(int fromFooId, int toFooId) {
        // findFooIdBetween is assumed to be a query method declared on FooRepository.
        List<Foo> foos = fooRepository.findFooIdBetween(fromFooId, toFooId);
        for (Foo foo : foos) {
            // process a foo
        }

        // Flush first so that the pending update SQL is sent to the DB.
        // Otherwise the updates are lost when the entity manager is cleared.
        em.flush();
        em.clear();
    }
}
Note that this practice only prevents OOM; it is not aimed at high performance. So if performance is not your concern, you can safely use this strategy.
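The example above loads and updates existing rows, while the question is about saving incoming data. The same flush-and-clear rhythm applies there. Below is a minimal sketch of the loop in plain Java so that it can be tried without a database; BatchWriter and writeInBatches are hypothetical names invented for this illustration and are not part of Spring Data or JPA. The two callbacks are where the real EntityManager calls would go.

```java
import java.util.function.Consumer;

// Hypothetical helper for illustration; not part of Spring Data or JPA.
class BatchWriter {

    /**
     * Passes each item to persist and runs flushAndClear after every
     * batchSize items, plus once more for a trailing partial batch.
     * Returns how many times flushAndClear was run.
     */
    static <T> int writeInBatches(Iterable<T> items, int batchSize,
                                  Consumer<? super T> persist, Runnable flushAndClear) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        int inBatch = 0;
        int flushes = 0;
        for (T item : items) {
            persist.accept(item);
            if (++inBatch == batchSize) {
                flushAndClear.run();
                flushes++;
                inBatch = 0;
            }
        }
        // Do not leave a partial last batch unflushed.
        if (inBatch > 0) {
            flushAndClear.run();
            flushes++;
        }
        return flushes;
    }
}

// Assumed usage inside a @Transactional method with an injected EntityManager em
// (the batch size of 100 is an arbitrary choice):
//
//   BatchWriter.writeInBatches(incoming, 100, em::persist,
//           () -> { em.flush(); em.clear(); });
```

For this to actually bound memory, the items passed in should come from something that does not itself hold all the data at once (for example an iterator over the incoming stream), and the calling code should not keep its own references to entities it has already persisted.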
Source: https://stackoverflow.com/questions/54147047/how-spring-data-clean-persited-entities-in-transactional-method