Question
I have been using the Spring framework for the last few months. I have a question about how the transaction manager works in the scenario below.
Scenario: I'm working on a Spring Batch job in which the ItemReader calls the method below multiple times. This method fetches the list of records from the 'STUDENTS' table that are in status 'NOT COMPLETED' and updates their status to 'IN PROGRESS'. I'm processing 10 million records, so I'm planning to run my batch process using multiple threads and multiple JVMs.
Solution implemented so far: I have made this method synchronized so that only a single thread fetches records at a given time and no two threads try to fetch the same 'NOT COMPLETED' records. I also added @Transactional so that, if any issue occurs in this method, Spring will roll back the changes.
Problem: How does transaction management work with multiple JVMs accessing one database? If I'm running 2-3 instances of my application, how do I make sure that these instances are not trying to fetch the same records with status 'NOT COMPLETED'? Does Spring have this feature?
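    @Transactional
    public synchronized List<Student> processStudentRecords() {
        // Fetch the records that are still in status 'NOT COMPLETED'.
        List<Student> students = getNotCompletedRecords();
        if (students != null && !students.isEmpty()) {
            // Mark the fetched records as 'IN PROGRESS'.
            updateStatusToInProgress(students);
        }
        return students;
    }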
Answer 1:
To address your questions specifically:
- Don't use @Transactional on components used by Spring Batch. Spring Batch manages the transactions, and use of that annotation will cause issues.
- To use a process indicator flag (as you are proposing), you need to do the update before the select. The select should query only the records available for processing that have been tagged.
The above, however, does not answer the issue of multiple JVMs. I'm assuming that you are using remote partitioning of some kind in your processing across multiple JVMs. If that is the case, your update query would be something like

    UPDATE STUDENTS SET FLAG = 'NOT COMPLETE' WHERE ID > ? AND FLAG IS NULL LIMIT 100

where ID is the start of the range of the partition and 100 is your chunk size.
Once you have tagged your rows, your ItemReader can query via something like

    SELECT * FROM STUDENTS WHERE ID > ? AND FLAG = 'NOT COMPLETE'

where the ID is the start of the range of your partition.
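
To make the order of operations concrete, here is a minimal sketch that mirrors the two queries above against an in-memory map instead of a real database. Everything in it (`TagThenReadSketch`, `tagChunk`, `readTagged`, the map standing in for the table) is hypothetical and only illustrates "tag first, then read what was tagged". It is not Spring Batch API, and it does not reproduce the locking a real UPDATE would provide.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical in-memory stand-in for the table: ID -> FLAG (null = not yet tagged).
final class TagThenReadSketch {

    // Flag value taken from the queries in the answer.
    static final String TAGGED = "NOT COMPLETE";

    private final TreeMap<Long, String> flagsById = new TreeMap<>();

    TagThenReadSketch(long firstId, long lastId) {
        for (long id = firstId; id <= lastId; id++) {
            flagsById.put(id, null);
        }
    }

    // Mirrors the UPDATE: tag up to chunkSize untagged rows with ID > afterId.
    int tagChunk(long afterId, int chunkSize) {
        int tagged = 0;
        for (Map.Entry<Long, String> row : flagsById.tailMap(afterId, false).entrySet()) {
            if (tagged == chunkSize) {
                break;
            }
            if (row.getValue() == null) {
                row.setValue(TAGGED);
                tagged++;
            }
        }
        return tagged;
    }

    // Mirrors the SELECT: return the IDs > afterId that carry the tag.
    List<Long> readTagged(long afterId) {
        List<Long> ids = new ArrayList<>();
        for (Map.Entry<Long, String> row : flagsById.tailMap(afterId, false).entrySet()) {
            if (TAGGED.equals(row.getValue())) {
                ids.add(row.getKey());
            }
        }
        return ids;
    }
}
```

With rows 1 to 10, calling `tagChunk(0, 3)` and then `readTagged(0)` yields IDs 1, 2 and 3. The queries as written carry only a lower bound on ID. Whether an upper bound is also needed depends on how the partitions are defined, which is left open here.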
The above technique allows you to scale to multiple JVMs while preserving things like restartability.
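
How each partition gets the ID that starts its range is not spelled out above. The following is a minimal sketch, assuming numeric and reasonably dense IDs, of splitting an ID range into contiguous, non-overlapping partitions. `PartitionRanges`, `Range` and `split` are hypothetical names for illustration. In a real job this logic would typically live in an implementation of Spring Batch's `Partitioner` interface, which is not shown here.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: splits an inclusive ID range into per-partition ranges.
final class PartitionRanges {

    // Inclusive ID range handled by one partition.
    static final class Range {
        final long minId;
        final long maxId;

        Range(long minId, long maxId) {
            this.minId = minId;
            this.maxId = maxId;
        }

        @Override
        public String toString() {
            return "[" + minId + ", " + maxId + "]";
        }
    }

    // Splits [minId, maxId] into at most gridSize contiguous, non-overlapping ranges.
    static List<Range> split(long minId, long maxId, int gridSize) {
        if (gridSize <= 0 || maxId < minId) {
            throw new IllegalArgumentException("gridSize must be > 0 and maxId >= minId");
        }
        long total = maxId - minId + 1;
        long size = (total + gridSize - 1) / gridSize; // ceiling division
        List<Range> ranges = new ArrayList<>();
        for (long start = minId; start <= maxId; start += size) {
            ranges.add(new Range(start, Math.min(start + size - 1, maxId)));
        }
        return ranges;
    }
}
```

For example, `PartitionRanges.split(1, 10, 3)` produces the ranges [1, 4], [5, 8] and [9, 10]. Because the queries above use `ID > ?`, which excludes the bound itself, a partition would pass `minId - 1` as the parameter, or the comparison could be changed to `>=`.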
Source: https://stackoverflow.com/questions/54888037/how-to-enable-transaction-management-in-spring-application-running-on-multiple-j