Question
I have been using Spring for the last few months and I have a question about transactions. I have a Java method in my Spring Batch job that first runs a select to get the first 100 rows with status 'NOT COMPLETED' and then updates the selected rows to change their status to 'IN PROGRESS'. Since I'm processing around 10 million records, I want to run multiple instances of my batch job, each with multiple threads. Within a single instance, I have made the method synchronized to make sure two threads do not fetch the same set of records. But if I run multiple instances of my batch job (multiple JVMs), there is a high probability that the same set of records will be fetched by both instances, even if I use "optimistic" or "pessimistic" locking or "select for update", since we cannot lock records during selection. An example is shown below. Transaction 1 fetches 100 records and, meanwhile, Transaction 2 also fetches 100 records. If I enable locking, Transaction 2 waits until Transaction 1 has updated and committed, but Transaction 2 then performs the same update again.
Is there any way in Spring to make Transaction 2's select wait until Transaction 1's select has completed?
Transaction 1            Transaction 2
fetch 100 records
                         fetch 100 records
update 100 records
commit
                         update 100 records
                         commit
@Transactional
public synchronized List<Student> processStudentRecords() {
    // Select the next batch of rows with status 'NOT COMPLETED'
    List<Student> students = getNotCompletedRecords();
    if (students != null && !students.isEmpty()) {
        // Mark the selected rows as 'IN PROGRESS'
        updateStatusToInProgress(students);
    }
    return students;
}
Note: I cannot perform the update first and then the select. I would appreciate any suggestions for an alternative approach.
Answer 1:
Transaction synchronization should be left to the database server and not managed at the application level. From the database server's point of view, no matter how many JVMs (or threads) you have, they are all concurrent database clients asking for read/write operations. You should not have to deal with such concerns yourself.
What you should do though is try to minimize contention as much as possible in the design of your solution, for example, by using the (remote) partitioning technique.
> if I run multiple instances of my batch job (multiple JVMs), there is high probability that same set of records might be fetched by both the instances even if I use "optimistic" or "pessimistic lock" or "select for update" since we cannot lock records during selection
Partitioning the data will, by design, remove all these problems. If you give each instance its own set of data to work on, there is no chance that a worker will select the same set of records as another worker. Michael gave a detailed example in this answer: https://stackoverflow.com/a/54889092/5019386.
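To make the idea concrete, here is a minimal sketch of logical partitioning, assuming each row has a numeric id and each worker knows its own index and the total worker count. The class and method names are hypothetical and an in-memory list stands in for the table. It is not taken from the linked answer.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration: none of these names come from the original post.
class LogicalPartitioningSketch {

    // A worker owns a row when id mod workerCount equals its 0-based index.
    // In SQL this could be a predicate along the lines of
    //   WHERE status = 'NOT COMPLETED' AND MOD(id, :workerCount) = :workerIndex
    static boolean ownsRow(long id, int workerCount, int workerIndex) {
        return Math.floorMod(id, (long) workerCount) == workerIndex;
    }

    // Returns the ids that the given worker is allowed to select.
    static List<Long> rowsFor(List<Long> allIds, int workerCount, int workerIndex) {
        List<Long> owned = new ArrayList<>();
        for (long id : allIds) {
            if (ownsRow(id, workerCount, workerIndex)) {
                owned.add(id);
            }
        }
        return owned;
    }

    public static void main(String[] args) {
        List<Long> ids = new ArrayList<>();
        for (long id = 1; id <= 20; id++) {
            ids.add(id);
        }
        int workerCount = 3;
        Set<Long> seen = new HashSet<>();
        for (int w = 0; w < workerCount; w++) {
            List<Long> owned = rowsFor(ids, workerCount, w);
            for (Long id : owned) {
                // Fails if two workers were ever given the same row.
                if (!seen.add(id)) {
                    throw new IllegalStateException("row " + id + " assigned twice");
                }
            }
            System.out.println("worker " + w + " -> " + owned);
        }
    }
}
```

Because the ownership rule depends only on the row id, two workers can never pick the same row, whether they are threads in one JVM or separate JVMs, and no application-level lock is involved.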
(Logical) partitioning, however, will not solve the contention problem, since all workers still read from and write to the same table, but that's the nature of the problem you are trying to solve. What I'm saying is that you don't need to lock and unlock the table in your design; leave this to the database. Some database servers, like Oracle, can write data of the same table to different partitions on disk to optimize concurrent access (which might help if you use partitioning), but again, that's Oracle's business, not the business of Spring or any other framework.
Not everybody can afford Oracle so I would look for a solution at the conceptual level. I have successfully used the following solution ("Pseudo" physical partitioning) to a problem similar to yours:
- Step 1 (in serial): copy/partition unprocessed data to temporary tables
- Step 2 (in parallel): run multiple workers on these tables instead of the source table with millions of rows.
- Step 3 (in serial): copy/update processed data back to the original table
Step 2 removes the contention problem, because each worker reads and writes only its own temporary table. Usually, the cost of Step 1 plus Step 3 is negligible compared to Step 2 (and even more so if Step 2 is done in serial). This works well if the processing is the bottleneck.
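The three steps can be sketched roughly as follows. This is an in-memory illustration only: lists and maps stand in for the source and temporary tables, the "processing" is a placeholder that sets a status, and all class and method names are hypothetical. It assumes at least one partition.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical in-memory illustration; lists and maps stand in for database tables.
class StagedPartitioningSketch {

    // Step 1 (serial): distribute unprocessed ids round-robin over staging lists ("temporary tables").
    static List<List<Long>> copyToStaging(List<Long> unprocessedIds, int partitionCount) {
        List<List<Long>> staging = new ArrayList<>();
        for (int p = 0; p < partitionCount; p++) {
            staging.add(new ArrayList<>());
        }
        for (int i = 0; i < unprocessedIds.size(); i++) {
            staging.get(i % partitionCount).add(unprocessedIds.get(i));
        }
        return staging;
    }

    // Step 2 (parallel): one worker per staging list; no worker touches another worker's list.
    static List<Map<Long, String>> processInParallel(List<List<Long>> staging) {
        ExecutorService pool = Executors.newFixedThreadPool(staging.size());
        try {
            List<Future<Map<Long, String>>> futures = new ArrayList<>();
            for (List<Long> partition : staging) {
                Callable<Map<Long, String>> task = () -> {
                    Map<Long, String> result = new HashMap<>();
                    for (Long id : partition) {
                        result.put(id, "COMPLETED"); // placeholder for the real processing
                    }
                    return result;
                };
                futures.add(pool.submit(task));
            }
            List<Map<Long, String>> processed = new ArrayList<>();
            for (Future<Map<Long, String>> future : futures) {
                processed.add(future.get()); // wait for each worker to finish
            }
            return processed;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("worker failed", e);
        } finally {
            pool.shutdown();
        }
    }

    // Step 3 (serial): merge the processed results back into a single "source table".
    static Map<Long, String> copyBack(List<Map<Long, String>> processed) {
        Map<Long, String> source = new TreeMap<>();
        for (Map<Long, String> part : processed) {
            source.putAll(part);
        }
        return source;
    }

    static Map<Long, String> run(List<Long> unprocessedIds, int partitionCount) {
        return copyBack(processInParallel(copyToStaging(unprocessedIds, partitionCount)));
    }

    public static void main(String[] args) {
        List<Long> ids = new ArrayList<>();
        for (long id = 1; id <= 10; id++) {
            ids.add(id);
        }
        System.out.println(run(ids, 3));
    }
}
```

In a real job, Step 1 and Step 3 would be bulk SQL statements against the source table, and each worker in Step 2 would query only its own temporary table.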
Hope this helps.
Source: https://stackoverflow.com/questions/54908815/how-to-lock-on-select-and-release-lock-after-update-is-committed-using-spring