In terms of actual low level atomic instructions and memory fences (I assume they\'re used), how do you implement STM?
The part that\'s mysterious to me is that given s
Some papers can be really difficult to read but two which are very clear and concise are:
Transactional Locking II, Dave Dice, Ori Shalev, Nir Shavit which describes the "TL2" STM algorithm in terms of any language.
Deuce: Noninvasive Software Transactional Memory in Java, Guy Korland, Nir Shavit, Pascal Felber which explains a load time class transformer which transforms regular java classes into in-memory classes which have additional bytecode to do STM. This is relevant to the question as the paper explains how code without STM can be mechanically transformed into code which is doing STM in any OO language.
The Deuce framework lets you plugin the actual algorithm you wish to use; including the TL2 algorithm. As the Deuce framework is Java the following discussion uses java terminology but is only assuming that you are writing in an OO language.
Below will outline the TL2 approach. The papers have links to alternative approaches but studying one algorithm answers a lot of questions.
how do you implement STM? The part that's mysterious to me is that given some arbitrary chunk of code, you need a way to go back afterward and determine if the values used in each step were valid.
One short answer for how TL2 does STM is "bookkeeping" and then only using write locks at commit time. Read the paper for the fine detail but a board brush outline is as follows. Every property that you can use in the original code would have a getter and setter. In the transformed code there would also be a version number of the property and additional code added to the getter and setter. You need to record the version of every attribute you read within the transaction as the transaction "read-set". You can do this by having every getter add the version of the attribute seen into a threadlocal linkedlist. You also need to buffer the writes as the "write-set" in a threadlocal until you commit. Note that the getter methods need to check and return a threadlocal write-set value for a given field if you have one. That way you see your uncommitted writes in your reads but no other thread is going to see them until you commit.
At commit time you take write locks against every attribute you are about to write. Whilst you have the locks you double check that your read-set is still valid; that the attributes you read in your transaction have not been updated to a higher version by another transaction. If so then your business logic may not be valid as you can have inconsistent reads so you need to rollback the whole transaction. If the final checks pass then you commit by flushing your write-set, bump the versions of those attributes, release the write locks, and final clear both the write-set and read-set threadlocal lists.
The paper explains that the algorithm can abort early if it detects that an attribute being read has been written to since the tx started. The paper has some neat tricks to speed up read-only transactions. It even has a trick to work out which blocks are read-only and which are read-write. Anyone expressing an interest in such things really should enjoy the two papers.
The Deuce framework in the paper above shows how to change all your getters and setters by injecting new java bytecode into your classes at load time. Other languages could have a special compiler or preprocessor which perform the same mechanical transformation of normal code into STM enabled code. Specifically with Deuce your source code objects can have simple getter setter pairs but transformed classes at runtime have enriched getter setters which do the bookwork.
Transforming regular code into STM code (particularly at runtime) is interesting but if you need to actually write a STM enabled data structure you don't need any magic sauce. Instead just create a Ref
class with a get()
and a set(x)
and make every single relationship between your data structure objects made up of Ref
handles. In the get
and set
of your Ref
class you can do the threadlocal bookwork. Then you can implement a simple version of "TL2" or any other algorithm which works well for your data structures and your concurrency of read versus write.
This would also seem to suggest that just like any other 'locking' solution you want to keep your critical sections as small as possible (to decrease the probability of a conflict), am I right?
TL2 has a critical period in holding the write locks then doing final checks and writes which is easy to comprehend and optimise without any understanding of the application business logic. If you assign each number a unique property you can trivially avoid deadlock by taking locks in ascending order. It is important to note that all your business logic is done speculatively assuming that the commit checks will pass. You don't hold locks whilst you are doing arbitrary slow business logic. You can be doing multiple web service lookups or slow database calls and you won't take any locks until the commit. Clearly the professionals are going to tune the heck out of the generic critical section.
The paper makes it clear that the algorithm may be aborting more often that the specific business logic requires. The generic algorithm does not know whether specific dirty reads would not affect the actual write outcome. Handwritten logic which understands the actual business logic could know special cases when a rollback is not needed for a given sets of dirty reads. If however you have lots of code to write and an application where the likelihood of rollback is very low a generic mechanical STM approach may lead to less bugs and perform well.
Also, can STM simply detect "another thread entered this area while the computation was executing, therefore the computation is invalid" or can it actually detect whether clobbered values were used (and thus by luck sometimes two threads may execute the same critical section simultaneously without need for rollback)?
The TL2 approach as all about the data read or written not about the code which does it. It is what you get and set and which counts; and whether any other thread trod on your toes before you flush all the writes. All that is required of the code is that you have a begin()
, commit()
and rollback()
in the business logic to start, end and abort the transaction. Even that can be generated code. With Java you could mark your methods with the @Transactional annotation on methods then generate code which wrap your method invocations in a try/catch/finally that does the begin/commit/rollback as idiomic java. Deuce injects such logic at class load time.
Once again you don't need such magic sauce to begin/commit/rollback in your own STM enabled data structures. You can be explicit and put in all that right into your data structure logic code to create your own explicitly STM enabled classes in any OO language.