I understand the differences between optimistic and pessimistic locking. Now could someone explain to me when I would use either one in general?
And does the answer
Optimistic Locking is a strategy where you read a record, take note of a version number (other methods to do this involve dates, timestamps or checksums/hashes) and check that the version hasn't changed before you write the record back. When you write the record back you filter the update on the version to make sure it's atomic. (i.e. hasn't been updated between when you check the version and write the record to the disk) and update the version in one hit.
If the record is dirty (i.e. different version to yours) you abort the transaction and the user can re-start it.
This strategy is most applicable to high-volume systems and three-tier architectures where you do not necessarily maintain a connection to the database for your session. In this situation the client cannot actually maintain database locks as the connections are taken from a pool and you may not be using the same connection from one access to the next.
Pessimistic Locking is when you lock the record for your exclusive use until you have finished with it. It has much better integrity than optimistic locking but requires you to be careful with your application design to avoid Deadlocks. To use pessimistic locking you need either a direct connection to the database (as would typically be the case in a two tier client server application) or an externally available transaction ID that can be used independently of the connection.
In the latter case you open the transaction with the TxID and then reconnect using that ID. The DBMS maintains the locks and allows you to pick the session back up through the TxID. This is how distributed transactions using two-phase commit protocols (such as XA or COM+ Transactions) work.
One use case for optimistic locking is to have your application use the database to allow one of your threads / hosts to 'claim' a task. This is a technique that has come in handy for me on a regular basis.
The best example I can think of is for a task queue implemented using a database, with multiple threads claiming tasks concurrently. If a task has status 'Available', 'Claimed', 'Completed', a db query can say something like "Set status='Claimed' where status='Available'. If multiple threads try to change the status in this way, all but the first thread will fail because of dirty data.
Note that this is a use case involving only optimistic locking. So as an alternative to saying "Optimistic locking is used when you don't expect many collisions", it can also be used where you expect collisions but want exactly one transaction to succeed.
Optimistic assumes that nothing's going to change while you're reading it.
Pessimistic assumes that something will and so locks it.
If it's not essential that the data is perfectly read use optimistic. You might get the odd 'dirty' read - but it's far less likely to result in deadlocks and the like.
Most web applications are fine with dirty reads - on the rare occasion the data doesn't exactly tally the next reload does.
For exact data operations (like in many financial transactions) use pessimistic. It's essential that the data is accurately read, with no un-shown changes - the extra locking overhead is worth it.
Oh, and Microsoft SQL server defaults to page locking - basically the row you're reading and a few either side. Row locking is more accurate but much slower. It's often worth setting your transactions to read-committed or no-lock to avoid deadlocks while reading.
In most cases, optimistic locking is more efficient and offers higher performance. When choosing between pessimistic and optimistic locking, consider the following:
Pessimistic locking is useful if there are a lot of updates and relatively high chances of users trying to update data at the same time. For example, if each operation can update a large number of records at a time (the bank might add interest earnings to every account at the end of each month), and two applications are running such operations at the same time, they will have conflicts.
Pessimistic locking is also more appropriate in applications that contain small tables that are frequently updated. In the case of these so-called hotspots, conflicts are so probable that optimistic locking wastes effort in rolling back conflicting transactions.
Optimistic locking is useful if the possibility for conflicts is very low – there are many records but relatively few users, or very few updates and mostly read-type operations.