I'm wondering about an update I am making to a large table, and whether I need to worry about locks.
I have a table looking like this:
CREATE TABLE
You are missing a couple of things.
First, PostgreSQL does not offer a LIMIT option for UPDATE. See the docs for UPDATE.
Second, note that ROW EXCLUSIVE does not conflict with itself. It conflicts with SHARE ROW EXCLUSIVE (and with SHARE, EXCLUSIVE and ACCESS EXCLUSIVE), which is a different lock mode. So your UPDATE statements can safely run concurrently from multiple workers. You will still want to keep your update times low, but you already have a built-in way to tune that: lower your batchSize if you run into problems.
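If you want to see this for yourself, a query along these lines against the pg_locks system view should show each worker holding its own table-level lock at the same time. This is only a sketch; it assumes the table is called thetable, the name used elsewhere in this thread.

```sql
-- Run in a separate session while the workers are updating.
-- Assumes the table is named thetable.
SELECT l.pid,      -- backend process holding or awaiting the lock
       l.mode,     -- e.g. RowExclusiveLock for a running UPDATE
       l.granted   -- true if held, false if still waiting
FROM pg_locks AS l
WHERE l.locktype = 'relation'
  AND l.relation = 'thetable'::regclass;
```

Several rows with mode RowExclusiveLock and granted = true mean the workers are not blocking each other at the table level.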
UPDATE locks the row, so you do not need to lock it first. If you try to UPDATE overlapping sets of rows simultaneously, the second UPDATE will wait for the first's transaction to commit or roll back.
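A rough way to watch the row-level wait is to open two psql sessions and run overlapping updates. The processed column below is hypothetical (the real table definition is not shown here), and the id ranges are only illustrative.

```sql
-- Session A
BEGIN;
UPDATE thetable SET processed = true WHERE id BETWEEN 1 AND 200;
-- transaction intentionally left open

-- Session B
BEGIN;
UPDATE thetable SET processed = true WHERE id BETWEEN 150 AND 350;
-- blocks once it reaches a row that session A has locked

-- Session A
COMMIT;
-- session B's UPDATE can now proceed

-- Session B
COMMIT;
```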
The big problem with your approach, other than the fact that UPDATE doesn't have a LIMIT clause, is that multiple workers will all try to grab the same rows. Here's what happens:
... and repeat!
You need to either:
As for LIMIT, you could use WHERE id IN (SELECT t.id FROM thetable t ORDER BY id LIMIT 200) (note that ORDER BY has to come before LIMIT), but you'd have the same problem with both workers choosing the same set of rows to update.
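Written out in full, the subquery form might look something like the first statement below. The processed column is a hypothetical stand-in for whatever marks a row as done. The second statement is one possible way around the overlap on PostgreSQL 9.5 and later, using FOR UPDATE SKIP LOCKED; treat it as a sketch to test against your own workload rather than a drop-in fix.

```sql
-- thetable and id come from this thread; "processed" is a hypothetical column.
-- Plain batch: two workers running this at once can pick the same 200 ids.
UPDATE thetable
SET processed = true
WHERE id IN (
    SELECT t.id
    FROM thetable t
    WHERE NOT t.processed
    ORDER BY t.id
    LIMIT 200
);

-- Variant (PostgreSQL 9.5+): pass over rows another transaction has locked,
-- so concurrent workers tend to pick up different batches.
UPDATE thetable
SET processed = true
WHERE id IN (
    SELECT t.id
    FROM thetable t
    WHERE NOT t.processed
    ORDER BY t.id
    LIMIT 200
    FOR UPDATE SKIP LOCKED
);
```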