Lock and transaction in postgres that should block a query

问题

Let's assume in SQL window 1 I do:

-- query 1 
BEGIN TRANSACTION;
UPDATE post SET title = 'edited' WHERE id = 1;
-- note that there is no explicit commit

Then from another window (window 2) I do:

-- query 2
SELECT * FROM post WHERE id = 1;

I get:

1 | original title

Which is fine as the default isolation level is READ COMMITTED and because query 1 is never committed, the change it performs is not readable until I explicitly commit from window 1.

In fact if I, in window 1, do:

COMMIT TRANSACTION;

I can then see the change if I re-run query 2.

1 | edited

My question is:

Why is query 2 returning fine the first time I run it? I was expecting it to block as the transaction in window 1 was not committed yet and the lock placed on row with id = 1 was (should be) an unreleased exclusive one that should block a read like the one performed in window 2. All the rest makes sense to me but I was expecting the SELECT to get stuck until an explicit commit in window 1 was executed.

回答1:

The behaviour you describe is normal and expected in any transactional relational database.

If PostgreSQL showed you the value edited for the first SELECT it'd be wrong to do so - that's called a "dirty read", and is bad news in databases.

PostgreSQL would be allowed to wait at the SELECT until you committed or rolled back, but it isn't required to by the SQL standard, you haven't told it you want to wait, and it doesn't have to wait for any technical reason, so it returns the data you asked for immediately. After all, until it's committed, that update only kind-of exists - it still might or might not happen.

If PostgreSQL always waited here, then you'd quickly land up with a situation where only one connection could be doing anything with the database at a time. Not pretty for performance, and totally unnecessary the vast majority of the time.

If you want to wait for a concurrent UPDATE (or DELETE), you'd use SELECT ... FOR SHARE. (But be aware that this won't work for INSERT).

Details:

SELECT without a FOR UPDATE or FOR SHARE clause does not take any row level locks. So it sees whatever is the current committed row, and is not affected by any in-flight transactions that might be modifying that row. The concepts are explained in the MVCC section of the docs. The general idea is that PostgreSQL is copy-on-write, with versioning that allows it to return the correct copy based on what the transaction or statement could "see" at the time it started - what PostgreSQL calls a "snapshot".

In the default READ COMMITTED isolation snapshots are taken at the statement level, so if you SELECT a row, COMMIT a change to it from another transaction, and SELECT it again you'll see different values even within one transation. You can use SNAPSHOT isolation if you don't want to see changes committed after the transaction begins, or SERIALIZABLE isolation to add further protection against certain kinds of transaction inter-dependencies.

See the transaction isolation chapter in the documentation.

If you want a SELECT to wait for in-progress transactions to commit or rollback changes to rows being selected, you must use SELECT ... FOR SHARE. This will block on the lock taken by an UPDATE or DELETE until the transaction that took the lock rolls back or commits.

INSERT is different, though - the tuples just don't exist to other transactions until commit. The only way to wait for concurrent INSERTs is to take an EXCLUSIVE table-level lock, so you know nobody else is changing the table while you read it. Usually the need to do that means you have a design problem in the application though - your app should not care if there are uncommitted inserts still in flight.

See the explicit locking chapter of the documentation.

回答2:

In PostgreSQL's MVCC implementation, the principle is reading does not block writing and vice-versa. Per documentation:

The main advantage of using the MVCC model of concurrency control rather than locking is that in MVCC locks acquired for querying (reading) data do not conflict with locks acquired for writing data, and so reading never blocks writing and writing never blocks reading. PostgreSQL maintains this guarantee even when providing the strictest level of transaction isolation through the use of an innovative Serializable Snapshot Isolation (SSI) level.

Each transaction only sees (mostly) what has been committed before the transaction began.

That does not mean there'd be no locking. Not at all. For many operations various kinds of locks are acquired. And various strategies are applied to resolve possible conflicts.

来源：https://stackoverflow.com/questions/22945742/lock-and-transaction-in-postgres-that-should-block-a-query

标签

postgresql

concurrency

locking

mvcc