Do MVCC databases see inserted rows in mid-transaction?

Question

Do MVCC database isolation modes allow in-progress transactions to see rows inserted (and committed) by other transactions?

For example, given:

Table names[id BIGINT NOT NULL, name VARCHAR(30), PRIMARY KEY(id), UNIQUE(name)]
Transactions T1 and T2,

T1: open transaction
T2: open transaction
T1: select * from names;
    insert into names(name) values('John');
    // do something
    commit;
T2: select * from names;
    insert into names(name) values('John');
    // do something
    commit;

When does T2 first become aware of the new row? At select time? At insert time? Or at commit time?

Answer 1:

The answer really depends on the server implementation and on whether the unique constraint is marked deferrable.

I have not tested other databases, but in PostgreSQL (one of the most prominent open-source MVCC databases), in my test replicating your setup, T2 fails on INSERT. However, T2's SELECT cannot see any changes made by T1.

I executed the following statements at almost the same time in 2 separate SQL connections:

BEGIN;
SELECT * FROM names;
SELECT pg_sleep(10);
INSERT INTO names values('john');
SELECT pg_sleep(10);
COMMIT;

One succeeded, but the other failed after 10 seconds with:

ERROR:  duplicate key value violates unique constraint "names_pkey"
DETAIL:  Key (name)=(john) already exists.

This makes sense, because the documentation says:

If a conflicting row has been inserted by an as-yet-uncommitted transaction, the would-be inserter must wait to see if that transaction commits. If it rolls back then there is no conflict. If it commits without deleting the conflicting row again, there is a uniqueness violation.

If, however, the unique constraint is marked deferrable, the uniqueness check is postponed until the end of the statement or, if the constraint is actually deferred, until COMMIT time:

If the unique constraint is deferrable, there is additional complexity: we need to be able to insert an index entry for a new row, but defer any uniqueness-violation error until end of statement or even later.
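To make the timing difference concrete, here is a minimal Python sketch of an immediate versus a deferred uniqueness check. Everything in it (`ToyUniqueIndex`, `UniqueViolation`, `first_failure`) is a hypothetical toy, not PostgreSQL's implementation. It models only the deferred-to-commit case, and it does not model waiting on another transaction that is still uncommitted.

```python
class UniqueViolation(Exception):
    """Raised when a key would appear twice under the unique constraint."""


class ToyUniqueIndex:
    """Hypothetical model of one unique constraint seen by one transaction."""

    def __init__(self, committed_keys, deferred=False):
        self.committed = set(committed_keys)  # keys already committed by others
        self.pending = []                     # keys inserted by this transaction
        self.deferred = deferred              # True: check only at commit

    def _check(self):
        seen = set(self.committed)
        for key in self.pending:
            if key in seen:
                raise UniqueViolation(key)
            seen.add(key)

    def insert(self, key):
        self.pending.append(key)
        if not self.deferred:
            self._check()  # immediate constraint: fail right at the INSERT

    def commit(self):
        self._check()      # deferred constraint: the violation surfaces here
        self.committed.update(self.pending)
        self.pending.clear()


def first_failure(deferred):
    """Return the step at which inserting an existing key fails."""
    idx = ToyUniqueIndex({"john"}, deferred=deferred)
    try:
        idx.insert("john")
    except UniqueViolation:
        return "insert"
    try:
        idx.commit()
    except UniqueViolation:
        return "commit"
    return "none"


print(first_failure(deferred=False))  # insert
print(first_failure(deferred=True))   # commit
```

In this toy the duplicate is rejected either way; the flag only changes when the error is raised.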

Answer 2:

No, it shows you a snapshot of the database. No new rows (phantom reads) will show up. No matter what happens, the snapshot stays the same.

This is usually implemented by marking inserted rows with a timestamp and, when reading, silently discarding rows that were inserted after the start of the transaction.

T2, in your example, never becomes aware of the new rows because after the commit the old transaction is finished. Only a new transaction would see the rows inserted (in this case, "T3").
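The timestamp scheme described above can be sketched in a few lines of Python. This is only a toy under stated assumptions: `ToyStore`, `Row`, and the single shared counter used as a clock are hypothetical, and real engines track transaction IDs and commit status rather than a simple number.

```python
from dataclasses import dataclass, field
from itertools import count


@dataclass
class Row:
    value: str
    created_at: int  # timestamp assigned when the inserting transaction commits


@dataclass
class ToyStore:
    """Hypothetical in-memory store; not how a real engine lays out data."""
    rows: list = field(default_factory=list)
    clock: count = field(default_factory=lambda: count(1))

    def begin(self) -> int:
        # A transaction's snapshot is simply the clock value at its start.
        return next(self.clock)

    def insert_and_commit(self, value: str) -> None:
        # Stamp the new row with a fresh timestamp at commit.
        self.rows.append(Row(value, next(self.clock)))

    def select(self, snapshot: int) -> list:
        # Silently skip rows committed after the snapshot was taken.
        return [r.value for r in self.rows if r.created_at <= snapshot]


store = ToyStore()
store.insert_and_commit("Alice")  # timestamp 1
t2 = store.begin()                # snapshot 2
store.insert_and_commit("John")   # timestamp 3, after T2 started
t3 = store.begin()                # snapshot 4

print(store.select(t2))  # ['Alice']
print(store.select(t3))  # ['Alice', 'John']
```

The later transaction (`t3`) sees "John" because its snapshot was taken after that row's timestamp, while `t2` keeps filtering it out.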

Answer 3:

This depends on the transaction isolation level; the SQL standard specifies four levels, which apply to any database, MVCC or not. They are (in order of increasing strictness):

Read uncommitted - essentially, no isolation at all. This is not an interesting case to dwell on, but T2 will obviously see the new row at SELECT time in this case.
Read committed - cannot read uncommitted updates. This is the Postgres default. In this case T2 will see the new row at SELECT time since T1 has already committed.
Repeatable read - reads never see other concurrent updates[1], even if committed. This is the MySQL+InnoDB default. In this case T2 should fail at COMMIT time (with a serialization error), although the engine can know ahead of time that the INSERT cannot succeed and fail early at INSERT time, depending on whether the uniqueness constraint is deferred to commit time or not.
Serializable - like repeatable read, but transactions have to logically behave as if they executed one after the other. Same behavior in this case as repeatable read.

The interesting observation here is that at Postgres's default transaction isolation level, T2 can see T1's changes after it commits. This is probably counter-intuitive to most people.
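The difference between read committed and repeatable read can be illustrated with a small Python toy. The names `ToyDB`, `Txn`, and `run` are hypothetical. The sketch also assumes that the repeatable-read snapshot is taken at the transaction's first statement rather than when it is opened, so T2 issues one SELECT before T1 commits and one after.

```python
class ToyDB:
    """Hypothetical toy: committed rows tagged with a commit sequence number."""

    def __init__(self):
        self.committed = []  # list of (commit_seq, value)
        self.seq = 0

    def commit_insert(self, value):
        # Another transaction inserts a row and commits it.
        self.seq += 1
        self.committed.append((self.seq, value))


class Txn:
    def __init__(self, db, level):
        self.db = db
        self.level = level    # "read committed" or "repeatable read"
        self.snapshot = None  # assumption: taken at the first statement

    def select(self):
        if self.level == "read committed" or self.snapshot is None:
            # Read committed: a fresh snapshot for every statement.
            # Repeatable read: taken once, then reused for later statements.
            self.snapshot = self.db.seq
        return [v for s, v in self.db.committed if s <= self.snapshot]


def run(level):
    db = ToyDB()
    db.commit_insert("Alice")
    t2 = Txn(db, level)
    before = t2.select()      # T2's first statement
    db.commit_insert("John")  # T1 inserts and commits
    after = t2.select()       # T2 reads again in the same transaction
    return before, after


print(run("read committed"))   # (['Alice'], ['Alice', 'John'])
print(run("repeatable read"))  # (['Alice'], ['Alice'])
```

Under read committed the second SELECT picks up the row committed in between; under repeatable read both SELECTs return the same result.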

(Note that in mvp's reproduction T2's SELECT does not see the row and the failure surfaces at INSERT time, even though I said that under read committed T2 sees the row at SELECT time; this is because that test interleaves T2's SELECT before T1's COMMIT, which is not the interleaving presented in the question.)

[1] Technically the standard allows phantom reads at this isolation level, in which recently committed inserts can be seen, but no MVCC implementation I know of actually allows this in practice.

Source: https://stackoverflow.com/questions/13674104/do-mvcc-databases-see-inserted-rows-in-mid-transaction

Tags

sql

mvcc