Can anyone help explain what the transaction isolation level is in Google Cloud BigQuery? There do not appear to be any documents on this. We know that in other databases, e.g. s
There is not much information about it, but in this migration guide we can find some comparisons between BigQuery and Teradata, which can give us a clue.
As the guide says, BigQuery uses optimistic concurrency control, which takes no locks and assumes that multiple transactions can frequently complete without interfering with each other. When multiple DML transactions are performed on the same table at the same time, BigQuery uses pessimistic concurrency control, which does take locks. To ensure isolation, BigQuery uses snapshot isolation.
BigQuery helps ensure optimistic concurrency control (first to commit wins) with snapshot isolation, in which a query reads the last committed data before the query starts. This approach guarantees the same level of consistency on a per-row, per-mutation basis and across rows within the same DML statement, yet avoids deadlocks. In the case of multiple DML updates against the same table, BigQuery switches to pessimistic concurrency control. Load jobs can run completely independently and append to tables. However, BigQuery does not yet provide an explicit transaction boundary or session.
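To make "first to commit wins" with snapshot isolation more concrete, here is a minimal toy model in Python. It is only a sketch of the general technique, not BigQuery's actual implementation, which is not public; `ToyStore`, `ToyTransaction` and `ConflictError` are hypothetical names invented for this illustration.

```python
class ConflictError(Exception):
    """Raised when a transaction loses the first-to-commit race."""


class ToyStore:
    """Hypothetical in-memory store; NOT how BigQuery is implemented."""

    def __init__(self):
        self.version = 0   # global commit counter
        self.data = {}     # key -> (value, version of the commit that wrote it)

    def begin(self):
        return ToyTransaction(self)


class ToyTransaction:
    def __init__(self, store):
        self.store = store
        self.start_version = store.version
        # Snapshot: the last committed values at the moment we start.
        self.snapshot = {k: v for k, (v, _) in store.data.items()}
        self.writes = {}   # buffered writes, invisible to others until commit

    def read(self, key):
        # A transaction sees its own writes, otherwise its snapshot.
        if key in self.writes:
            return self.writes[key]
        return self.snapshot.get(key)

    def write(self, key, value):
        self.writes[key] = value

    def commit(self):
        # Optimistic check: no locks were taken, so validate at commit time.
        for key in self.writes:
            _, committed_at = self.store.data.get(key, (None, 0))
            if committed_at > self.start_version:
                raise ConflictError(f"{key!r} was changed by an earlier commit")
        self.store.version += 1
        for key, value in self.writes.items():
            self.store.data[key] = (value, self.store.version)


store = ToyStore()
setup = store.begin()
setup.write("balance", 100)
setup.commit()

# Two concurrent transactions start from the same snapshot.
t1 = store.begin()
t2 = store.begin()
t1.write("balance", t1.read("balance") + 10)
t2.write("balance", t2.read("balance") - 30)

t1.commit()  # first to commit wins
try:
    t2.commit()
    t2_outcome = "committed"
except ConflictError:
    t2_outcome = "aborted"

final_balance = store.data["balance"][0]
```

In this toy model nobody ever waits on a lock, which is why deadlocks cannot occur; the cost is that the losing transaction has to be retried against a fresh snapshot.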
It's also said in the guide that:
Because there is no explicit transaction boundary in BigQuery yet, there is also no concept of an explicit rollback in BigQuery. The workarounds are table decorators or using FOR SYSTEM_TIME AS OF.
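As a rough illustration of the `FOR SYSTEM_TIME AS OF` workaround, the sketch below builds a query that reads a table as it was some hours ago. The helper `snapshot_query` and the table name `mydataset.mytable` are assumptions made up for this example; the generated SQL only works for points in time that still fall within BigQuery's time travel window.

```python
def snapshot_query(table: str, hours_ago: int) -> str:
    """Build a query that reads `table` as of `hours_ago` hours in the past.

    Hypothetical helper for illustration; `table` is inserted verbatim,
    so it should only ever come from trusted code, never from user input.
    """
    if hours_ago < 0:
        raise ValueError("hours_ago must not be negative")
    return (
        f"SELECT * FROM `{table}` "
        f"FOR SYSTEM_TIME AS OF "
        f"TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL {hours_ago} HOUR)"
    )


# Assumed example table name; replace with a real dataset and table.
sql = snapshot_query("mydataset.mytable", 1)
print(sql)
```

The resulting string could then be run in the BigQuery console or passed to a client library. To "roll back", one option is to write the result of such a query into a new table and compare or swap it with the current one.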
Given that, we can assume that BigQuery has its own particular implementation of optimistic and pessimistic concurrency control, which makes it difficult to find more in-depth information.
I hope this helps.