How to debug Lock wait timeout exceeded on MySQL?

我寻月下人不归 2020-11-22 11:46

In my production error logs I occasionally see:

SQLSTATE[HY000]: General error: 1205 Lock wait timeout exceeded; try restarting transaction

11 Answers
  • 2020-11-22 12:22

    Take a look at the man page of the pt-deadlock-logger utility:

    brew install percona-toolkit
    pt-deadlock-logger --ask-pass server_name
    

    It extracts information from the SHOW ENGINE INNODB STATUS output mentioned in other answers, and it can also run as a daemon that polls every 30 seconds.

  • 2020-11-22 12:23

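    If you're using JDBC (MySQL Connector/J), you can set the connection property below, which includes the output of SHOW ENGINE INNODB STATUS in the exception message when a deadlock exception is detected:
    includeInnodbStatusInDeadlockExceptions=true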

    https://dev.mysql.com/doc/connector-j/8.0/en/connector-j-reference-configuration-properties.html

  • 2020-11-22 12:27

    For the record, the lock wait timeout exception also happens when there is a deadlock that MySQL cannot detect, so the wait simply times out. Another cause can be an extremely long-running query; that case is easier to diagnose and fix, so I will not describe it here.

    MySQL is usually able to deal with deadlocks if they are constructed "properly" within two transactions. It then simply rolls back the transaction it considers smaller (less important, because it affects fewer rows) and lets the other one finish.

    Now, let's suppose there are two processes A and B and 3 transactions:

    Process A Transaction 1: Locks X
    Process B Transaction 2: Locks Y
    Process A Transaction 3: Needs Y => Waits for Y
    Process B Transaction 2: Needs X => Waits for X
    Process A Transaction 1: Waits for Transaction 3 to finish
    
    (see the last two paragraphs below for a more detailed definition of these terms)
    
    => deadlock 
    

    This is a very unfortunate setup because MySQL cannot see that there is a deadlock (it spans 3 transactions). So what MySQL does is ... nothing! It just waits, since it does not know what to do. It waits until the timeout is exceeded for the first acquired lock (Process A Transaction 1: Locks X); that releases Lock X, which unblocks Transaction 2, and so on.

    The art is to find out what (which query) causes the first lock (Lock X). You will be able to see easily (show engine innodb status) that Transaction 3 waits for Transaction 2, but you will not see which transaction Transaction 2 is waiting for (Transaction 1). MySQL will not print any locks or query associated with Transaction 1. The only hint will be that at the very bottom of the transaction list (of the show engine innodb status printout), you will see Transaction 1 apparently doing nothing (but in fact waiting for Transaction 3 to finish).

    The technique for finding which SQL query caused the lock (Lock X) to be granted, for a given waiting transaction, is described in "Tracking MySQL query history in long running transactions".

    In case you are wondering what exactly a process and a transaction are in this example: the process is a PHP process, and a transaction is a transaction as defined by the innodb-trx-table (INNODB_TRX). In my case, I had two PHP processes, and in each I started a transaction manually. The interesting part was that even though I started one transaction in a process, MySQL internally used two separate transactions (I don't have a clue why, maybe some MySQL dev can explain).

    MySQL manages its own transactions internally and decided (in my case) to use two transactions to handle all the SQL requests coming from the PHP process (Process A). The statement that Transaction 1 is waiting for Transaction 3 to finish describes something internal to MySQL. MySQL "knew" that Transaction 1 and Transaction 3 were actually instantiated as part of one "transaction" request (from Process A). The whole "transaction" was blocked because Transaction 3 (a subpart of the "transaction") was blocked. Because the "transaction" could not finish, Transaction 1 (also a subpart of the "transaction") was marked as not finished as well. This is what I meant by "Transaction 1 waits for Transaction 3 to finish".
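
    To make the "who is really holding things up" step concrete, here is a minimal Python sketch. It assumes you have already fetched (waiting, blocking) transaction id pairs yourself, for example from information_schema.INNODB_LOCK_WAITS (requesting_trx_id, blocking_trx_id) on MySQL 5.7 and earlier, or from performance_schema.data_lock_waits on 8.0. The function name blocking_chain and the ids "T1", "T2", "T3" are made up for illustration; the sketch only walks the wait-for edges and does not talk to MySQL.

```python
from typing import Dict, List, Tuple


def blocking_chain(waits: List[Tuple[str, str]], start: str) -> List[str]:
    """Follow waiting -> blocking edges from `start` until no blocker is known.

    `waits` holds (waiting_trx_id, blocking_trx_id) pairs. The returned chain
    begins with `start`; its last element is either a transaction that is not
    itself recorded as waiting, or the first repeated id if the edges cycle.
    """
    blocker_of: Dict[str, str] = {}
    for waiting, blocking in waits:
        # Keep only the first blocker seen per waiter; enough for a sketch.
        blocker_of.setdefault(waiting, blocking)

    chain = [start]
    seen = {start}
    current = start
    while current in blocker_of:
        current = blocker_of[current]
        chain.append(current)
        if current in seen:
            # The edges loop back on themselves: stop instead of spinning.
            break
        seen.add(current)
    return chain
```

    With the pairs [("T3", "T2"), ("T2", "T1")] and start "T3", the chain ends at "T1", which is the transaction whose queries you then want to trace.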

  • 2020-11-22 12:29

    As someone mentioned in one of the many SO threads concerning this problem: Sometimes the process that has locked the table shows up as sleeping in the processlist! I was tearing my hair out until I killed all the sleeping threads that were open in the database in question (none were active at the time). That finally unlocked the table and let the update query run.

    The commenter said something akin to "Sometimes a MySQL thread locks a table, then sleeps while it waits for something non-MySQL-related to happen."

    After re-re-reviewing the show engine innodb status log (once I'd tracked down the client responsible for the lock), I noticed the stuck thread in question was listed at the very bottom of the transaction list, beneath the active queries that were about to error out because of the frozen lock:

    ------------------
    ---TRANSACTION 2744943820, ACTIVE 1154 sec(!!)
    2 lock struct(s), heap size 376, 2 row lock(s), undo log entries 1
    MySQL thread id 276558, OS thread handle 0x7f93762e7710, query id 59264109 [ip] [database] cleaning up
    Trx read view will not see trx with id >= 2744943821, sees < 2744943821
    

    (unsure if the "Trx read view" message is related to the frozen lock, but unlike the other active transactions, this one does not show up with the query that was issued and instead claims the transaction is "cleaning up," yet has multiple row locks)

    The moral of the story is that a transaction can be active even though the thread is sleeping.
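
    If you want to spot such a transaction without reading the whole printout by eye, something like the following Python sketch may help. It assumes you have saved the text of show engine innodb status to a string, and that the transaction entries look like the excerpt above; the output format differs between MySQL versions, so treat the regular expressions as assumptions. StaleTrx and find_stale_transactions are names invented for this example.

```python
import re
from typing import List, NamedTuple


class StaleTrx(NamedTuple):
    trx_id: str
    active_sec: int
    row_locks: int
    thread_id: int


# Patterns modelled on the excerpt above (assumed format, may vary by version).
_TRX_RE = re.compile(r"---TRANSACTION (\d+), ACTIVE (\d+) sec")
_LOCKS_RE = re.compile(r"(\d+) row lock\(s\)")
_THREAD_RE = re.compile(r"MySQL thread id (\d+)")


def find_stale_transactions(status_text: str, min_active_sec: int) -> List[StaleTrx]:
    """Return transactions active for at least `min_active_sec` that hold row locks."""
    results: List[StaleTrx] = []
    # Split so that every chunk starts at a "---TRANSACTION <id>" header.
    for chunk in re.split(r"(?=---TRANSACTION \d+)", status_text):
        header = _TRX_RE.search(chunk)
        locks = _LOCKS_RE.search(chunk)
        thread = _THREAD_RE.search(chunk)
        if not (header and locks and thread):
            continue  # not an active transaction entry, or no lock info
        active_sec = int(header.group(2))
        row_locks = int(locks.group(1))
        if active_sec >= min_active_sec and row_locks > 0:
            results.append(
                StaleTrx(header.group(1), active_sec, row_locks, int(thread.group(1)))
            )
    return results
```

    Run against the excerpt above with a threshold of, say, 60 seconds, this reports transaction 2744943820 together with MySQL thread id 276558, which is the id you would look up in the processlist before deciding whether to kill the connection.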

  • 2020-11-22 12:33

    Enable the MySQL general query log (general.log, disk intensive) and use mysql_analyse_general_log.pl to extract long-running transactions, for example with:

    --min-duration=your innodb_lock_wait_timeout value

    Disable the general log again once you are done.
