Why does TRANSACTION / COMMIT improve performance so much with PHP/MySQL (InnoDB)?

别来无恙 提交于 2021-02-05 17:59:03

问题


I've been working with importing large CSV files of data; usually less than 100,000 records. I'm working with PHP and MySQL (InnoDB tables). I needed to use PHP to transform some fields and do some text processing prior to the MySQL INSERTs (part of process_note_data() in code below). MySQL's LOAD DATA was not feasible, so please do not suggest it.

I recently tried to improve the speed of this process by using MySQL transactions using START TRANSACTION and COMMIT. The performance increase was surprising. Processing time(s) dropped by a factor of 20. So, a 20 minute process only took about 1 minute.

QUESTIONS.

1.) Does anyone understand why there was such performance increase (20 mins to 1 min)?

2.) Should I be concerned about how big the transaction may get with 100,000 records?

3.) Should I be concerned with a large number of inserts and/or updates in the transaction?

/*
 * Customer Notes Data:
 * Rows are either a meeting, call or note!
 */
$row = 1;
$data = array();
$fields = array();
$line = '';

$db->query('SET autocommit=0;');
$db->query('START TRANSACTION;');

if (($handle = fopen("modules/".$currentModule."/Data/customernote.csv", "r")) !== FALSE) {
  while (($data = fgetcsv($handle, 4096, ',', '"')) !== FALSE && $row < 999000) {
    //Row 1 - CSV header row with field names
    if ($row == 1) {
      $csv_fields = $data;
    } elseif ($row > 1) {
      $fields = $this->process_note_data($data, $csv_fields, $row);
    }
    $row++;
  } // end while
  fclose($handle);
}

$db->query('COMMIT;');
$db->query('SET autocommit=1;');

Note: The text/field processing is done in the call to $this->process_note_data() which then calls another helper class that has the INSERT statement code. I didn't have enough room to include all of the code. $db->query() is a typical database object for MySQL queries.


回答1:


  1. Please check this link:

    https://dev.mysql.com/doc/refman/5.5/en/optimizing-innodb-transaction-management.html

    InnoDB must flush the log to disk at each transaction commit if that transaction made modifications to the database. When each change is followed by a commit (as with the default autocommit setting), the I/O throughput of the storage device puts a cap on the number of potential operations per second.

  2. Big transactions may affect performance during commit (check above)

  3. Only in case of rollback, however it may be optimized using some settings (check the link)




回答2:


My own little test in .Net (4 fields pr. records):

INSERT 1 record, no transaction:60 ms

INSERT 1 record, using transaction:158 ms

INSERT 200 records using transactions, commit after each record:17778 ms

INSERT 200 records using no transactions:4940 ms

INSERT 200 records using transactions, only commit after last record:4552 ms

INSERT 1000 records using transactions, only commit after last record:21795 ms

Client in Denmark, server in Belgium (Google cloud f1-micro).

I meant to put this in a comment but the formatting is not good....so here is my apology in advance ;-)



来源:https://stackoverflow.com/questions/14675147/why-does-transaction-commit-improve-performance-so-much-with-php-mysql-innodb

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!