I need to create an index on a ~5M rows MySQL table. It is a production table, and I fear a complete block of everything if I run a CREATE INDEX statement...
Is the
As this blog post outlines, the InnoDB ALTER TABLE
mechanism has been completely redesigned for MySQL 5.6.
(For an exclusive overview of this topic, the MySQL documentation can provide an afternoon's worth of reading.)
To add an index to a table without a lock resulting on UPDATE
/ INSERT
, the following statement format can be used:
ALTER TABLE my_table ADD INDEX my_table__idx (my_column), ALGORITHM=INPLACE, LOCK=NONE;
MySQL 5.6 update (feb 2013): You can now perform read and write operations while an index is being created even with InnoDB tables - http://dev.mysql.com/doc/refman/5.6/en/innodb-create-index-overview.html
In MySQL 5.6 and higher, the table remains available for read and write operations while the index is being created or dropped. The CREATE INDEX or DROP INDEX statement only finishes after all transactions that are accessing the table are completed, so that the initial state of the index reflects the most recent contents of the table. Previously, modifying the table while an index is being created or dropped typically resulted in a deadlock that cancelled the INSERT, UPDATE, or DELETE statement on the table.
and:
In MySQL 5.6, this feature becomes more general: you can read and write to tables while an index is being created, and many more kinds of ALTER TABLE operations can be performed without copying the table, without blocking DML operations, or both. Thus in MySQL 5.6 and higher, we typically refer to this set of features as online DDL rather than Fast Index Creation.
from http://dev.mysql.com/doc/refman/5.6/en/glossary.html#glos_fast_index_creation
pt-online-schema-change is the way to go if you really want to make sure that the migration will not bring down the site.
As I wrote in the above comment, I have several experiences with pt-online-schema-change in production. We have our main table of 20M+ records and a master -> 2 read-only replication slaves. I've done at least a dozens of migrations with pt-online-schema-change from adding a new column, changing charset, to adding several indices. We serve tons of traffic during the migration time as well and we have not had any hiccup. Of course you'd have to test all the scripts very thoroughly before running on production.
I tried to batch up the changes into 1 script so that pt-online-schema-change only have to copy the data once. And be very careful with changing column name since you will loose your data. However, adding an index should be fine.
https://dev.mysql.com/doc/refman/8.0/en/innodb-online-ddl-operations.html#online-ddl-index-syntax-notes
In MySQL 5.6 and higher, the table remains available for read and write operations while the index is being created or dropped. The CREATE INDEX or DROP INDEX statement only finishes after all transactions that are accessing the table are completed, so that the initial state of the index reflects the most recent contents of the table. Previously, modifying the table while an index is being created or dropped typically resulted in a deadlock that cancelled the INSERT, UPDATE, or DELETE statement on the table.
From the answer above:
"If your using a version greater than 5.1 indices are created while the database is online. So not to worry you won't interrupt production system use."
This is ****FALSE**** (at least for MyISAM / InnoDB tables, which is what 99.999% of people out there use. Clustered Edition is different.)
Doing UPDATE operations on a table will BLOCK while the index is being created. MySQL is really, really stupid about this (and a few other things).
Test Script:
(
for n in {1..50}; do
#(time mysql -uroot -e 'select * from website_development.users where id = 41225\G'>/dev/null) 2>&1 | grep real;
(time mysql -uroot -e 'update website_development.users set bio="" where id = 41225\G'>/dev/null) 2>&1 | grep real;
done
) | cat -n &
PID=$!
sleep 0.05
echo "Index Update - START"
mysql -uroot website_development -e 'alter table users add index ddopsonfu (last_name, email, first_name, confirmation_token, current_sign_in_ip);'
echo "Index Update - FINISH"
sleep 0.05
kill $PID
time mysql -uroot website_development -e 'drop index ddopsonfu on users;'
My Server (InnoDB):
Server version: 5.5.25a Source distribution
Output (notice how the 6th operation blocks for the ~400ms it takes to finish the index update):
1 real 0m0.009s
2 real 0m0.009s
3 real 0m0.009s
4 real 0m0.012s
5 real 0m0.009s
Index Update - START
Index Update - FINISH
6 real 0m0.388s
7 real 0m0.009s
8 real 0m0.009s
9 real 0m0.009s
10 real 0m0.009s
11 real 0m0.009s
Vs read operations which don't block (swap the line comment in the script):
1 real 0m0.010s
2 real 0m0.009s
3 real 0m0.009s
4 real 0m0.010s
5 real 0m0.009s
Index Update - START
6 real 0m0.010s
7 real 0m0.010s
8 real 0m0.011s
9 real 0m0.010s
...
41 real 0m0.009s
42 real 0m0.010s
43 real 0m0.009s
Index Update - FINISH
44 real 0m0.012s
45 real 0m0.009s
46 real 0m0.009s
47 real 0m0.010s
48 real 0m0.009s
Thusfar, there's only one method I know of to update a MySql schema and not suffer an availability outage. Circular masters:
An easy way to update schema this isn't. Workable in a serious production environment; yes, it is. Please, please, please, if there is an easier way to add an index to a MySQL table without blocking writes, let me know.
Googling lead me to this article which describes a similar technique. Even better, they advise drinking at the same point in the proceedure (Note that I wrote my answer before reading the article)!
The article I linked above talks about a tool, pt-online-schema-change, that works as follows:
I've never tried the tool myself. YMMV
I'm currently using MySQL through Amazon's RDS. It's a really nifty service that wraps up and manages MySQL, letting you add new read replicas with a single button and transparently upgrade the database across hardware SKU's. It's really convenient. You don't get SUPER access to the database, so you can't screw with replication directly (is this a blessing or curse?). However, you can use Read Replica Promotion to make your schema changes on a read-only slave, then promote that slave to become your new master. Exactly the same trick as I described above, just vastly easier to execute. They still don't do much to help you with the cut-over. You have to reconfigure and restart your app.