Sometime soon we will need to make schema changes to our production database. We need to minimize downtime for this effort; however, the ALTER TABLE statements are going to take a long time to run on our largest tables.
Unfortunately, this is not always as simple as staticsan makes it out to be in his answer. Creating the new table while online and moving the data over is easy enough, and doing the cleanup in maintenance mode is also doable. However, the MySQL RENAME operation automatically updates any foreign key references to your old table. What this means is that any foreign key references to the original table will follow it, still pointing at whatever you rename the table to.
So, if any foreign keys reference the table you're trying to alter, you're stuck either altering those referencing tables to point at the new table, or, worse, if a referencing table is itself large, repeating the whole process with large table number two.
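For illustration, re-pointing a foreign key after the swap might look like this (table, column, and constraint names are hypothetical; the key point is that the constraint has to be dropped and recreated against the new table):

```sql
-- Hypothetical schema: orders.customer_id references customers(id).
-- After the RENAME swap, the foreign key on orders followed the old
-- table to its new name (e.g. customers_old), so it must be rebuilt
-- against the table now called customers.
ALTER TABLE orders DROP FOREIGN KEY fk_orders_customer;
ALTER TABLE orders
  ADD CONSTRAINT fk_orders_customer
  FOREIGN KEY (customer_id) REFERENCES customers (id);
```

On a large referencing table, each of these ALTER TABLE statements can itself trigger a full table copy, which is exactly the problem being described.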
Another approach that has worked for us in the past is juggling a set of MySQL replicas to handle the alter. I'm not the best person to speak to the full process, but it basically consists of breaking replication to one slave, running the patch on that instance, and turning replication back on once the ALTER TABLE completes so that the slave catches up. Once replication has caught up, you put the site into maintenance mode (if necessary) and switch over from your master to this newly patched slave as the new master database.
The one detail I can't remember is exactly when you point the other slaves at the new master so that they also get the alter applied. One caveat to this process: we typically use it to roll out alter patches either before the code needs the change, or after the code has changed to no longer reference the affected columns/keys.
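The replica-juggling steps described above can be sketched roughly as follows (the table and column names are illustrative, and the exact promotion sequence depends on your topology; STOP SLAVE, START SLAVE, and SHOW SLAVE STATUS are the standard MySQL 5.x replication commands):

```sql
-- On the chosen slave: pause replication and apply the schema change.
STOP SLAVE;
ALTER TABLE big_table ADD COLUMN new_col INT NULL;  -- the alter patch
START SLAVE;  -- the slave now replays everything it missed

-- Poll until Seconds_Behind_Master reaches 0:
SHOW SLAVE STATUS;

-- Then, during a brief maintenance window, promote this slave to
-- master and repoint the application (and remaining slaves) at it.
```

The maintenance window only needs to cover the switchover itself, not the ALTER, which is the whole point of the approach.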
You might want to look at pt-online-schema-change from Percona Toolkit. Essentially what it does is: create an empty copy of the table with the new schema, copy the rows over in small chunks, keep the copy in sync by installing triggers that mirror writes from the original table, and finally swap the two tables with an atomic RENAME.
It works very well for single-instance databases, but can be quite tricky if you use replication and can't afford to stop the slaves and rebuild them later.
There's also a nice webinar about this here.
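A typical invocation looks something like this (database, table, and column names are illustrative; the --alter, --dry-run, and --execute options and the D=...,t=... DSN syntax come from the Percona Toolkit documentation):

```bash
# Dry run first (builds the plan without touching data).
pt-online-schema-change \
  --alter "ADD COLUMN flags INT NOT NULL DEFAULT 0" \
  D=mydb,t=big_table --dry-run

# Then actually perform the change.
pt-online-schema-change \
  --alter "ADD COLUMN flags INT NOT NULL DEFAULT 0" \
  D=mydb,t=big_table --execute
```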
PS: I know it's an old question, just answering in case someone hits this via search engine.
I tested various strategies to speed up one alter table. Eventually I got about 10x speed increase in my particular case. The results may or may not apply to your situation. However, based on this I would suggest experimenting with InnoDB log file/buffer size parameters.
In short, only increasing innodb_log_file_size and innodb_log_buffer_size had a measurable effect. (Be careful: changing innodb_log_file_size is risky; see below for more info.)
Based on the rough write data rate (iostat) and CPU activity, the bottleneck was I/O-bound, but not limited by raw data throughput. In the faster ~500 s runs, the write throughput was at least in the same ballpark as what you would expect from the hard disk.
Performance optimizations tried:
Changing innodb_log_file_size can be dangerous. See http://www.mysqlperformanceblog.com/2011/07/09/how-to-change-innodb_log_file_size-safely/ The technique (file move) explained in the link worked nicely in my case.
Also see http://www.mysqlperformanceblog.com/2007/11/03/choosing-innodb_buffer_pool_size/ and http://www.mysqlperformanceblog.com/2008/11/21/how-to-calculate-a-good-innodb-log-file-size/ for information about innodb and tuning log sizes. One drawback of larger log files is longer recovery time after crash.
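For reference, the settings in question live in my.cnf; a sketch follows (the values are illustrative, not recommendations, and on MySQL 5.1 you must shut down cleanly and move the old ib_logfile* files out of the way before restarting with a new log file size, as the linked article explains):

```ini
[mysqld]
innodb_buffer_pool_size = 1400M
# The defaults for both log settings are far smaller than this;
# larger values speed up bulk writes at the cost of longer crash
# recovery. Changing innodb_log_file_size requires the clean
# shutdown + file-move procedure described above.
innodb_log_file_size    = 256M
innodb_log_buffer_size  = 32M
```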
Test runs and rough timings:
Testing details: Table: InnoDB, 6M rows, 2.8G on disk, single file (innodb_file_per_table option), primary key is 1 integer, +2 unique constraints/indices, 8 columns, avg. row length 218 bytes. Server: Ubuntu 12.04, x86_64, virtual machine, 8 cores, 16GB, SATA consumer-grade disk, no RAID, no database activity, minuscule other process activity, minuscule activity in other and much smaller virtual machines. MySQL 5.1.53. The initial server config is pretty much default except for an increased innodb_buffer_pool_size of 1400M. The ALTER TABLE adds 2 small columns. I didn't clock the raw ALTER TABLE; instead I experimented with an equivalent LOAD DATA INFILE statement. Finally I ran the straight ALTER TABLE and got a comparable result.
This question is related to at least the following questions:
I really don't know how to optimize that, but it's usually a good practice to put the site in offline mode before doing such updates.
Then, you can run your DB scripts at, say, 3 am, so it shouldn't matter much if the downtime is a bit longer than ideal.
You need to think about your requirements a little more carefully.
At the simplest level, the "fastest" way to get the table changed is to do it in as few ALTER TABLE statements as possible, preferably one. This is because MySQL copies a table's data to change the schema, and making fifteen changes while making a single copy is obviously (and really is) faster than copying the table fifteen times, making one change at a time.
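As a sketch of the batching idea (table, column, and index names are illustrative), several changes can be listed in one statement, paying the table-copy cost once:

```sql
-- One table copy instead of three: batch every change into a
-- single ALTER TABLE statement.
ALTER TABLE big_table
  ADD COLUMN created_at DATETIME NULL,
  ADD COLUMN flags INT NOT NULL DEFAULT 0,
  ADD INDEX idx_created_at (created_at);
```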
But I suspect you're asking how to do this change with the least amount of downtime. The way I would do that, you basically synthesize the way a non-blocking ALTER TABLE would work. But it has some additional requirements: you need to be able to tell which rows have been added and which have been changed since the copy; an AUTO_INCREMENT field works for the former.

The basic technique is as you suggested, i.e. using an INSERT INTO ... SELECT .... You're at least ahead because you're starting with an InnoDB table, so the SELECT won't block. I recommend doing the ALTER TABLE on the new, empty table, which saves MySQL from copying all the data again; it does mean you need to list all the fields correctly in the INSERT INTO ... SELECT ... statement. Then you can do a simple RENAME statement to swap it over. After that, you need another INSERT INTO ... SELECT ... WHERE ... and perhaps an UPDATE ... INNER JOIN ... WHERE ... to grab all the modified data. You need to do the INSERT and UPDATE quickly, or your code will start adding new rows and updates to your snapshot, which will interfere with your update. (You won't have this problem if you can put your app into maintenance mode for a few minutes from before the RENAME.)
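Put together, the sequence above looks something like this (all names are illustrative, and the catch-up steps assume an AUTO_INCREMENT primary key id to find new rows and an updated_at timestamp to find changed ones):

```sql
-- 1. Build an empty table with the new schema; no data is copied,
--    so the ALTER here is nearly instant.
CREATE TABLE big_table_new LIKE big_table;
ALTER TABLE big_table_new ADD COLUMN flags INT NOT NULL DEFAULT 0;

-- 2. Bulk-copy existing rows, listing columns explicitly because
--    the schemas now differ. InnoDB means this SELECT won't block.
INSERT INTO big_table_new (id, name, updated_at)
  SELECT id, name, updated_at FROM big_table;

-- 3. Swap the tables atomically.
RENAME TABLE big_table TO big_table_old, big_table_new TO big_table;

-- 4. Catch up on rows added since the copy started...
INSERT INTO big_table (id, name, updated_at)
  SELECT o.id, o.name, o.updated_at FROM big_table_old AS o
  WHERE o.id > (SELECT COALESCE(MAX(id), 0) FROM big_table);

-- ...and on rows modified since the copy started.
UPDATE big_table AS n
  INNER JOIN big_table_old AS o ON o.id = n.id
  SET n.name = o.name, n.updated_at = o.updated_at
  WHERE o.updated_at > n.updated_at;
```

Steps 3 and 4 are the window where concurrent writes can slip past the snapshot, which is why the answer suggests a brief maintenance mode around the RENAME if you can afford it.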
Apart from that, there are some key- and buffer-related settings you can change for just one session that may help the main data move. Things like read_rnd_buffer_size and read_buffer_size would be useful to increase.
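For example, in the session that runs the big INSERT INTO ... SELECT, something like the following (the sizes are illustrative):

```sql
-- SET SESSION affects only the current connection, so these can be
-- far more aggressive than a sensible server-wide default.
SET SESSION read_buffer_size     = 64 * 1024 * 1024;
SET SESSION read_rnd_buffer_size = 64 * 1024 * 1024;
```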