Optimize MySQL for faster ALTER TABLE ADD COLUMN


Solution 1

I faced a very similar situation in the past, and I improved the performance of the operation this way:

  1. Create a new table (using the structure of the current table) with the new column(s) included.
  2. Execute INSERT INTO new_table (column1,..columnN) SELECT column1,..columnN FROM current_table; (note: no parentheses around the SELECT column list, or MySQL raises ERROR 1241).
  3. Rename the current table.
  4. Rename the new table to the original name of the current table.
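The steps above can be sketched roughly as follows (table and column names such as new_col are placeholders, and the single RENAME TABLE statement swaps both tables atomically):

```sql
-- 1. Create the new table with the same structure, then add the new column
CREATE TABLE new_table LIKE current_table;
ALTER TABLE new_table ADD COLUMN new_col VARCHAR(32) NULL;

-- 2. Copy the existing data (no parentheses around the SELECT list)
INSERT INTO new_table (column1, column2)
SELECT column1, column2 FROM current_table;

-- 3 & 4. Swap the tables in one atomic statement
RENAME TABLE current_table TO old_table, new_table TO current_table;
```

Once you have verified the swap, old_table can be dropped at your leisure.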

Solution 2

ALTER TABLE in MySQL is actually going to create a new table with the new schema, re-insert all the data, and delete the old table. You might save some time by creating the new table, loading the data, and then renaming the table.

From the book "High Performance MySQL" (the Percona folks):

The usual trick for loading MyISAM tables efficiently is to disable the keys, load the data, and re-enable the keys:

mysql> ALTER TABLE test.load_data DISABLE KEYS;
-- load data
mysql> ALTER TABLE test.load_data ENABLE KEYS;

Solution 3

Well, I would recommend using the latest Percona MySQL builds. Also, note the following from the MySQL manual:

In other cases, MySQL creates a temporary table, even if the data wouldn't strictly need to be copied. For MyISAM tables, you can speed up the index re-creation operation (which is the slowest part of the alteration process) by setting the myisam_sort_buffer_size system variable to a high value.

You can do ALTER TABLE ... DISABLE KEYS first, then add the column, and then ALTER TABLE ... ENABLE KEYS. Beyond that, I don't see anything else that can be done here.
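For the MyISAM tuning mentioned in the manual excerpt, the variable can be raised for the session before running the ALTER. A minimal sketch (big_table and new_col are placeholders, and the 256 MB value is only an illustrative guess, not a recommendation):

```sql
-- Raise the MyISAM index sort buffer for this session only
SET SESSION myisam_sort_buffer_size = 256 * 1024 * 1024;

-- The index re-creation during the ALTER can now sort in a larger buffer
ALTER TABLE big_table ADD COLUMN new_col INT NULL;
```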

BTW, could you switch to MongoDB? It doesn't rebuild anything when you add a column.

Solution 4

Maybe you can remove the indexes before altering the table, since building the indexes is what takes most of the time?
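A rough sketch of that idea (mytable, the index names, and the column definitions are placeholders; whether this actually saves time is disputed in the comments below):

```sql
-- Drop the secondary indexes so the table rebuild doesn't maintain them
ALTER TABLE mytable DROP INDEX idx_col1, DROP INDEX idx_col2;

-- Add the new column while only the primary key remains
ALTER TABLE mytable ADD COLUMN new_col VARCHAR(12) NOT NULL DEFAULT '';

-- Recreate the secondary indexes afterwards
ALTER TABLE mytable ADD INDEX idx_col1 (col1), ADD INDEX idx_col2 (col2);
```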

Solution 5

Combining some of the comments on the other answers, this was the solution that worked for me (MySQL 5.6):

  1. create table mytablenew like mytable;
  2. alter table mytablenew add column col4a varchar(12) not null after col4;
  3. alter table mytablenew drop index index1, drop index index2,...drop index indexN;
  4. insert into mytablenew (col1,col2,...colN) select col1,col2,...colN from mytable;
  5. alter table mytablenew add index index1 (col1), add index index2 (col2),...add index indexN (colN);
  6. rename table mytable to mytableold, mytablenew to mytable;

On a 75M row table, dropping the indexes before the insert caused the query to complete in 24 minutes rather than 43 minutes.

Other answers/comments have insert into mytablenew (col1) select (col1) from mytable, but this results in ERROR 1241 (21000): Operand should contain 1 column(s) if you have the parentheses in the select query.

Other answers/comments have insert into mytablenew select * from mytable;, but this results in ERROR 1136 (21S01): Column count doesn't match value count at row 1 if you've already added a column.

Author: Andrew (Chief Technology Officer at Quick Hit, Inc. Consultant for Marketing, Internet, SEO, SEM, Architecture, Business Strategy)

Updated on July 14, 2022

Comments

  • Andrew
    Andrew almost 2 years

    I have a table that has 170,002,225 rows with about 35 columns and two indexes. I want to add a column. The ALTER TABLE command took about 10 hours. The processor did not seem busy during that time, nor were there excessive IO waits. This is on a 4-way, high-performance box with tons of memory.

    Is this the best I can do? Is there something I can look at to optimize the add column in tuning of the db?

  • Darius Jahandarie
    Darius Jahandarie about 13 years
    Some notes... 1. CREATE TABLE new_table LIKE table; 2. INSERT INTO new_table SELECT * FROM table; 3&4. You can get a pseudo-atomic rename if you combine steps 3 & 4 into this: RENAME TABLE table TO old_table, new_table TO table;
  • Darius Jahandarie
    Darius Jahandarie about 13 years
    The index will either not be rebuilt or only be rebuilt once depending on the ALTER TABLE. Removing it and adding it back would not save any time.
  • Micromega
    Micromega about 13 years
    I was pointing to the binary search and that it is expensive to build an index. I've never tried this on 170 Billions rows.
  • sreimer
    sreimer about 13 years
    This is the procedure I use as well. Often, we'll script the select into to slow down (drip feed) in order to allow normal database operations to continue normally.
  • Vladislav Rastrusny
    Vladislav Rastrusny about 13 years
    there are only 170 millions of rows, not billions ;)
  • neocanable
    neocanable about 13 years
    DISABLE KEYS and ENABLE KEYS only work for the MyISAM engine. There are 170,002,225 rows; I think it's using InnoDB!
  • Vladislav Rastrusny
    Vladislav Rastrusny about 13 years
    @Neo maximum number of rows for MyISAM is ~2^32 and this is significantly more than 170 millions. With --with-big-tables option this limit is raised to ~2^64 which is even more.
  • FoneyOp
    FoneyOp about 13 years
    MyISAM tables have problems with concurrency due to read locking. InnoDB is clearly the way to go here.
  • Vladislav Rastrusny
    Vladislav Rastrusny about 13 years
    @FoneyOp I doubt switching to InnoDB will speed up ALTER TABLE
  • FoneyOp
    FoneyOp about 13 years
    Agreed. InnoDB is almost always slower for ALTER TABLE for the reasons mentioned here. The comment was in reference to MyISAM's maximum row count being well above 170M rows.
  • Mindey I.
    Mindey I. almost 13 years
    How do you handle writes that happen to the old table while you are copying it over (i.e. before the rename)?
  • jchook
    jchook over 11 years
    We put our site into maintenance mode for this kind of heavy operation. Preventing IO during the migration will also improve performance.
  • hafichuk
    hafichuk almost 11 years
    How do you handle FK's in this case (InnoDB)?
  • Michael Kruglos
    Michael Kruglos over 10 years
    @Phpdna, when indexes are built, it's doing a sort, which is O(nlog n), not a binary search, which is O(log n).
  • Arthur Kushman
    Arthur Kushman about 10 years
    This answer is the most proper one. In practice, I've tested it on a table that has >22,000,000 rows. What I did: dropped all the indexes (there were 3), then added the field (which took 4.23 on HDD), then recreated the indexes, which took less than 2 mins each. Darius Jahandarie isn't right; he doesn't know in practice what he is talking about. The answer of RRUZ also seems to work, but in practice INSERT INTO ... SELECT * FROM takes hours, whereas drop indexes, add column, recreate indexes is a really fast approach. PS: I would rather lock the table beforehand to prevent slowed-down queries.
  • Kevin Borders
    Kevin Borders about 10 years
    If you need to keep your database online while making a schema change, this post explains how to do it safely with triggers and delta tables: m.facebook.com/note.php?note_id=430801045932
  • Kevin Borders
    Kevin Borders about 10 years
    You can handle foreign keys using SET FOREIGN_KEY_CHECKS = 0
  • Sebastian Ganslandt
    Sebastian Ganslandt over 8 years
    Adding to the list of success stories, to keep the post balanced, this doesn't always improve performance. In our case, the insert into NewTable select * from OldTable was running at about 600m/second while the alter table was doing about 1g/second.
  • Sebastian Ganslandt
    Sebastian Ganslandt over 8 years
    percona.com/doc/percona-toolkit/2.2/… is a nifty tool if you want to be able to make these changes while still being online and limit the impact on production performance.
  • jerrymouse
    jerrymouse over 7 years
    If the new table contains N+1 columns, you should write INSERT INTO new_table (column1,..columnN) SELECT column1,..columnN FROM current_table;. Otherwise you'll get the error #1136 - Column count doesn't match value count at row 1.
  • georgecj11
    georgecj11 over 7 years
    mysql4webdev.blogspot.in/2014/05/… This is the process I follow; I have tried a Python script for my use case.
  • Rick James
    Rick James almost 7 years
    MyISAM's limit has not been 2^32 for more than a decade.