How to improve INSERT INTO ... SELECT locking behavior

Solution 1

The answer to this question is much easier now: use row-based replication and the READ COMMITTED isolation level.

The locking you were experiencing disappears.

Longer explanation: http://harrison-fisk.blogspot.com/2009/02/my-favorite-new-feature-of-mysql-51.html
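
A minimal sketch of that recipe, assuming MySQL 5.1+ and a session privileged to change its own binlog format (table names are borrowed from the question below; the WHERE condition is a placeholder):

-- Row-based binlogging removes the need for the shared locks that
-- statement-based replication requires on the source table:
SET SESSION binlog_format = 'ROW';
SET SESSION TRANSACTION ISOLATION LEVEL READ COMMITTED;
INSERT INTO TemporaryTable
    SELECT * FROM HighlyContentiousTableInInnoDb
    WHERE someComplexConditions;  -- placeholder condition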

Solution 2

You can set the binlog format like this:

SET GLOBAL binlog_format = 'ROW';

Edit my.cnf if you want to make it permanent:

[mysqld]
binlog_format=ROW

Set the isolation level for the current session before you run your query:

SET SESSION TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;
INSERT INTO t1 SELECT ....;

If this doesn't help, you should try setting the isolation level server-wide, not only for the current session:

SET GLOBAL TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;

Edit my.cnf if you want to make it permanent:

[mysqld]
transaction-isolation = READ-UNCOMMITTED

You can change READ-UNCOMMITTED to READ-COMMITTED, which is a stricter isolation level that still avoids the locking.
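
To check that the settings took effect, query the corresponding system variables (names as in MySQL 5.x; tx_isolation was renamed transaction_isolation in later versions):

SELECT @@GLOBAL.binlog_format, @@GLOBAL.tx_isolation, @@SESSION.tx_isolation;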

Solution 3

Everyone using InnoDB tables has probably gotten used to the fact that InnoDB performs non-locking reads: unless you use a modifier such as LOCK IN SHARE MODE or FOR UPDATE, SELECT statements will not lock any rows while running.

This is generally correct; however, there is a notable exception: INSERT INTO table1 SELECT * FROM table2. This statement performs a locking read (shared locks) on table2. The same applies to similar statements with WHERE clauses and joins. What matters is that the table being read is InnoDB; the locking happens even if the writes go to a MyISAM table.
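
To illustrate, the INSERT ... SELECT below locks the source rows the same way an explicit locking read does (a sketch; table names follow the statement above):

-- Takes shared (S) locks on every row it reads from table2:
INSERT INTO table1 SELECT * FROM table2;

-- Equivalent locking behavior on the source table:
SELECT * FROM table2 LOCK IN SHARE MODE;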

So why was this done, given that it is pretty bad for MySQL performance and concurrency?

The reason is replication. Before MySQL 5.1, replication is statement based, which means statements run on the master should produce the same effect when replayed on the slave. If InnoDB did not lock rows in the source table, another transaction could modify a row and commit before the transaction running the INSERT ... SELECT. That other transaction would then be applied on the slave before the INSERT ... SELECT statement, possibly resulting in different data than on the master. Locking the rows in the source table while reading them protects against this: if another transaction tries to modify a row after INSERT ... SELECT has accessed (and therefore locked) it, it has to wait until the statement completes, so the changes are serialized in the same order on the slave. Sounds complicated? All you need to know is that it had to be done for replication to work correctly in MySQL before 5.1.
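
A sketch of the ordering problem the locks prevent, using a hypothetical table t2 with a val column:

-- Session A:
START TRANSACTION;
INSERT INTO t1 SELECT * FROM t2;  -- takes shared locks on the rows of t2
-- (transaction still open)

-- Session B, meanwhile:
UPDATE t2 SET val = val + 1;      -- blocks on A's shared locks until A commits,
                                  -- so it reaches the binlog after A's statement
                                  -- and replays on the slave in the same order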

In MySQL 5.1 this, along with a few other problems, should be solved by row-based replication. I have yet to give it real stress tests, however, to see how well it performs :)

One more thing to keep in mind: INSERT ... SELECT actually performs the read in locking mode, so it partially bypasses versioning and retrieves the latest committed row. So even if you're operating in REPEATABLE-READ mode, this operation is performed in READ-COMMITTED mode, potentially giving a different result than a plain SELECT would. This, by the way, applies to SELECT ... LOCK IN SHARE MODE and SELECT ... FOR UPDATE as well.
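
A sketch of the difference, assuming a hypothetical accounts table and a concurrent session that commits an update mid-transaction:

SET SESSION TRANSACTION ISOLATION LEVEL REPEATABLE READ;
START TRANSACTION;
SELECT balance FROM accounts WHERE id = 1;   -- establishes the snapshot

-- ... another session runs and commits:
-- UPDATE accounts SET balance = 99 WHERE id = 1;

SELECT balance FROM accounts WHERE id = 1;   -- still the old snapshot value
SELECT balance FROM accounts WHERE id = 1
    LOCK IN SHARE MODE;                      -- 99: reads the latest committed row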

One may ask: what if I'm not using replication and have my binary log disabled? If replication is not used, you can enable the innodb_locks_unsafe_for_binlog option, which relaxes the locks InnoDB sets on statement execution and generally gives better concurrency. However, as the name says, it makes locks unsafe for replication and point-in-time recovery, so use innodb_locks_unsafe_for_binlog with caution.

Note that disabling binary logs is not enough to trigger the relaxed locks; you have to set innodb_locks_unsafe_for_binlog=1 as well. This is done so that enabling the binary log does not cause unexpected changes in locking behavior and performance problems. You can also use this option with replication sometimes, if you really know what you're doing. I would not recommend it unless it is really needed, as you might not know which other locks will be relaxed in future versions and how that would affect your replication.
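
If replication and point-in-time recovery really are out of the picture, the option goes in my.cnf (it is not changeable at runtime, so a server restart is needed):

[mysqld]
innodb_locks_unsafe_for_binlog = 1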

Solution 4

Disclaimer: I'm not very experienced with databases, and I'm not sure if this idea is workable. Please correct me if it's not.

How about setting up a secondary equivalent table, HighlyContentiousTableInInnoDb2, and creating AFTER INSERT etc. triggers on the first table which keep the new table updated with the same data? You should then be able to lock HighlyContentiousTableInInnoDb2 and only slow down the triggers of the primary table, instead of all queries (a rough sketch follows the list below).

Potential problems:

  • 2 x data stored
  • Additional work for all inserts, updates and deletes
  • Might not be transactionally sound
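
A rough sketch of the idea (the id and data columns and the trigger name are invented for illustration; AFTER UPDATE and AFTER DELETE triggers would be needed as well):

-- Shadow table with the same structure and engine:
CREATE TABLE HighlyContentiousTableInInnoDb2 LIKE HighlyContentiousTableInInnoDb;

-- Mirror every insert into the shadow table:
CREATE TRIGGER hct_after_insert
AFTER INSERT ON HighlyContentiousTableInInnoDb
FOR EACH ROW
    INSERT INTO HighlyContentiousTableInInnoDb2 (id, data)
    VALUES (NEW.id, NEW.data);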

Solution 5

If you can allow some anomalies, you can change the ISOLATION LEVEL to the least strict one, READ UNCOMMITTED. But during this time someone is allowed to read from your destination table. Or you can lock the destination table manually (MySQL provides this via LOCK TABLES).
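
A sketch with LOCK TABLES; note that every table the statement touches while the lock is held must appear in the lock list (table names from the question):

LOCK TABLES TemporaryTable WRITE,
            HighlyContentiousTableInInnoDb READ;
INSERT INTO TemporaryTable
    SELECT * FROM HighlyContentiousTableInInnoDb;
UNLOCK TABLES;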

Alternatively, you can use READ COMMITTED, which should not lock the source table either; it does, however, lock the inserted rows in the destination table until commit.

I would choose the second one.


Comments

  • Artem
    Artem over 3 years

In our production database, we run the following pseudo-code SQL batch query every hour:

    INSERT INTO TemporaryTable
        SELECT * FROM HighlyContentiousTableInInnoDb
        WHERE allKindsOfComplexConditions;
    

Now this query itself does not need to be fast, but I noticed it was locking up HighlyContentiousTableInInnoDb even though it was just reading from it. This was making some other very simple queries take ~25 seconds (that's how long the batch query takes).

    Then I discovered that InnoDB tables in such a case are actually locked by a SELECT! https://www.percona.com/blog/2006/07/12/insert-into-select-performance-with-innodb-tables/

But I don't really like the solution in the article of selecting into an OUTFILE; it seems like a hack (temporary files on the filesystem seem sucky). Any other ideas? Is there a way to make a full copy of an InnoDB table without locking it in this way during the copy? Then I could just copy the HighlyContentiousTable to another table and do the query there.

  • Artem
    Artem about 14 years
    Does this actually work? I would think the View would do exactly the same work...
  • Artem
    Artem about 14 years
    This is an interesting direction. dev.mysql.com/doc/refman/5.0/en/set-transaction.html The destination table is a temporary (non-replicated) one anyways, so I think READ COMMITTED is the way to go. I'd like to try this out.
  • Philipp Andre
    Philipp Andre almost 14 years
If you edit/read fields using a view, your DBMS has to lock the fields just as when you access them directly. The only difference is that it does not lock the whole row (with all columns) but only the columns used by the view. If your transactions use disjoint columns, then this could really help you. (Who the hell gave -1 to this answer?)
  • Artem
    Artem almost 14 years
    I have now tried it and it seems to work without problems! I now do: SET TRANSACTION ISOLATION LEVEL READ COMMITTED; INSERT INTO TemporaryTable SELECT ... FROM HighlyContentiousTableInInnoDb; And this does not lock HighlyContentiousTableInInnoDb. I don't know of any disadvantages to using this as opposed to the SELECT INTO OUTFILE method. I don't replicate this TemporaryTable, so I think I should not have issues.
  • StrangeElement
    StrangeElement about 11 years
    I have noticed a decrease of over 50% in execution time when used on an update query using sub-selects, nice bonus.
  • Ryan
    Ryan over 8 years
    Just added a +50 bounty on this question for a more detailed, step-by-step answer of the above.
  • Rick James
    Rick James over 8 years
    A VIEW is just syntactic sugar around a SELECT; no performance gain.
  • Timo Huovinen
    Timo Huovinen about 8 years
    there are no snapshots in mysql
  • Leo Galleguillos
    Leo Galleguillos almost 4 years
    To check your current isolation level, you can run SELECT @@TX_ISOLATION;
  • John
    John over 2 years
I've been fighting with this many times; copying tables terabytes large in MySQL can take weeks if you rely on its flawed internal performance. You need to do it with 20-50 threads. But I had deadlocks even when randomly accessing rows, so I had to go over the table 2-3 times. "SET SESSION TRANSACTION ISOLATION LEVEL READ UNCOMMITTED" finally did it: no more deadlocks, and no randomization required.