Simple way to automatically convert INSERT statements to UPDATEs?

13,065

Solution 1

Mchl's answer is valid another easy fix would be change 'INSERT' to 'REPLACE'. Both require a simple search / replace operation (I'd use sed). But if this were being run regularly then it would be a good candidate for replication / using timestamps to create a loader file only containing the modified/new records.

mysldump also has a --replace option, so the sed step can be omitted.

Solution 2

You could restore the mysqldump data to a new temporary database, then use the multi-table UPDATE syntax to do the update.

UPDATE mydb.mytable AS dest JOIN tempdb.mytable AS origin USING (prim_key)
SET dest.col1 = origin.col1,
    dest.col2 = origin.col2,
    ...

Then drop the temp database.

Solution 3

It's a matter of adding ON DUPLICATE KEY UPDATE clause

http://dev.mysql.com/doc/refman/5.0/en/insert-on-duplicate.html

Solution 4

I've also been looking for a solution to what I assumed was a common problem, but haven't been able to find one. It seems like this would be baked in to mysqldump, but based on this feature request in the MySQL boards, it isn't:

https://bugs.mysql.com/bug.php?id=73190

Transcription below:

[3 Jul 2014 21:23] Christopher Schultz
Description:
mysqldump currently supports two different options for dealing with PK collisions:

  --insert-ignore (uses INSERT IGNORE)
  --replace       (uses REPLACE INTO instead of INSERT INTO)

I'd like to request an additional option that uses INSERT ... ON DUPLICATE KEY UPDATE ... when used.

This will allow one database to be used as a source to update another database without the following problems:

Using INSERT IGNORE will ignore any updates to existing rows that are coming from the file being loaded "on top of" an existing database.

Using REPLACE ends up churning the primary index, failing when foreign keys point to the record being updated (REPLACED), or ON DELETE CASCADE wiping-out records in other tables as the records in the target table are "updated" (using REPLACE).

There are two significant downsides to using ON DUPLICATE KEY UPDATE, of course:

1. The dump file will get much bigger, because the bulk-loading syntax for INSERT can no longer be used.
2. The dump file will get much longer, because ON DUPLICATE KEY UPDATE requires that all values be specified twice: once in the VALUES() section and then a second time in the field=value, field=value part after "ON DUPLICATE KEY UPDATE" phrase.

Bug 11422 [http://bugs.mysql.com/bug.php?id=11422] requests the ability to simply say "ON DUPLICATE KEY UPDATE" and allow the engine to use the VALUES to update all fields in the table. Fixing bug 11422 and using the solution here would mitigate both downsides because then extended syntax could (possibly?) be used and the values would not have to be mentioned twice in the dump file.

I have found many posts online about how to do something like this and the responses always steer the person toward --insert-ignore or --replace but there are many scenarios where the above request would be preferable.

How to repeat:
(This is a feature request)
[17 Jul 2015 14:23] Georgi Kodinov
Thank you for the reasonable feature request.

Solution 5

I couldn't believe nobody created a simple tool to convert a set of INSERT lines into a set of UPDATE commands. So I made one of my own. You can test it in this fiddle, where you have input and output text boxes.

WARNING! This is not fail-proof, there are a bunch of assumptions. For example, it can fail if you have commas in your values.

I also assume that the first field is a unique key.

function substrBetween(str,before,after){
    var arr = str.split(before);
    arr.shift();
    str = arr.join(before);
    arr = str.split(after);
    str = arr.shift();
    return str;
}

var insertQuery = $('#query').val();

var lines = insertQuery.split('\n');
var header = lines[0];
var tableName = substrBetween(header,"INTO `","`");
var varNames = substrBetween(header,"(`","`)").split("`, `");

var out = "";
for (i=1; i<lines.length; i++){
  var line = lines[i];
  if (line[line.length -1] == ";"){
    line = line.slice(0, -1) + ',';
  }
  var values = substrBetween(line,"(","),").split(", ");
  out += "UPDATE `"+tableName+"` SET ";
  for (j=1; j<values.length; j++){
    out += "`"+varNames[j]+"`="+values[j]+", ";
  }
  out = out.slice(0, out.length-2);
  out += " WHERE `"+varNames[0]+"`="+values[0]+";\n";
}
return out;

Example INPUT:

INSERT INTO `devices` (`name`, `idGroup1`, `idGroup2`, `label`) VALUES
('3703-001', 16, 5, 'Meter BB #1'),
('3703-002', 12, 8, 'Meter CC #2'),
('3703-003', 12, 0, 'Meter #3'),
('3703-004', 12, 24, 'Meter building F');

Corresponding OUTPUT:

UPDATE `devices` SET `idGroup1`=16, `idGroup2`=5, `label`='Meter BB #1' WHERE `name`='3703-001';
UPDATE `devices` SET `idGroup1`=12, `idGroup2`=8, `label`='Meter CC #2' WHERE `name`='3703-002';
UPDATE `devices` SET `idGroup1`=12, `idGroup2`=0, `label`='Meter #3' WHERE `name`='3703-003';
UPDATE `devices` SET `idGroup1`=12, `idGroup2`=24, `label`='Meter building F' WHERE `name`='3703-004';
Share:
13,065

Related videos on Youtube

Tom
Author by

Tom

Updated on May 07, 2022

Comments

  • Tom
    Tom almost 2 years

    I want to use the output of mysqldump to update entries in a live database. I don't want to delete the entries first, simple update statements are fine. Is there a simple way to convert the output of mysqldump which contains INSERT statements to the corresponding UPDATE statements?

    It seems such a basic feature, so I'm sure someone created a tool or came up with a method to do it quickly, so people don't have to reinvent the wheel all the time by everyone writing their own scripts for this.

    Edit: I'm looking for a generic solution, not one where I have to enumerate the actual table columns by hand. It's a generic problem, so I think there should be a table independent solution for it.

  • Tom
    Tom over 13 years
    Yes, but can it be done automatically? Can mysqldump output such a format? The point is not creating my own script for this, because it's a generic problem, not specific to my database.
  • XTF
    XTF about 3 years
    Not entirely correct, bulk-loading syntax can still be used.
  • siliconrockstar
    siliconrockstar about 3 years
    ^ please expand on that, IDK if this is correct or not, it's literally copy/paste from the bug report/feature request