Solutions for INSERT OR UPDATE on SQL Server

sql sql-server database insert upsert

790,594

Solution 1

don't forget about transactions. Performance is good, but simple (IF EXISTS..) approach is very dangerous.
When multiple threads will try to perform Insert-or-update you can easily get primary key violation.

Solutions provided by @Beau Crawford & @Esteban show general idea but error-prone.

To avoid deadlocks and PK violations you can use something like this:

begin tran
if exists (select * from table with (updlock,serializable) where key = @key)
begin
   update table set ...
   where key = @key
end
else
begin
   insert into table (key, ...)
   values (@key, ...)
end
commit tran

begin tran
   update table with (serializable) set ...
   where key = @key

   if @@rowcount = 0
   begin
      insert into table (key, ...) values (@key,..)
   end
commit tran

Solution 2

See my detailed answer to a very similar previous question

@Beau Crawford's is a good way in SQL 2005 and below, though if you're granting rep it should go to the first guy to SO it. The only problem is that for inserts it's still two IO operations.

MS Sql2008 introduces merge from the SQL:2003 standard:

merge tablename with(HOLDLOCK) as target
using (values ('new value', 'different value'))
    as source (field1, field2)
    on target.idfield = 7
when matched then
    update
    set field1 = source.field1,
        field2 = source.field2,
        ...
when not matched then
    insert ( idfield, field1, field2, ... )
    values ( 7,  source.field1, source.field2, ... )

Now it's really just one IO operation, but awful code :-(

Solution 3

Do an UPSERT:

UPDATE MyTable SET FieldA=@FieldA WHERE Key=@Key

IF @@ROWCOUNT = 0
   INSERT INTO MyTable (FieldA) VALUES (@FieldA)

http://en.wikipedia.org/wiki/Upsert

Solution 4

Many people will suggest you use MERGE, but I caution you against it. By default, it doesn't protect you from concurrency and race conditions any more than multiple statements, and it introduces other dangers:

Even with this "simpler" syntax available, I still prefer this approach (error handling omitted for brevity):

BEGIN TRANSACTION;

UPDATE dbo.table WITH (UPDLOCK, SERIALIZABLE) 
  SET ... WHERE PK = @PK;

IF @@ROWCOUNT = 0
BEGIN
  INSERT dbo.table(PK, ...) SELECT @PK, ...;
END

COMMIT TRANSACTION;

Please stop using this UPSERT anti-pattern

A lot of folks will suggest this way:

SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;

BEGIN TRANSACTION;

IF EXISTS (SELECT 1 FROM dbo.table WHERE PK = @PK)
BEGIN
  UPDATE ...
END
ELSE
BEGIN
  INSERT ...
END
COMMIT TRANSACTION;

But all this accomplishes is ensuring you may need to read the table twice to locate the row(s) to be updated. In the first sample, you will only ever need to locate the row(s) once. (In both cases, if no rows are found from the initial read, an insert occurs.)

Others will suggest this way:

BEGIN TRY
  INSERT ...
END TRY
BEGIN CATCH
  IF ERROR_NUMBER() = 2627
    UPDATE ...
END CATCH

However, this is problematic if for no other reason than letting SQL Server catch exceptions that you could have prevented in the first place is much more expensive, except in the rare scenario where almost every insert fails. I prove as much here:

Solution 5

IF EXISTS (SELECT * FROM [Table] WHERE ID = rowID)
UPDATE [Table] SET propertyOne = propOne, property2 . . .
ELSE
INSERT INTO [Table] (propOne, propTwo . . .)

Edit:

Alas, even to my own detriment, I must admit the solutions that do this without a select seem to be better since they accomplish the task with one less step.

View more solutions

790,594

Author by

Chris Cudmore

I'm the sole developer for a publishing house, doing on-line courses and exams. We have a web designer, who typically does layouts and courses, but I do anything that involves programming logic.

Updated on December 28, 2021

Comments

Chris Cudmore over 2 years
Assume a table structure of MyTable(KEY, datafield1, datafield2...).

Often I want to either update an existing record, or insert a new record if it doesn't exist.

Essentially:
```
IF (key exists)
  run update command
ELSE
  run insert command
```
What's the best performing way to write this?
Chris Cudmore over 15 years

That's two calls to the database.
Clint Ecker over 15 years

I don't see a problem with that.
Luke Bennett over 15 years

Question asked for most performant solution rather than the safest. Whilst a transaction adds security to the process, it also adds an overhead.
Luke Bennett over 15 years

Sure, but if we're going to start talking about application stability there's plenty of other things to think about as well.
Kev over 15 years

It's two calls to the DB that's the problem, you end doubling the number of roundtrips to the DB. If the app hits the db with lots of inserts/updates it'll hurt performance. UPSERT is a better strategy.
Eric Z Beard over 15 years

I still like this one better. The upsert seems more like programming by side effect, and I have never seen the piddly little clustered index seek of that initial select to cause performance problems in a real database.
Peter Dutton over 15 years

In Oracle, issuing a MERGE statement I think locks the table. Does the same happen in SQL*Server?
dburges about 15 years

I have never needed to use a cursor to do this with large datasets. You just need an update that updates the records that match and an insert with a select instead of a values clause that left joins to the table.
Triynko about 14 years

Primary key violations should not occur if you have the proper unique index constraints applied. The whole point of the constraint is to prevent duplicate rows from every happening. It doesn't matter how many threads are trying to insert, the database will serialize as necessary to enforce the constraint... and if it doesn't, then the engine is worthless. Of course, wrapping this in a serialized transaction would make this more correct and less susceptible to deadlocks or failed inserts.
Jean Vincent almost 14 years

Both these methods can still fail. If two concurrent threads do the same on the same row, the first one will succeed, but the second insert will fail because of a primary key violation. A transaction does not guaranty that the insert will succeed even if the update failed because the record existed. To guaranty that any number of concurrent transaction will succeed you MUST use a lock.
EBarr over 13 years

@Triynko, I think @Sam Saffron meant that if two+ threads interleave in the right sequence then sql server will throw an error indicating a primary key violation would have occurred. Wrapping it in a serializable transaction is the correct way to prevent errors in the above set of statements.
EBarr over 13 years

@aku any reason you used table hints ("with(xxxx)") as opposed to "SET TRANSACTION ISOLATION LEVEL SERIALIZABLE" just before your BEGIN TRAN ?
EBarr over 13 years

MERGE is susceptible to race conditions (see weblogs.sqlteam.com/dang/archive/2009/01/31/…) unless you make it hold certian locks. Also, take a look at MERGE's performance in SQL Profiler ... i find that it is typcially slower and generates more reads than alternative solutions.
Eric Weilnau over 13 years

@EBarr - Thanks for the link on the locks. I have updated my answer to include the suggest locking hint.
Keith almost 13 years

@Ian Boyd - yeah, that's the SQL:2003 standard's syntax, not the upsert that just about all the other DB providers decided to support instead. The upsert syntax is a far nicer way to do this, so at the very least MS should have supported it too - it's not like it's the only non standard keyword in T-SQL
Kip over 12 years

the documentation doesn't show a where clause as a valid syntax: technet.microsoft.com/en-us/library/bb510625.aspx
eglasius over 12 years

any comment on the lock hint in other answers? (will find out soon, but if it's the recommended way, I recommend adding it on the answer)
Keith over 12 years

@eglasius this should be wrapped in a transaction, but that kinda goes as read for any SQL insert or update. Using merge is one operation, so there's no chance of another action clashing.
Seph over 12 years

See here weblogs.sqlteam.com/dang/archive/2009/01/31/… for answer on how to prevent race conditions from causing errors that can occur even when using MERGE syntax.
Seph over 12 years

Even if you have a primary key that is a auto-increment, your concern will then be any unique constraints that might be on the table.
Keith over 12 years

@Seph that's a real surprise - somewhat of a fail by Microsoft there :-S I guess that means you need a HOLDLOCK for merge operations in high concurrency situations.
CashCow about 12 years

@Jean Vincent. If there are two processes trying to set different values to the same record there is a race condition anyway. Essentially you will not guarantee which value is there as a post-condition.
CashCow about 12 years

the database should take care of primary key issues. What you are saying is that if update fails and another process gets there first with an insert your insert will fail. In that case you have a race condition anyway. Locking won't change the fact that the post-condition will be that one of the processes that tries writing will get the value.
Jean Vincent about 12 years

@CashCow, the last wins, this is what INSERT or UPDATE is supposed to do: the first one inserts, the second updates the record. Adding a lock allow this to happen in a very short time-frame, preventing an error.
theycallmemorty about 11 years

@Keith I disagree with your assertion that this is ugly code :)
moodboom about 11 years

@theycallmemorty there are THREE separate fields lists to maintain, this is an unnecessary maintenance nightmare, and that makes it ugly. :-)
Keith about 11 years

@moodboom - not just that; both UPSERT and UPDATE have fields in pairs Set [field] = @value. INSERT has a list of fields and a list of values, fine for a couple of fields, but a right pain once you have lots of fields.
moodboom about 11 years

@Keith yep, MERGE is just gross when you just want an upsert, which is so simple and common. Not sure what ANSI and MSSQL designers were thinking. Maybe in another three or four years.. sigh...
Triynko over 10 years

This answer really needs updated to account for the comment by Seph about it not being thread-safe without a HOLDLOCK. According to the linked post, MERGE implicitly takes out an update lock, but releases it before inserting rows, which can cause a race condition and primary key violations on insert. By using HOLDLOCK, the locks are kept until after the insert occurs.
Adir D over 10 years

Also check out mssqltips.com/sqlservertip/3074/…
Adir D over 10 years

There are other things to worry about with MERGE: mssqltips.com/sqlservertip/3074/…
Adir D over 10 years

If you just do an update then insert without any locking or elevated isolation, then two users could try to pass the same data back (I wouldn't consider it a bug in the middle tier if two users tried to submit the exact same information at the same time - depends a lot on context, doesn't it?). They both enter the update, which returns 0 rows for both, then they both try to insert. One wins, the other gets an exception. This is what people are usually trying to avoid.
Adir D over 10 years

This may have been relevant in ancient versions of SQL Server, but modern versions have statement-level compilation. Forks etc. are not an issue, and using separate procedures for these things does not solve any of the issues inherent in making the choice between an update and an insert anyway...
Adir D over 10 years

...and doing the insert first (knowing that it will fail sometimes) is expensive for SQL Server. sqlperformance.com/2012/08/t-sql-queries/error-handling
swasheck over 10 years

this is a comment. in the absence of any actual example code this is just like many other comments on the site.
Imran Qadir Baksh - Baloch about 10 years

What about inserting/updating FROM a tem table which insert/update many records?
Matt McCabe over 8 years

Very old, but an example would be nice.
jk7 about 8 years

Interesting idea, but incorrect syntax. The SELECT needs a FROM <table_source>, and a TOP 1 (unless the chosen table_source has only 1 row).
Kristen about 8 years

Thanks. I've changed it to a NOT EXISTS. There will only ever be one matching row because of the test for "key" as per O/P (although that may need to be a multi-part key :) )
Christian almost 8 years

Also see this the answer from @zvolkov of this question stackoverflow.com/questions/1488355/…
Adir D almost 8 years

@user960567 Well, UPDATE target SET col = tmp.col FROM target INNER JOIN #tmp ON <key clause>; INSERT target(...) SELECT ... FROM #tmp AS t WHERE NOT EXISTS (SELECT 1 FROM target WHERE key = t.key);
Imran Qadir Baksh - Baloch almost 8 years

nice replied after more than 2 years :)
Adir D almost 8 years

@user960567 Sorry, I don't always catch comment notifications in real time.
palswim over 7 years

An editor did add the HOLDLOCK directive, but the OP rolled back the edits.
Keith over 7 years

@palswim they did, but they also reformatted it and added other stuff. I tend to rollback such changes on principle - I'd missed the one worthwhile change in there, sorry. If they'd just added the missing with directive I'd have let it stay, and I've done that now.
niico about 7 years

it also creates a race condition no?
Admin over 6 years

I always thought use locking hints are bad, and we should let Microsoft Internal engine dictate locks. Is this the apparent exception to the rule?
Zeek2 almost 5 years

I am guessing lack of locking?
Victor Sanchez almost 5 years

No lack locking... I use "TRAN". Default sql-server transactions have locking.
likejudo about 4 years

@RamenChef I don't understand. Where are the WHEN MATCHED clauses?
RamenChef about 4 years

@likejudo I didn't write this; I only revised it. Ask the user that wrote the post.
iokevins over 3 years

@AaronBertrand Thank you for this. Re: 2020-09-23 edit, your preferred approach above wraps the transaction with "ISOLATION LEVEL SERIALIZABLE", with no UPDLOCK table hint. The linked article dated 2020-09-02 uses two table hints "UPDATE dbo.t WITH (UPDLOCK, SERIALIZABLE)" and no "ISOLATION LEVEL SERIALIZABLE" (which makes sense). Sorry if dumb question, but are these equivalent, or is one approach better than the other? I'm guessing your more recent article represents your current preferred approach. Thank you.
Adir D over 3 years

@iokevins No difference that I can think of. I’m actually torn in terms of preference, while I prefer having the hint at the query level, I prefer the opposite when we’re talking about, say, applying NOLOCK hints to every table in the query (in that case I much prefer a single SET statement to fix later).
Adir D over 2 years

@EricZBeard It's not about performance (though it's not always a seek that you're performing redundantly, depending on what you're checking to indicate a duplicate). The real problem is the opportunity the additional operation opens up for race conditions and deadlocks (I explain why in this post).