Solutions for INSERT OR UPDATE on SQL Server

790,594

Solution 1

don't forget about transactions. Performance is good, but simple (IF EXISTS..) approach is very dangerous.
When multiple threads will try to perform Insert-or-update you can easily get primary key violation.

Solutions provided by @Beau Crawford & @Esteban show general idea but error-prone.

To avoid deadlocks and PK violations you can use something like this:

begin tran
if exists (select * from table with (updlock,serializable) where key = @key)
begin
   update table set ...
   where key = @key
end
else
begin
   insert into table (key, ...)
   values (@key, ...)
end
commit tran

or

begin tran
   update table with (serializable) set ...
   where key = @key

   if @@rowcount = 0
   begin
      insert into table (key, ...) values (@key,..)
   end
commit tran

Solution 2

See my detailed answer to a very similar previous question

@Beau Crawford's is a good way in SQL 2005 and below, though if you're granting rep it should go to the first guy to SO it. The only problem is that for inserts it's still two IO operations.

MS Sql2008 introduces merge from the SQL:2003 standard:

merge tablename with(HOLDLOCK) as target
using (values ('new value', 'different value'))
    as source (field1, field2)
    on target.idfield = 7
when matched then
    update
    set field1 = source.field1,
        field2 = source.field2,
        ...
when not matched then
    insert ( idfield, field1, field2, ... )
    values ( 7,  source.field1, source.field2, ... )

Now it's really just one IO operation, but awful code :-(

Solution 3

Do an UPSERT:

UPDATE MyTable SET FieldA=@FieldA WHERE Key=@Key

IF @@ROWCOUNT = 0
   INSERT INTO MyTable (FieldA) VALUES (@FieldA)

http://en.wikipedia.org/wiki/Upsert

Solution 4

Many people will suggest you use MERGE, but I caution you against it. By default, it doesn't protect you from concurrency and race conditions any more than multiple statements, and it introduces other dangers:

Even with this "simpler" syntax available, I still prefer this approach (error handling omitted for brevity):

BEGIN TRANSACTION;

UPDATE dbo.table WITH (UPDLOCK, SERIALIZABLE) 
  SET ... WHERE PK = @PK;

IF @@ROWCOUNT = 0
BEGIN
  INSERT dbo.table(PK, ...) SELECT @PK, ...;
END

COMMIT TRANSACTION;

A lot of folks will suggest this way:

SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;

BEGIN TRANSACTION;

IF EXISTS (SELECT 1 FROM dbo.table WHERE PK = @PK)
BEGIN
  UPDATE ...
END
ELSE
BEGIN
  INSERT ...
END
COMMIT TRANSACTION;

But all this accomplishes is ensuring you may need to read the table twice to locate the row(s) to be updated. In the first sample, you will only ever need to locate the row(s) once. (In both cases, if no rows are found from the initial read, an insert occurs.)

Others will suggest this way:

BEGIN TRY
  INSERT ...
END TRY
BEGIN CATCH
  IF ERROR_NUMBER() = 2627
    UPDATE ...
END CATCH

However, this is problematic if for no other reason than letting SQL Server catch exceptions that you could have prevented in the first place is much more expensive, except in the rare scenario where almost every insert fails. I prove as much here:

Solution 5

IF EXISTS (SELECT * FROM [Table] WHERE ID = rowID)
UPDATE [Table] SET propertyOne = propOne, property2 . . .
ELSE
INSERT INTO [Table] (propOne, propTwo . . .)

Edit:

Alas, even to my own detriment, I must admit the solutions that do this without a select seem to be better since they accomplish the task with one less step.

Share:
790,594
Chris Cudmore
Author by

Chris Cudmore

I'm the sole developer for a publishing house, doing on-line courses and exams. We have a web designer, who typically does layouts and courses, but I do anything that involves programming logic.

Updated on December 28, 2021

Comments

  • Chris Cudmore
    Chris Cudmore over 2 years

    Assume a table structure of MyTable(KEY, datafield1, datafield2...).

    Often I want to either update an existing record, or insert a new record if it doesn't exist.

    Essentially:

    IF (key exists)
      run update command
    ELSE
      run insert command
    

    What's the best performing way to write this?

  • Chris Cudmore
    Chris Cudmore over 15 years
    That's two calls to the database.
  • Clint Ecker
    Clint Ecker over 15 years
    I don't see a problem with that.
  • Luke Bennett
    Luke Bennett over 15 years
    Question asked for most performant solution rather than the safest. Whilst a transaction adds security to the process, it also adds an overhead.
  • Luke Bennett
    Luke Bennett over 15 years
    Sure, but if we're going to start talking about application stability there's plenty of other things to think about as well.
  • Kev
    Kev over 15 years
    It's two calls to the DB that's the problem, you end doubling the number of roundtrips to the DB. If the app hits the db with lots of inserts/updates it'll hurt performance. UPSERT is a better strategy.
  • Eric Z Beard
    Eric Z Beard over 15 years
    I still like this one better. The upsert seems more like programming by side effect, and I have never seen the piddly little clustered index seek of that initial select to cause performance problems in a real database.
  • Peter Dutton
    Peter Dutton over 15 years
    In Oracle, issuing a MERGE statement I think locks the table. Does the same happen in SQL*Server?
  • dburges
    dburges about 15 years
    I have never needed to use a cursor to do this with large datasets. You just need an update that updates the records that match and an insert with a select instead of a values clause that left joins to the table.
  • Triynko
    Triynko about 14 years
    Primary key violations should not occur if you have the proper unique index constraints applied. The whole point of the constraint is to prevent duplicate rows from every happening. It doesn't matter how many threads are trying to insert, the database will serialize as necessary to enforce the constraint... and if it doesn't, then the engine is worthless. Of course, wrapping this in a serialized transaction would make this more correct and less susceptible to deadlocks or failed inserts.
  • Jean Vincent
    Jean Vincent almost 14 years
    Both these methods can still fail. If two concurrent threads do the same on the same row, the first one will succeed, but the second insert will fail because of a primary key violation. A transaction does not guaranty that the insert will succeed even if the update failed because the record existed. To guaranty that any number of concurrent transaction will succeed you MUST use a lock.
  • EBarr
    EBarr over 13 years
    @Triynko, I think @Sam Saffron meant that if two+ threads interleave in the right sequence then sql server will throw an error indicating a primary key violation would have occurred. Wrapping it in a serializable transaction is the correct way to prevent errors in the above set of statements.
  • EBarr
    EBarr over 13 years
    @aku any reason you used table hints ("with(xxxx)") as opposed to "SET TRANSACTION ISOLATION LEVEL SERIALIZABLE" just before your BEGIN TRAN ?
  • EBarr
    EBarr over 13 years
    MERGE is susceptible to race conditions (see weblogs.sqlteam.com/dang/archive/2009/01/31/…) unless you make it hold certian locks. Also, take a look at MERGE's performance in SQL Profiler ... i find that it is typcially slower and generates more reads than alternative solutions.
  • Eric Weilnau
    Eric Weilnau over 13 years
    @EBarr - Thanks for the link on the locks. I have updated my answer to include the suggest locking hint.
  • Keith
    Keith almost 13 years
    @Ian Boyd - yeah, that's the SQL:2003 standard's syntax, not the upsert that just about all the other DB providers decided to support instead. The upsert syntax is a far nicer way to do this, so at the very least MS should have supported it too - it's not like it's the only non standard keyword in T-SQL
  • Kip
    Kip over 12 years
    the documentation doesn't show a where clause as a valid syntax: technet.microsoft.com/en-us/library/bb510625.aspx
  • eglasius
    eglasius over 12 years
    any comment on the lock hint in other answers? (will find out soon, but if it's the recommended way, I recommend adding it on the answer)
  • Keith
    Keith over 12 years
    @eglasius this should be wrapped in a transaction, but that kinda goes as read for any SQL insert or update. Using merge is one operation, so there's no chance of another action clashing.
  • Seph
    Seph over 12 years
    See here weblogs.sqlteam.com/dang/archive/2009/01/31/… for answer on how to prevent race conditions from causing errors that can occur even when using MERGE syntax.
  • Seph
    Seph over 12 years
    Even if you have a primary key that is a auto-increment, your concern will then be any unique constraints that might be on the table.
  • Keith
    Keith over 12 years
    @Seph that's a real surprise - somewhat of a fail by Microsoft there :-S I guess that means you need a HOLDLOCK for merge operations in high concurrency situations.
  • CashCow
    CashCow about 12 years
    @Jean Vincent. If there are two processes trying to set different values to the same record there is a race condition anyway. Essentially you will not guarantee which value is there as a post-condition.
  • CashCow
    CashCow about 12 years
    the database should take care of primary key issues. What you are saying is that if update fails and another process gets there first with an insert your insert will fail. In that case you have a race condition anyway. Locking won't change the fact that the post-condition will be that one of the processes that tries writing will get the value.
  • Jean Vincent
    Jean Vincent about 12 years
    @CashCow, the last wins, this is what INSERT or UPDATE is supposed to do: the first one inserts, the second updates the record. Adding a lock allow this to happen in a very short time-frame, preventing an error.
  • theycallmemorty
    theycallmemorty about 11 years
    @Keith I disagree with your assertion that this is ugly code :)
  • moodboom
    moodboom about 11 years
    @theycallmemorty there are THREE separate fields lists to maintain, this is an unnecessary maintenance nightmare, and that makes it ugly. :-)
  • Keith
    Keith about 11 years
    @moodboom - not just that; both UPSERT and UPDATE have fields in pairs Set [field] = @value. INSERT has a list of fields and a list of values, fine for a couple of fields, but a right pain once you have lots of fields.
  • moodboom
    moodboom about 11 years
    @Keith yep, MERGE is just gross when you just want an upsert, which is so simple and common. Not sure what ANSI and MSSQL designers were thinking. Maybe in another three or four years.. sigh...
  • Triynko
    Triynko over 10 years
    This answer really needs updated to account for the comment by Seph about it not being thread-safe without a HOLDLOCK. According to the linked post, MERGE implicitly takes out an update lock, but releases it before inserting rows, which can cause a race condition and primary key violations on insert. By using HOLDLOCK, the locks are kept until after the insert occurs.
  • Adir D
    Adir D over 10 years
  • Adir D
    Adir D over 10 years
    There are other things to worry about with MERGE: mssqltips.com/sqlservertip/3074/…
  • Adir D
    Adir D over 10 years
    If you just do an update then insert without any locking or elevated isolation, then two users could try to pass the same data back (I wouldn't consider it a bug in the middle tier if two users tried to submit the exact same information at the same time - depends a lot on context, doesn't it?). They both enter the update, which returns 0 rows for both, then they both try to insert. One wins, the other gets an exception. This is what people are usually trying to avoid.
  • Adir D
    Adir D over 10 years
    This may have been relevant in ancient versions of SQL Server, but modern versions have statement-level compilation. Forks etc. are not an issue, and using separate procedures for these things does not solve any of the issues inherent in making the choice between an update and an insert anyway...
  • Adir D
    Adir D over 10 years
    ...and doing the insert first (knowing that it will fail sometimes) is expensive for SQL Server. sqlperformance.com/2012/08/t-sql-queries/error-handling
  • swasheck
    swasheck over 10 years
    this is a comment. in the absence of any actual example code this is just like many other comments on the site.
  • Imran Qadir Baksh - Baloch
    Imran Qadir Baksh - Baloch about 10 years
    What about inserting/updating FROM a tem table which insert/update many records?
  • Matt McCabe
    Matt McCabe over 8 years
    Very old, but an example would be nice.
  • jk7
    jk7 about 8 years
    Interesting idea, but incorrect syntax. The SELECT needs a FROM <table_source>, and a TOP 1 (unless the chosen table_source has only 1 row).
  • Kristen
    Kristen about 8 years
    Thanks. I've changed it to a NOT EXISTS. There will only ever be one matching row because of the test for "key" as per O/P (although that may need to be a multi-part key :) )
  • Christian
    Christian almost 8 years
    Also see this the answer from @zvolkov of this question stackoverflow.com/questions/1488355/…
  • Adir D
    Adir D almost 8 years
    @user960567 Well, UPDATE target SET col = tmp.col FROM target INNER JOIN #tmp ON <key clause>; INSERT target(...) SELECT ... FROM #tmp AS t WHERE NOT EXISTS (SELECT 1 FROM target WHERE key = t.key);
  • Imran Qadir Baksh - Baloch
    Imran Qadir Baksh - Baloch almost 8 years
    nice replied after more than 2 years :)
  • Adir D
    Adir D almost 8 years
    @user960567 Sorry, I don't always catch comment notifications in real time.
  • palswim
    palswim over 7 years
    An editor did add the HOLDLOCK directive, but the OP rolled back the edits.
  • Keith
    Keith over 7 years
    @palswim they did, but they also reformatted it and added other stuff. I tend to rollback such changes on principle - I'd missed the one worthwhile change in there, sorry. If they'd just added the missing with directive I'd have let it stay, and I've done that now.
  • niico
    niico about 7 years
    it also creates a race condition no?
  • Admin
    Admin over 6 years
    I always thought use locking hints are bad, and we should let Microsoft Internal engine dictate locks. Is this the apparent exception to the rule?
  • Zeek2
    Zeek2 almost 5 years
    I am guessing lack of locking?
  • Victor Sanchez
    Victor Sanchez almost 5 years
    No lack locking... I use "TRAN". Default sql-server transactions have locking.
  • likejudo
    likejudo about 4 years
    @RamenChef I don't understand. Where are the WHEN MATCHED clauses?
  • RamenChef
    RamenChef about 4 years
    @likejudo I didn't write this; I only revised it. Ask the user that wrote the post.
  • iokevins
    iokevins over 3 years
    @AaronBertrand Thank you for this. Re: 2020-09-23 edit, your preferred approach above wraps the transaction with "ISOLATION LEVEL SERIALIZABLE", with no UPDLOCK table hint. The linked article dated 2020-09-02 uses two table hints "UPDATE dbo.t WITH (UPDLOCK, SERIALIZABLE)" and no "ISOLATION LEVEL SERIALIZABLE" (which makes sense). Sorry if dumb question, but are these equivalent, or is one approach better than the other? I'm guessing your more recent article represents your current preferred approach. Thank you.
  • Adir D
    Adir D over 3 years
    @iokevins No difference that I can think of. I’m actually torn in terms of preference, while I prefer having the hint at the query level, I prefer the opposite when we’re talking about, say, applying NOLOCK hints to every table in the query (in that case I much prefer a single SET statement to fix later).
  • Adir D
    Adir D over 2 years
    @EricZBeard It's not about performance (though it's not always a seek that you're performing redundantly, depending on what you're checking to indicate a duplicate). The real problem is the opportunity the additional operation opens up for race conditions and deadlocks (I explain why in this post).