Solutions for INSERT OR UPDATE on SQL Server
Solution 1
don't forget about transactions. Performance is good, but simple (IF EXISTS..) approach is very dangerous.
When multiple threads will try to perform Insert-or-update you can easily
get primary key violation.
Solutions provided by @Beau Crawford & @Esteban show general idea but error-prone.
To avoid deadlocks and PK violations you can use something like this:
begin tran
if exists (select * from table with (updlock,serializable) where key = @key)
begin
update table set ...
where key = @key
end
else
begin
insert into table (key, ...)
values (@key, ...)
end
commit tran
or
begin tran
update table with (serializable) set ...
where key = @key
if @@rowcount = 0
begin
insert into table (key, ...) values (@key,..)
end
commit tran
Solution 2
See my detailed answer to a very similar previous question
@Beau Crawford's is a good way in SQL 2005 and below, though if you're granting rep it should go to the first guy to SO it. The only problem is that for inserts it's still two IO operations.
MS Sql2008 introduces merge
from the SQL:2003 standard:
merge tablename with(HOLDLOCK) as target
using (values ('new value', 'different value'))
as source (field1, field2)
on target.idfield = 7
when matched then
update
set field1 = source.field1,
field2 = source.field2,
...
when not matched then
insert ( idfield, field1, field2, ... )
values ( 7, source.field1, source.field2, ... )
Now it's really just one IO operation, but awful code :-(
Solution 3
Do an UPSERT:
UPDATE MyTable SET FieldA=@FieldA WHERE Key=@Key IF @@ROWCOUNT = 0 INSERT INTO MyTable (FieldA) VALUES (@FieldA)
http://en.wikipedia.org/wiki/Upsert
Solution 4
Many people will suggest you use MERGE
, but I caution you against it. By default, it doesn't protect you from concurrency and race conditions any more than multiple statements, and it introduces other dangers:
Even with this "simpler" syntax available, I still prefer this approach (error handling omitted for brevity):
BEGIN TRANSACTION;
UPDATE dbo.table WITH (UPDLOCK, SERIALIZABLE)
SET ... WHERE PK = @PK;
IF @@ROWCOUNT = 0
BEGIN
INSERT dbo.table(PK, ...) SELECT @PK, ...;
END
COMMIT TRANSACTION;
A lot of folks will suggest this way:
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
BEGIN TRANSACTION;
IF EXISTS (SELECT 1 FROM dbo.table WHERE PK = @PK)
BEGIN
UPDATE ...
END
ELSE
BEGIN
INSERT ...
END
COMMIT TRANSACTION;
But all this accomplishes is ensuring you may need to read the table twice to locate the row(s) to be updated. In the first sample, you will only ever need to locate the row(s) once. (In both cases, if no rows are found from the initial read, an insert occurs.)
Others will suggest this way:
BEGIN TRY
INSERT ...
END TRY
BEGIN CATCH
IF ERROR_NUMBER() = 2627
UPDATE ...
END CATCH
However, this is problematic if for no other reason than letting SQL Server catch exceptions that you could have prevented in the first place is much more expensive, except in the rare scenario where almost every insert fails. I prove as much here:
- Checking for potential constraint violations before entering TRY/CATCH
- Performance impact of different error handling techniques
Solution 5
IF EXISTS (SELECT * FROM [Table] WHERE ID = rowID)
UPDATE [Table] SET propertyOne = propOne, property2 . . .
ELSE
INSERT INTO [Table] (propOne, propTwo . . .)
Edit:
Alas, even to my own detriment, I must admit the solutions that do this without a select seem to be better since they accomplish the task with one less step.
Chris Cudmore
I'm the sole developer for a publishing house, doing on-line courses and exams. We have a web designer, who typically does layouts and courses, but I do anything that involves programming logic.
Updated on December 28, 2021Comments
-
Chris Cudmore over 2 years
Assume a table structure of
MyTable(KEY, datafield1, datafield2...)
.Often I want to either update an existing record, or insert a new record if it doesn't exist.
Essentially:
IF (key exists) run update command ELSE run insert command
What's the best performing way to write this?
-
Chris Cudmore over 15 yearsThat's two calls to the database.
-
Clint Ecker over 15 yearsI don't see a problem with that.
-
Luke Bennett over 15 yearsQuestion asked for most performant solution rather than the safest. Whilst a transaction adds security to the process, it also adds an overhead.
-
Luke Bennett over 15 yearsSure, but if we're going to start talking about application stability there's plenty of other things to think about as well.
-
Kev over 15 yearsIt's two calls to the DB that's the problem, you end doubling the number of roundtrips to the DB. If the app hits the db with lots of inserts/updates it'll hurt performance. UPSERT is a better strategy.
-
Eric Z Beard over 15 yearsI still like this one better. The upsert seems more like programming by side effect, and I have never seen the piddly little clustered index seek of that initial select to cause performance problems in a real database.
-
Peter Dutton over 15 yearsIn Oracle, issuing a MERGE statement I think locks the table. Does the same happen in SQL*Server?
-
dburges about 15 yearsI have never needed to use a cursor to do this with large datasets. You just need an update that updates the records that match and an insert with a select instead of a values clause that left joins to the table.
-
Triynko about 14 yearsPrimary key violations should not occur if you have the proper unique index constraints applied. The whole point of the constraint is to prevent duplicate rows from every happening. It doesn't matter how many threads are trying to insert, the database will serialize as necessary to enforce the constraint... and if it doesn't, then the engine is worthless. Of course, wrapping this in a serialized transaction would make this more correct and less susceptible to deadlocks or failed inserts.
-
Jean Vincent almost 14 yearsBoth these methods can still fail. If two concurrent threads do the same on the same row, the first one will succeed, but the second insert will fail because of a primary key violation. A transaction does not guaranty that the insert will succeed even if the update failed because the record existed. To guaranty that any number of concurrent transaction will succeed you MUST use a lock.
-
EBarr over 13 years@Triynko, I think @Sam Saffron meant that if two+ threads interleave in the right sequence then sql server will throw an error indicating a primary key violation would have occurred. Wrapping it in a serializable transaction is the correct way to prevent errors in the above set of statements.
-
EBarr over 13 years@aku any reason you used table hints ("with(xxxx)") as opposed to "SET TRANSACTION ISOLATION LEVEL SERIALIZABLE" just before your BEGIN TRAN ?
-
EBarr over 13 yearsMERGE is susceptible to race conditions (see weblogs.sqlteam.com/dang/archive/2009/01/31/…) unless you make it hold certian locks. Also, take a look at MERGE's performance in SQL Profiler ... i find that it is typcially slower and generates more reads than alternative solutions.
-
Eric Weilnau over 13 years@EBarr - Thanks for the link on the locks. I have updated my answer to include the suggest locking hint.
-
Keith almost 13 years@Ian Boyd - yeah, that's the SQL:2003 standard's syntax, not the
upsert
that just about all the other DB providers decided to support instead. Theupsert
syntax is a far nicer way to do this, so at the very least MS should have supported it too - it's not like it's the only non standard keyword in T-SQL -
Kip over 12 yearsthe documentation doesn't show a where clause as a valid syntax: technet.microsoft.com/en-us/library/bb510625.aspx
-
eglasius over 12 yearsany comment on the lock hint in other answers? (will find out soon, but if it's the recommended way, I recommend adding it on the answer)
-
Keith over 12 years@eglasius this should be wrapped in a transaction, but that kinda goes as read for any SQL
insert
orupdate
. Usingmerge
is one operation, so there's no chance of another action clashing. -
Seph over 12 yearsSee here weblogs.sqlteam.com/dang/archive/2009/01/31/… for answer on how to prevent race conditions from causing errors that can occur even when using
MERGE
syntax. -
Seph over 12 yearsEven if you have a primary key that is a auto-increment, your concern will then be any unique constraints that might be on the table.
-
Keith over 12 years@Seph that's a real surprise - somewhat of a fail by Microsoft there :-S I guess that means you need a
HOLDLOCK
for merge operations in high concurrency situations. -
CashCow about 12 years@Jean Vincent. If there are two processes trying to set different values to the same record there is a race condition anyway. Essentially you will not guarantee which value is there as a post-condition.
-
CashCow about 12 yearsthe database should take care of primary key issues. What you are saying is that if update fails and another process gets there first with an insert your insert will fail. In that case you have a race condition anyway. Locking won't change the fact that the post-condition will be that one of the processes that tries writing will get the value.
-
Jean Vincent about 12 years@CashCow, the last wins, this is what INSERT or UPDATE is supposed to do: the first one inserts, the second updates the record. Adding a lock allow this to happen in a very short time-frame, preventing an error.
-
theycallmemorty about 11 years@Keith I disagree with your assertion that this is ugly code :)
-
moodboom about 11 years@theycallmemorty there are THREE separate fields lists to maintain, this is an unnecessary maintenance nightmare, and that makes it ugly. :-)
-
Keith about 11 years@moodboom - not just that; both
UPSERT
andUPDATE
have fields in pairsSet [field] = @value
.INSERT
has a list of fields and a list of values, fine for a couple of fields, but a right pain once you have lots of fields. -
moodboom about 11 years@Keith yep, MERGE is just gross when you just want an upsert, which is so simple and common. Not sure what ANSI and MSSQL designers were thinking. Maybe in another three or four years.. sigh...
-
Triynko over 10 yearsThis answer really needs updated to account for the comment by Seph about it not being thread-safe without a HOLDLOCK. According to the linked post, MERGE implicitly takes out an update lock, but releases it before inserting rows, which can cause a race condition and primary key violations on insert. By using HOLDLOCK, the locks are kept until after the insert occurs.
-
Adir D over 10 yearsAlso check out mssqltips.com/sqlservertip/3074/…
-
Adir D over 10 yearsThere are other things to worry about with MERGE: mssqltips.com/sqlservertip/3074/…
-
Adir D over 10 yearsIf you just do an update then insert without any locking or elevated isolation, then two users could try to pass the same data back (I wouldn't consider it a bug in the middle tier if two users tried to submit the exact same information at the same time - depends a lot on context, doesn't it?). They both enter the update, which returns 0 rows for both, then they both try to insert. One wins, the other gets an exception. This is what people are usually trying to avoid.
-
Adir D over 10 yearsThis may have been relevant in ancient versions of SQL Server, but modern versions have statement-level compilation. Forks etc. are not an issue, and using separate procedures for these things does not solve any of the issues inherent in making the choice between an update and an insert anyway...
-
Adir D over 10 years...and doing the insert first (knowing that it will fail sometimes) is expensive for SQL Server. sqlperformance.com/2012/08/t-sql-queries/error-handling
-
swasheck over 10 yearsthis is a comment. in the absence of any actual example code this is just like many other comments on the site.
-
Imran Qadir Baksh - Baloch about 10 yearsWhat about inserting/updating FROM a tem table which insert/update many records?
-
Matt McCabe over 8 yearsVery old, but an example would be nice.
-
jk7 about 8 yearsInteresting idea, but incorrect syntax. The SELECT needs a FROM <table_source>, and a TOP 1 (unless the chosen table_source has only 1 row).
-
Kristen about 8 yearsThanks. I've changed it to a NOT EXISTS. There will only ever be one matching row because of the test for "key" as per O/P (although that may need to be a multi-part key :) )
-
Christian almost 8 yearsAlso see this the answer from @zvolkov of this question stackoverflow.com/questions/1488355/…
-
Adir D almost 8 years@user960567 Well,
UPDATE target SET col = tmp.col FROM target INNER JOIN #tmp ON <key clause>; INSERT target(...) SELECT ... FROM #tmp AS t WHERE NOT EXISTS (SELECT 1 FROM target WHERE key = t.key);
-
Imran Qadir Baksh - Baloch almost 8 yearsnice replied after more than 2 years :)
-
Adir D almost 8 years@user960567 Sorry, I don't always catch comment notifications in real time.
-
palswim over 7 yearsAn editor did add the HOLDLOCK directive, but the OP rolled back the edits.
-
Keith over 7 years@palswim they did, but they also reformatted it and added other stuff. I tend to rollback such changes on principle - I'd missed the one worthwhile change in there, sorry. If they'd just added the missing
with
directive I'd have let it stay, and I've done that now. -
niico about 7 yearsit also creates a race condition no?
-
Admin over 6 yearsI always thought use locking hints are bad, and we should let Microsoft Internal engine dictate locks. Is this the apparent exception to the rule?
-
Zeek2 almost 5 yearsI am guessing lack of locking?
-
Victor Sanchez almost 5 yearsNo lack locking... I use "TRAN". Default sql-server transactions have locking.
-
likejudo about 4 years@RamenChef I don't understand. Where are the WHEN MATCHED clauses?
-
RamenChef about 4 years@likejudo I didn't write this; I only revised it. Ask the user that wrote the post.
-
iokevins over 3 years@AaronBertrand Thank you for this. Re: 2020-09-23 edit, your preferred approach above wraps the transaction with "ISOLATION LEVEL SERIALIZABLE", with no UPDLOCK table hint. The linked article dated 2020-09-02 uses two table hints "UPDATE dbo.t WITH (UPDLOCK, SERIALIZABLE)" and no "ISOLATION LEVEL SERIALIZABLE" (which makes sense). Sorry if dumb question, but are these equivalent, or is one approach better than the other? I'm guessing your more recent article represents your current preferred approach. Thank you.
-
Adir D over 3 years@iokevins No difference that I can think of. I’m actually torn in terms of preference, while I prefer having the hint at the query level, I prefer the opposite when we’re talking about, say, applying NOLOCK hints to every table in the query (in that case I much prefer a single SET statement to fix later).
-
Adir D over 2 years@EricZBeard It's not about performance (though it's not always a seek that you're performing redundantly, depending on what you're checking to indicate a duplicate). The real problem is the opportunity the additional operation opens up for race conditions and deadlocks (I explain why in this post).