Why would you NOT set IGNORE_DUP_KEY to ON?

Solution 1

Whenever there is a deviation from the "normal" in the database, you probably want to know about it.

You made the key unique because some constraint arising out of a business need dictated it. The database is just keeping up its side of the deal, saying: "Hey, you wanted this to be unique, but now you are saying something contrary. Make up your mind."

If the duplicates are intentional, you can ask the database to keep quiet by using IGNORE_DUP_KEY :)
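A minimal sketch of the difference (table and index names here are made up for illustration):

```sql
-- Hypothetical table with a uniqueness requirement.
CREATE TABLE dbo.Account (AccountNo int NOT NULL);

-- With IGNORE_DUP_KEY = ON, duplicates are dropped with only the
-- warning "Duplicate key was ignored." instead of error 2601/2627.
CREATE UNIQUE INDEX UX_Account_No
    ON dbo.Account (AccountNo)
    WITH (IGNORE_DUP_KEY = ON);

INSERT INTO dbo.Account (AccountNo) VALUES (1), (1), (2);
-- Rows 1 and 2 are inserted; the second 1 is silently skipped.
-- With the option OFF (the default), the same INSERT raises an error
-- and no rows from the statement are inserted.
```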

Solution 2

I guess it might be because the defaults are set to prevent invalid transactions from failing silently. Everything considered, I'd prefer to choose when to ignore unintended consequences: by default, tell me about them unless I say otherwise.

Example: If I'm depositing my paycheck, I'd like someone to notice if my employer accidentally issued duplicate check numbers.

Solution 3

> chances are the duplicates are in there by mistake anyway.

I bet they are! They are bugs, and you certainly want to know about them. Turning on IGNORE_DUP_KEY by default is...

  1. hiding bugs...
  2. ...by corrupting data. (Of course the database stays physically consistent, but the data is still wrong from a business logic standpoint.)

This is a terrible choice by any standard.

Turn it on under special circumstances and then get rid of it as fast as you can so you don't accidentally hide bugs.

Solution 4

It can be used as a sanity check. If you know there should be no conflicts, leave it off and it will fail fast on bugs. OTOH, for ad-hoc console sessions, I see your point.

Solution 5

I have a many-to-many relation: a product-to-category table with a unique index and no data other than prodid and katid.

So I'm setting IGNORE_DUP_KEY on the unique (prodid,katid) index.

So I can safely say "add product (1,2,3) to category (a,b,c)" without having to check if some products are in some categories already; I only care about the end result.
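A sketch of that setup, using the column names from the answer (table and index names are assumptions):

```sql
-- Link table for the many-to-many relation; only the two key columns.
CREATE TABLE dbo.ProductCategory (
    prodid int NOT NULL,
    katid  int NOT NULL
);

CREATE UNIQUE INDEX UX_ProductCategory
    ON dbo.ProductCategory (prodid, katid)
    WITH (IGNORE_DUP_KEY = ON);

-- "Add products (1,2,3) to categories (10,20,30)" in one statement:
-- pairs that already exist are skipped with a warning, not an error,
-- so no pre-check against existing rows is needed.
INSERT INTO dbo.ProductCategory (prodid, katid)
SELECT p.prodid, k.katid
FROM (VALUES (1), (2), (3)) AS p(prodid)
CROSS JOIN (VALUES (10), (20), (30)) AS k(katid);
```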

Wayne Molina

Updated on June 15, 2022

Comments

  • Wayne Molina
    Wayne Molina almost 2 years

IGNORE_DUP_KEY = ON basically tells SQL Server to insert the non-duplicate rows but silently ignore any duplicates; the default behavior is to raise an error and abort the statement when there are duplicates in a column that doesn't allow them.

I've worked with a ton of data that normally has at least one duplicate when there shouldn't be, so I like to use UNIQUE constraints when I know a value shouldn't have dups. However, when I try to bulk load data, the last thing I want is for it to get 90% done and then hit a duplicate and error the whole thing out. (Yes, I know the obvious solution is to make sure there are no duplicates, but sometimes I'm just handed a spreadsheet full of data and told to load it ASAP.)

So, what is the reason for having the default be OFF, and why wouldn't you want it on all the time, so that any non-dup entries succeed and you don't have to worry about duplicates? Chances are the duplicates are in there by mistake anyway.

    Is it related to performance, or something else? This seems like a great idea, but there's got to be some reason why it's not the default behavior.

    Mainly, is there a good reason not to use this that I should be aware of, or should it be up for evaluating on a case-by-case basis?
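    The bulk-load scenario the question describes might be sketched like this (table names are hypothetical):

    ```sql
    -- Staging load that should survive the occasional duplicate.
    CREATE TABLE dbo.ImportedRows (
        RowKey  int NOT NULL,
        Payload nvarchar(200) NULL
    );

    CREATE UNIQUE INDEX UX_ImportedRows_Key
        ON dbo.ImportedRows (RowKey)
        WITH (IGNORE_DUP_KEY = ON);

    -- The load no longer aborts on a duplicate RowKey: SQL Server
    -- inserts the non-duplicates and reports "Duplicate key was ignored."
    INSERT INTO dbo.ImportedRows (RowKey, Payload)
    SELECT RowKey, Payload
    FROM dbo.StagingRows;  -- hypothetical source table
    ```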

  • Wayne Molina
    Wayne Molina over 15 years
    That's true, as well. I guess the best reason would be that it errs on the side of caution and raises an error unless you tell it otherwise.
  • Francois Bourgeois
    Francois Bourgeois over 11 years
But both the documentation and the question state that you cannot corrupt your data, because even with IGNORE_DUP_KEY = ON duplicates are not allowed.
  • usr
    usr over 11 years
    @FrancoisBourgeois not sure what you are saying. Duplicates are only possible with a non-unique index, of course. They are logical application bugs, not bugs in SQL Server.
  • LJ Wilson
    LJ Wilson over 10 years
One comment: setting the option ON isn't without consequences. If you have an identity column, you will see gaps in the identity values for each insert that was ignored because of a duplicate.
  • user3071296
    user3071296 almost 10 years
    It is not corrupting data because it doesn't insert the duplicate value, it merely skips over it without eliciting a warning.
  • usr
    usr almost 10 years
@user3071296 The database is physically consistent, but the data inserted is silently wrong. This is data corruption from the application's point of view. When the app destroys data, it is data corruption.
  • user3071296
    user3071296 almost 10 years
    So if you do an "insert into ... select ... from ..." to a table that has ignore duplicate keys on, what data is silently wrong and corrupted?
  • usr
    usr almost 10 years
    When you accidentally insert duplicate keys due to a bug you want a crash to find the bug. Who knows what the bug is doing. Might affect data integrity besides duplication. I'll draw an analogy to JavaScript: When you have a type error in your program this indicates a logic error. You don't want the runtime to fix it up. For example by automatically converting between strings and ints (and just about everything else). You want a crash to have your logic error exposed.
  • user3071296
    user3071296 almost 10 years
    You are looking at specific examples and drawing sweeping conclusions. As stated above, ignore duplicate keys is just fine in many different scenarios (e.g. select from insert into). In other scenarios, you are correct, there are times when you want to know if something you explicitly forbid happened. However, inherently speaking, there is nothing wrong with having ignore duplicate keys on. As long as you realize what that means. Anyway, I understand what you're saying, and there is validity to it. It just doesn't apply to ALL situations, which you seem to be implying.
  • usr
    usr almost 10 years
    @user3071296 ok I think we have an understanding now. I agree with your points. I'll let the answer stand as is because this is how I feel about the issue.
  • eremmel
    eremmel over 7 years
Having this option on for non-clustered indexes imposes a performance penalty (see *Maintaining Unique Indexes with IGNORE_DUP_KEY*) and can result in severe range locking with concurrent insert batches (see *Range lock (RS-U) due to IGNORE_DUP_KEY index option*). So when you want to insert many rows in one shot and ignore duplicates, apply it only on the clustered key.
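    A sketch of that recommendation, with hypothetical names (the penalty described in the linked articles applies to nonclustered indexes, so the option goes on the clustered primary key instead):

    ```sql
    -- Put IGNORE_DUP_KEY on the clustered key, not a nonclustered index.
    CREATE TABLE dbo.Batch (
        BatchKey int NOT NULL,
        Detail   nvarchar(100) NULL,
        CONSTRAINT PK_Batch PRIMARY KEY CLUSTERED (BatchKey)
            WITH (IGNORE_DUP_KEY = ON)
    );
    -- Duplicate BatchKey values are skipped at the clustered index during
    -- bulk inserts, without the nonclustered-index maintenance overhead.
    ```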
  • Pam Lahoud
    Pam Lahoud over 6 years
    @eremmel You just saved my bacon, thank you for that comment! I've been banging my head against a wall for the past few days trying to figure out why I was getting Range locks without serializable isolation when I got this little tickle in my brain about ignore_dup_key causing perf issues. Quick search led me to this post, you rock! I only wish this was a full answer so it was more obvious :)