When to use EXCEPT as opposed to NOT EXISTS in Transact SQL?

sql sql-server sql-server-2005 performance tsql

71,032

Solution 1

EXCEPT treats NULL values as matching.

This query:

WITH    q (value) AS
        (
        SELECT  NULL
        UNION ALL
        SELECT  1
        ),
        p (value) AS
        (
        SELECT  NULL
        UNION ALL
        SELECT  2
        )
SELECT  *
FROM    q
WHERE   value NOT IN
        (
        SELECT  value
        FROM    p
        )

will return an empty rowset.

This query:

WITH    q (value) AS
        (
        SELECT  NULL
        UNION ALL
        SELECT  1
        ),
        p (value) AS
        (
        SELECT  NULL
        UNION ALL
        SELECT  2
        )
SELECT  *
FROM    q
WHERE   NOT EXISTS
        (
        SELECT  NULL
        FROM    p
        WHERE   p.value = q.value
        )

will return

NULL
1

, and this one:

WITH    q (value) AS
        (
        SELECT  NULL
        UNION ALL
        SELECT  1
        ),
        p (value) AS
        (
        SELECT  NULL
        UNION ALL
        SELECT  2
        )
SELECT  *
FROM    q
EXCEPT
SELECT  *
FROM    p

will return:

Recursive reference is also allowed in EXCEPT clause in a recursive CTE, though it behaves in a strange way: it returns everything except the last row of a previous set, not everything except the whole previous set:

WITH    q (value) AS
        (
        SELECT  1
        UNION ALL
        SELECT  2
        UNION ALL
        SELECT  3
        ),
        rec (value) AS
        (
        SELECT  value
        FROM    q
        UNION ALL
        SELECT  *
        FROM    (
                SELECT  value
                FROM    q
                EXCEPT
                SELECT  value
                FROM    rec
                ) q2
        )
SELECT  TOP 10 *
FROM    rec

---
1
2
3
-- original set
1
2
-- everything except the last row of the previous set, that is 3
1
3
-- everything except the last row of the previous set, that is 2
1
2
-- everything except the last row of the previous set, that is 3, etc.
1

SQL Server developers must just have forgotten to forbid it.

Solution 2

I have done a lot of analysis of except, not exists, not in and left outer join. Generally the left outer join is the fastest for finding missing rows, especially joining on a primary key. Not In can be very fast if you know it will be a small list returned in the select.

I use EXCEPT a lot to compare what is being returned when rewriting code. Run the old code saving results. Run new code saving results and then use except to capture all differences. It is a very quick and easy way to find differences, especially when needing to get all differences including null. Very good for on the fly easy coding.

But, every situation is different. I say to every developer I have ever mentored. Try it. Do timings all different ways. Try it, time it, do it.

Solution 3

EXCEPT compares all (paired)columns of two full-selects. NOT EXISTS compares two or more tables accoding to the conditions specified in WHERE clause in the sub-query following NOT EXISTS keyword.

EXCEPT can be rewritten by using NOT EXISTS. (EXCEPT ALL can be rewritten by using ROW_NUMBER and NOT EXISTS.)

Got this from here

Solution 4

If your query is fine tuned then there is no performance difference b/w using of EXCEPT clause and NOT EXIST/NOT IN.. first time when I ran EXCEPT after changing my correlated query into it.. I was surprised because it returned with the result just in 7 secs while correlated query was returning in 22 secs.. then I used distinct clause in my correlated query and reran.. it also returned in 7 secs.. so EXCEPT is good when you don't know or don't have time to fine tuned your query otherwise both are same performance wise..

Solution 5

There is no accounting for SQL server's execution plans. I have always found when having performance issues that it was utterly arbitrary (from a user's perspective, I'm sure the algorithm writers would understand why) when one syntax made a better execution plan rather than another.

In this case, something about the query parameter comparison allows SQL to figure out a shortcut that it couldn't from a straight select statement. I'm sure that is a deficiency in the algorithm. In other words, you could logically interpolate the same thing, but the algorithm doesn't make that translation on an exists query. Sometimes that is because an algorithm that could reliably figure it out would take longer to execute than the query itself, or at least the algorithm designer thought so.

View more solutions

71,032

Author by

Joe Pineda

Developing professionally since 11 years ago. VB6 and SQL Server, mostly. Use Linux, C++ and Perl at home.

Updated on July 09, 2022

Comments

Joe Pineda almost 2 years

I just recently learned of the existence of the new "EXCEPT" clause in SQL Server (a bit late, I know...) through reading code written by a co-worker. It truly amazed me!

But then I have some questions regarding its usage: when is it recommended to be employed? Is there a difference, performance-wise, between using it versus a correlated query employing "AND NOT EXISTS..."?

After reading EXCEPT's article in the BOL I thought it was just a shorthand for the second option, but was surprised when I rewrote a couple queries using it (so they had the "AND NOT EXISTS" syntax much more familiar to me) and then checked the execution plans - surprise! The EXCEPT version had a shorter execution plan, and executed faster, also. Is this always so?

So I'd like to know: what are the guidelines for using this powerful tool?
Magnus about 7 years

Also except returns a Set, hence distinct values.