SQL JOIN To Find Records That Don't Have a Matching Record With a Specific Value
Solution 1
Create an index on sa_signatures (type, order_id)
.
This is not necessary to convert the query into a LEFT JOIN
unless sa_signatures
allows nulls in order_id
. With the index, the NOT IN
will perform as well. However, just in case you're curious:
SELECT o.*
FROM sa_order o
LEFT JOIN
sa_signatures s
ON s.order_id = o.order_id
AND s.type = 'administrative director'
WHERE s.type IS NULL
You should pick a NOT NULL
column from sa_signatures
for the WHERE
clause to perform well.
Solution 2
You could replace the [NOT] IN operator with EXISTS for faster performance.
So you'll have:
SELECT * FROM SA_ORDER WHERE NOT EXISTS
(SELECT ORDER_ID FROM SA_SIGNATURES
WHERE TYPE = 'administrative director'
AND ORDER_ID = SA_ORDER.ORDER_ID);
Reason : "When using “NOT IN”, the query performs nested full table scans, whereas for “NOT EXISTS”, query can use an index within the sub-query."
Source : http://decipherinfosys.wordpress.com/2007/01/21/32/
Brian Reich
Updated on June 27, 2022Comments
-
Brian Reich almost 2 years
I'm trying to speed up some code that I wrote years ago for my employer's purchase authorization app. Basically I have a SLOW subquery that I'd like to replace with a JOIN (if it's faster).
When the director logs into the application he sees a list of purchase requests he has yet to authorize or deny. That list is generated with the following query:
SELECT * FROM SA_ORDER WHERE ORDER_ID NOT IN (SELECT ORDER_ID FROM SA_SIGNATURES WHERE TYPE = 'administrative director');
There are only about 900 records in sa_order and 1800 records in sa_signature and this query still takes about 5 seconds to execute. I've tried using a LEFT JOIN to retrieve records I need, but I've only been able to get sa_order records with NO matching records in sa_signature, and I need sa_order records with "no matching records with a type of 'administrative director'". Your help is greatly appreciated!
The schema for the two tables is as follows:
The tables involved have the following layout:
CREATE TABLE sa_order ( `order_id` BIGINT PRIMARY KEY AUTO_INCREMENT, `order_number` BIGINT NOT NULL, `submit_date` DATE NOT NULL, `vendor_id` BIGINT NOT NULL, `DENIED` BOOLEAN NOT NULL DEFAULT FALSE, `MEMO` MEDIUMTEXT, `year_id` BIGINT NOT NULL, `advisor` VARCHAR(255) NOT NULL, `deleted` BOOLEAN NOT NULL DEFAULT FALSE ); CREATE TABLE sa_signature ( `signature_id` BIGINT PRIMARY KEY AUTO_INCREMENT, `order_id` BIGINT NOT NULL, `signature` VARCHAR(255) NOT NULL, `proxy` BOOLEAN NOT NULL DEFAULT FALSE, `timestamp` TIMESTAMP NOT NULL DEFAULT NOW(), `username` VARCHAR(255) NOT NULL, `type` VARCHAR(255) NOT NULL );
-
Brian Reich over 13 yearsExcellent! After creating those indexes and running an optimize on my table the query execution time went from 5 seconds to .04. Thanks!
-
Quassnoi over 13 yearsthis is only the case if
ORDER_ID
allows nulls. The link you provided isOracle
specific, theMySQL
's optimizer differs much from theOracle
's one.FULLSCAN
here is only becauseOracle
won't indexNULL
columns, unlikeMySQL
.