SQL JOIN To Find Records That Don't Have a Matching Record With a Specific Value

10,384

Solution 1

Create an index on sa_signatures (type, order_id).

This is not necessary to convert the query into a LEFT JOIN unless sa_signatures allows nulls in order_id. With the index, the NOT IN will perform as well. However, just in case you're curious:

SELECT  o.*
FROM    sa_order o
LEFT JOIN
        sa_signatures s
ON      s.order_id = o.order_id
        AND s.type = 'administrative director'
WHERE   s.type IS NULL

You should pick a NOT NULL column from sa_signatures for the WHERE clause to perform well.

Solution 2

You could replace the [NOT] IN operator with EXISTS for faster performance.

So you'll have:

SELECT * FROM SA_ORDER WHERE NOT EXISTS
    (SELECT ORDER_ID FROM SA_SIGNATURES
     WHERE TYPE = 'administrative director'
       AND ORDER_ID = SA_ORDER.ORDER_ID);

Reason : "When using “NOT IN”, the query performs nested full table scans, whereas for “NOT EXISTS”, query can use an index within the sub-query."

Source : http://decipherinfosys.wordpress.com/2007/01/21/32/

Share:
10,384
Brian Reich
Author by

Brian Reich

Updated on June 27, 2022

Comments

  • Brian Reich
    Brian Reich almost 2 years

    I'm trying to speed up some code that I wrote years ago for my employer's purchase authorization app. Basically I have a SLOW subquery that I'd like to replace with a JOIN (if it's faster).

    When the director logs into the application he sees a list of purchase requests he has yet to authorize or deny. That list is generated with the following query:

    SELECT * FROM SA_ORDER WHERE ORDER_ID NOT IN
        (SELECT ORDER_ID FROM SA_SIGNATURES WHERE TYPE = 'administrative director');
    

    There are only about 900 records in sa_order and 1800 records in sa_signature and this query still takes about 5 seconds to execute. I've tried using a LEFT JOIN to retrieve records I need, but I've only been able to get sa_order records with NO matching records in sa_signature, and I need sa_order records with "no matching records with a type of 'administrative director'". Your help is greatly appreciated!

    The schema for the two tables is as follows:

    The tables involved have the following layout:

    CREATE TABLE sa_order
    (
        `order_id`        BIGINT       PRIMARY KEY AUTO_INCREMENT,
        `order_number`    BIGINT       NOT NULL,
        `submit_date`     DATE         NOT NULL,
        `vendor_id`       BIGINT       NOT NULL,
        `DENIED`          BOOLEAN      NOT NULL DEFAULT FALSE,
        `MEMO`            MEDIUMTEXT,
        `year_id`         BIGINT       NOT NULL,
        `advisor`         VARCHAR(255) NOT NULL,
        `deleted`         BOOLEAN      NOT NULL DEFAULT FALSE
    );
    
    CREATE TABLE sa_signature
    (
        `signature_id`        BIGINT          PRIMARY KEY AUTO_INCREMENT,
        `order_id`            BIGINT          NOT NULL,
        `signature`           VARCHAR(255)    NOT NULL,
        `proxy`               BOOLEAN         NOT NULL DEFAULT FALSE,
        `timestamp`           TIMESTAMP       NOT NULL DEFAULT NOW(),
        `username`            VARCHAR(255)    NOT NULL,
        `type`                VARCHAR(255)    NOT NULL
    );
    
  • Brian Reich
    Brian Reich over 13 years
    Excellent! After creating those indexes and running an optimize on my table the query execution time went from 5 seconds to .04. Thanks!
  • Quassnoi
    Quassnoi over 13 years
    this is only the case if ORDER_ID allows nulls. The link you provided is Oracle specific, the MySQL's optimizer differs much from the Oracle's one. FULLSCAN here is only because Oracle won't index NULL columns, unlike MySQL.