In SQL / MySQL, what is the difference between "ON" and "WHERE" in a join statement?


Solution 1

WHERE is a part of the SELECT query as a whole, ON is a part of each individual join.

ON can only refer to columns of tables that have already been introduced in the FROM clause, up to and including the table being joined.

When a record in the left table has no actual match in the right table, LEFT JOIN still returns that left record once, with all fields of the right table set to NULL. The WHERE clause is then evaluated on the joined rows and filters them.

In your query, only the records from gifts without match in 'sentgifts' are returned.

Here's the example

gifts

1   Teddy bear
2   Flowers

sentgifts

1   Alice
1   Bob

---
SELECT  *
FROM    gifts g
LEFT JOIN
        sentgifts sg
ON      g.giftID = sg.giftID

---

1  Teddy bear   1     Alice
1  Teddy bear   1     Bob
2  Flowers      NULL  NULL    -- no match in sentgifts

---
SELECT  *
FROM    gifts g
LEFT JOIN
        sentgifts sg
ON      g.giftID = sg.giftID
WHERE   sg.giftID IS NULL

---

2  Flowers      NULL  NULL    -- no match in sentgifts

As you can see, only the absence of a match can leave a NULL in sg.giftID, so only the gifts that have never been sent are returned.
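The two queries above can be reproduced with a small script. This is a minimal sketch that uses Python's built-in sqlite3 module as a stand-in for MySQL; the column names `name` and `person` are assumptions, since the answer does not name the columns of the sample tables.

```python
import sqlite3

# SQLite stands in for MySQL; `name` and `person` are assumed column names.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE gifts (giftID INTEGER, name TEXT);
    CREATE TABLE sentgifts (giftID INTEGER, person TEXT);
    INSERT INTO gifts VALUES (1, 'Teddy bear'), (2, 'Flowers');
    INSERT INTO sentgifts VALUES (1, 'Alice'), (1, 'Bob');
""")

# Plain LEFT JOIN: every gift is kept; an unmatched gift gets NULL (None).
all_rows = conn.execute("""
    SELECT g.name, sg.person
    FROM gifts g
    LEFT JOIN sentgifts sg ON g.giftID = sg.giftID
    ORDER BY g.giftID, sg.person
""").fetchall()

# WHERE runs on the joined rows and keeps only the unmatched gifts.
never_sent = conn.execute("""
    SELECT g.name
    FROM gifts g
    LEFT JOIN sentgifts sg ON g.giftID = sg.giftID
    WHERE sg.giftID IS NULL
""").fetchall()

print(all_rows)    # [('Teddy bear', 'Alice'), ('Teddy bear', 'Bob'), ('Flowers', None)]
print(never_sent)  # [('Flowers',)]
```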

Solution 2

The ON clause defines the relationship between the tables.

The WHERE clause describes which rows you are interested in.

Often you can swap them and still get the same result. However, this is not always the case with a left outer join:

  • If the ON clause fails you still get a row with columns from the left table but with nulls in the columns from the right table.
  • If the WHERE clause fails you won't get that row at all.
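The two bullets can be illustrated by moving the same condition between ON and WHERE. This is only a sketch: it borrows the gifts/sentgifts sample data from Solution 1, runs on SQLite through Python's sqlite3 module rather than on MySQL, and assumes the column names `name` and `person`.

```python
import sqlite3

# SQLite stands in for MySQL; `name` and `person` are assumed column names.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE gifts (giftID INTEGER, name TEXT);
    CREATE TABLE sentgifts (giftID INTEGER, person TEXT);
    INSERT INTO gifts VALUES (1, 'Teddy bear'), (2, 'Flowers');
    INSERT INTO sentgifts VALUES (1, 'Alice'), (1, 'Bob');
""")

# Condition in ON: a gift whose ON test fails is still returned, with NULLs.
in_on = conn.execute("""
    SELECT g.name, sg.person
    FROM gifts g
    LEFT JOIN sentgifts sg
           ON g.giftID = sg.giftID AND sg.person = 'Alice'
    ORDER BY g.giftID
""").fetchall()

# Condition in WHERE: rows that fail the test are dropped entirely.
in_where = conn.execute("""
    SELECT g.name, sg.person
    FROM gifts g
    LEFT JOIN sentgifts sg ON g.giftID = sg.giftID
    WHERE sg.person = 'Alice'
    ORDER BY g.giftID
""").fetchall()

print(in_on)     # [('Teddy bear', 'Alice'), ('Flowers', None)]
print(in_where)  # [('Teddy bear', 'Alice')]
```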

Solution 3

When using INNER JOIN, ON and WHERE will have the same result. So,

select *
from Table1 t1
inner join Table2 t2 on t1.id = t2.id
where t1.Name = 'John'

will have the exact same output as

select *
from Table1 t1
inner join Table2 t2 on t1.id = t2.id
    and t1.Name = 'John'

As you have noted, this is not the case when using OUTER JOIN. What query plan gets built is dependent on the database platform as well as query specifics, and is subject to change, so making decisions on that basis alone is not going to give a guaranteed query plan.

As a rule of thumb, you should use columns that join your tables in ON clauses and columns that are used for filtering in WHERE clauses. This provides the best readability.
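The INNER JOIN equivalence, and the way it breaks for an outer join, can be checked with a short script. This sketch assumes made-up contents for Table1 and Table2 (including a hypothetical `note` column) and uses SQLite via Python's sqlite3 module in place of MySQL.

```python
import sqlite3

# Hypothetical sample data; SQLite stands in for MySQL.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Table1 (id INTEGER, Name TEXT);
    CREATE TABLE Table2 (id INTEGER, note TEXT);
    INSERT INTO Table1 VALUES (1, 'John'), (2, 'Mary');
    INSERT INTO Table2 VALUES (1, 'a'), (2, 'b');
""")

def run(join_type, placement):
    """Run the query with the Name filter placed in ON or in WHERE."""
    if placement == "on":
        tail = "ON t1.id = t2.id AND t1.Name = 'John'"
    else:
        tail = "ON t1.id = t2.id WHERE t1.Name = 'John'"
    sql = (
        "SELECT t1.id, t1.Name, t2.note FROM Table1 t1 "
        f"{join_type} JOIN Table2 t2 {tail} ORDER BY t1.id"
    )
    return conn.execute(sql).fetchall()

inner_on, inner_where = run("INNER", "on"), run("INNER", "where")
left_on, left_where = run("LEFT", "on"), run("LEFT", "where")

print(inner_on == inner_where)  # True: same rows either way
print(left_on)                  # [(1, 'John', 'a'), (2, 'Mary', None)]
print(left_where)               # [(1, 'John', 'a')]
```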

Solution 4

Although the results are the same, ON applies the condition as part of the join, so only the joined set is retrieved, whereas with WHERE the two row sets are logically combined first and the condition is applied afterwards. Whether this makes retrieval any faster or lighter depends on the optimizer, which for inner joins usually treats both forms the same way.

Solution 5

  • ON is applied to the set used for creating the permutations of each record as a part of the JOIN operation
  • WHERE specifies the filter applied after the JOIN operation

In effect, for an outer join, ON replaces the fields of each row that does not satisfy its condition with NULL. Given the example by @Quassnoi:

gifts

1   Teddy bear
2   Flowers

sentgifts

1   Alice
1   Bob

---
SELECT  *
FROM    gifts g
LEFT JOIN
        sentgifts sg
ON      g.giftID = sg.giftID

---

If there were no ON condition, the LEFT JOIN permutations would be calculated for the following collections:

{ 'Teddy bear': {'Alice', 'Bob'}, 'Flowers': {'Alice', 'Bob'} }

With the ON condition g.giftID = sg.giftID, these are the collections that are used for creating the permutations:

{ 'Teddy bear': {'Alice', 'Bob'}, 'Flowers': {NULL, NULL} }

which in effect is:

{ 'Teddy bear': {'Alice', 'Bob'}, 'Flowers': {NULL} }

and so results in the LEFT JOIN of:

Teddy bear Alice
Teddy bear Bob
Flowers    NULL

and for a FULL OUTER JOIN you would have

{ 'Teddy bear': {'Alice', 'Bob'}, 'Flowers': {NULL} } for the LEFT JOIN part and { 'Alice': {'Teddy bear', NULL}, 'Bob': {'Teddy bear', NULL} } for the RIGHT JOIN part, which combine to:

Teddy bear Alice
Teddy bear Bob
Flowers    NULL

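If the ON clause also had a condition that gift 1 fails, such as ON g.giftID = sg.giftID AND g.giftID <> 1, no pair of rows would satisfy it, and the collections would be

{ 'Teddy bear': {NULL}, 'Flowers': {NULL} }

which for LEFT JOIN would result in

Teddy bear NULL
Flowers    NULL

because a LEFT JOIN keeps every row of the left table even when the ON condition fails. For a FULL OUTER JOIN the collections would be { 'Teddy bear': {NULL}, 'Flowers': {NULL} } for the LEFT JOIN part and { 'Alice': {NULL}, 'Bob': {NULL} } for the RIGHT JOIN part, resulting in

Teddy bear NULL
Flowers    NULL
NULL       Alice
NULL       Bob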

Note that MySQL does not have a FULL OUTER JOIN, so you need to apply UNION to a LEFT JOIN and a RIGHT JOIN.
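The UNION workaround can be sketched as follows. The example runs on SQLite through Python's sqlite3 module, not on MySQL, and writes the RIGHT JOIN half as a LEFT JOIN with the tables swapped, because older SQLite versions lack RIGHT JOIN; in MySQL the second SELECT could use RIGHT JOIN directly. The extra row (3, 'Carol') and the column names `name` and `person` are assumptions added so that an unmatched right-hand row exists.

```python
import sqlite3

# SQLite stands in for MySQL; column names and the 'Carol' row are assumptions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE gifts (giftID INTEGER, name TEXT);
    CREATE TABLE sentgifts (giftID INTEGER, person TEXT);
    INSERT INTO gifts VALUES (1, 'Teddy bear'), (2, 'Flowers');
    INSERT INTO sentgifts VALUES (1, 'Alice'), (1, 'Bob'), (3, 'Carol');
""")

# First half keeps all gifts, second half keeps all sentgifts rows.
# UNION (not UNION ALL) removes the matched rows that both halves return.
full_outer = conn.execute("""
    SELECT g.name, sg.person
    FROM gifts g
    LEFT JOIN sentgifts sg ON g.giftID = sg.giftID
    UNION
    SELECT g.name, sg.person
    FROM sentgifts sg
    LEFT JOIN gifts g ON g.giftID = sg.giftID
""").fetchall()

# Row order of a UNION without ORDER BY is unspecified, so print each row.
for row in full_outer:
    print(row)
```

Because UNION removes duplicates, two genuinely identical rows in the result would also collapse into one, which may or may not matter for a given table.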

Author: nonopolarity


Updated on July 08, 2022

Comments

  • nonopolarity

    The following statements give the same result (one is using on, and the other using where):

    mysql> select * from gifts INNER JOIN sentGifts ON gifts.giftID = sentGifts.giftID;
    mysql> select * from gifts INNER JOIN sentGifts WHERE gifts.giftID = sentGifts.giftID;
    

    I can only see in a case of a Left Outer Join finding the "unmatched" cases:
    (to find out the gifts that were never sent by anybody)

    mysql> select name from gifts LEFT OUTER JOIN sentgifts 
               ON gifts.giftID = sentgifts.giftID 
               WHERE sentgifts.giftID IS NULL;
    

    In this case, ON is used first, and then WHERE. Does ON do the matching first, and does WHERE then do the "secondary" filtering? Or is there a more general rule for using ON versus WHERE? Thanks.