How does Left Join / IS NULL eliminate records which are there in one table and not in the other?
Solution 1
This can be explained with the following example.
mysql> select * from table1 ;
+------+------+
| id | val |
+------+------+
| 1 | 10 |
| 2 | 30 |
| 3 | 40 |
+------+------+
3 rows in set (0.00 sec)
mysql> select * from table2 ;
+------+------+
| id | t1id |
+------+------+
| 1 | 1 |
| 2 | 2 |
+------+------+
2 rows in set (0.00 sec)
Here table1.id corresponds to table2.t1id.
When we do a left join on that key with table1 as the left table, we get every row from table1, and for any table1 row without a match in table2 the table2 columns are set to NULL.
mysql> select t1.* , t2.t1id from table1 t1
left join table2 t2 on t2.t1id = t1.id ;
+------+------+------+
| id | val | t1id |
+------+------+------+
| 1 | 10 | 1 |
| 2 | 30 | 2 |
| 3 | 40 | NULL |
+------+------+------+
3 rows in set (0.00 sec)
Notice that table1.id = 3 has no matching row in table2, so t1id is set to NULL. Applying the WHERE condition then filters the result down to just those rows.
mysql> select t1.* , t2.t1id from table1 t1
left join table2 t2 on t2.t1id = t1.id where t2.t1id is null;
+------+------+------+
| id | val | t1id |
+------+------+------+
| 3 | 40 | NULL |
+------+------+------+
1 row in set (0.00 sec)
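The two steps above can be reproduced outside the MySQL prompt. The following is a minimal sketch that assumes Python's built-in sqlite3 module as a stand-in for MySQL (the join behaves the same way here); the table names and data are copied from the example above, while the variable names are only illustrative.

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for the MySQL session above.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (id INTEGER, val INTEGER);
    CREATE TABLE table2 (id INTEGER, t1id INTEGER);
    INSERT INTO table1 (id, val) VALUES (1, 10), (2, 30), (3, 40);
    INSERT INTO table2 (id, t1id) VALUES (1, 1), (2, 2);
""")

# Step 1: the plain LEFT JOIN keeps every row of table1.
all_rows = conn.execute("""
    SELECT t1.id, t1.val, t2.t1id
    FROM table1 t1
    LEFT JOIN table2 t2 ON t2.t1id = t1.id
    ORDER BY t1.id
""").fetchall()

# Step 2: the WHERE clause keeps only the rows that were padded with NULL.
unmatched = conn.execute("""
    SELECT t1.id, t1.val, t2.t1id
    FROM table1 t1
    LEFT JOIN table2 t2 ON t2.t1id = t1.id
    WHERE t2.t1id IS NULL
""").fetchall()

print(all_rows)   # NULL shows up as None in Python
print(unmatched)
```

The row with id 3 is the only one whose t1id comes back as None, and it is the only one that survives the WHERE clause.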
Solution 2
Let's assume the l table is employees and the r table is computers. Some employees don't have computers. Some computers are not assigned to anyone yet.
1. Inner join:
SELECT l.*, r.* FROM employees l JOIN computers r ON r.id = l.comp_id
gives you the list of all employees who HAVE a computer, along with the info about the computer assigned to each of them. The employees without a computer will NOT appear on this list.
2. Left join:
SELECT l.*, r.* FROM employees l LEFT JOIN computers r ON r.id = l.comp_id
gives you the list of ALL employees. The employees with a computer will show the computer info. The employees without computers will appear with NULLs instead of computer info.
3. Finally:
SELECT l.*, r.* FROM employees l LEFT JOIN computers r ON r.id = l.comp_id WHERE r.id IS NULL
The left join with the WHERE clause starts with the same list as the left join (2), but then keeps only those employees that have no corresponding row in the computers table, that is, the employees without computers.
In this case, anything selected from the r table will be just NULLs, so you can leave those fields out and select only columns from the l table: SELECT l.* FROM ...
Try this sequence of selects and observe the output. Each next step builds on the previous one.
Please let me know if this explanation is understandable, or you'd like me to elaborate some more.
EDITED TO ADD: Here's sample code to create the two tables used above:
CREATE TABLE employees
( id INT NOT NULL PRIMARY KEY,
name VARCHAR(20),
comp_id INT);
INSERT INTO employees (id, name, comp_id) VALUES (1, 'Becky', 1);
INSERT INTO employees (id, name, comp_id) VALUES (2, 'Anne', 7);
INSERT INTO employees (id, name, comp_id) VALUES (3, 'John', 3);
INSERT INTO employees (id, name) VALUES (4, 'Bob');
CREATE TABLE computers
( id INT NOT NULL PRIMARY KEY,
os VARCHAR(20) );
INSERT INTO computers (id, os) VALUES (1,'Windows 7');
INSERT INTO computers (id, os) VALUES (2,'Windows XP');
INSERT INTO computers (id, os) VALUES (3,'Unix');
INSERT INTO computers (id, os) VALUES (4,'Windows 7');
There are 4 employees. Becky and John have computers. Anne and Bob do not have a computer. (Anne has comp_id 7, which doesn't correspond to any row in the computers table, so she doesn't really have a computer.)
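To try the sequence of selects without a MySQL server, here is a rough sketch that loads the same sample data into an in-memory SQLite database through Python's sqlite3 module. Using SQLite is an assumption made for convenience, and the helper names() is hypothetical; the queries select only l.name to keep the output short.

```python
import sqlite3

# Assumption: SQLite in memory stands in for MySQL; the data is the sample data above.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employees (id INT NOT NULL PRIMARY KEY, name VARCHAR(20), comp_id INT);
    INSERT INTO employees (id, name, comp_id) VALUES (1, 'Becky', 1);
    INSERT INTO employees (id, name, comp_id) VALUES (2, 'Anne', 7);
    INSERT INTO employees (id, name, comp_id) VALUES (3, 'John', 3);
    INSERT INTO employees (id, name) VALUES (4, 'Bob');
    CREATE TABLE computers (id INT NOT NULL PRIMARY KEY, os VARCHAR(20));
    INSERT INTO computers (id, os) VALUES (1, 'Windows 7');
    INSERT INTO computers (id, os) VALUES (2, 'Windows XP');
    INSERT INTO computers (id, os) VALUES (3, 'Unix');
    INSERT INTO computers (id, os) VALUES (4, 'Windows 7');
""")


def names(sql):
    """Hypothetical helper: run a query returning one name column, return sorted names."""
    return sorted(row[0] for row in conn.execute(sql))


# 1. Inner join: only employees with a matching computer.
inner = names("SELECT l.name FROM employees l JOIN computers r ON r.id = l.comp_id")

# 2. Left join: every employee, matched or not.
left = names("SELECT l.name FROM employees l LEFT JOIN computers r ON r.id = l.comp_id")

# 3. Left join + IS NULL: only employees without a matching computer.
no_computer = names(
    "SELECT l.name FROM employees l "
    "LEFT JOIN computers r ON r.id = l.comp_id "
    "WHERE r.id IS NULL"
)

print(inner)
print(left)
print(no_computer)
```

Anne and Bob end up in the last list for different reasons: Anne's comp_id points at a row that does not exist, and Bob's comp_id is NULL, so neither finds a match in computers.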
Solution 3
The LEFT JOIN returns all the rows from the left table, matched on
ON r.value = l.value
If there is no r.value for a given l.value, then the r columns of that row are NULL, so
r.value IS NULL
will be true for it.
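One detail worth checking is that the test has to be spelled IS NULL. A comparison written as = NULL never evaluates to true in SQL, so it would return nothing. The sketch below illustrates this under stated assumptions: SQLite (via Python's sqlite3) stands in for MySQL, and the contents of t_left and t_right are made up for the demonstration.

```python
import sqlite3

# Assumption: in-memory SQLite instead of MySQL; the rows below are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t_left (id INTEGER, value INTEGER);
    CREATE TABLE t_right (id INTEGER, value INTEGER);
    INSERT INTO t_left (id, value) VALUES (1, 100), (2, 200), (3, 300);
    INSERT INTO t_right (id, value) VALUES (1, 100), (2, 200);
""")

base = (
    "SELECT l.id, l.value FROM t_left l "
    "LEFT JOIN t_right r ON r.value = l.value "
)

# IS NULL is true for the padded row, so the unmatched left row is returned.
with_is_null = conn.execute(base + "WHERE r.value IS NULL").fetchall()

# "= NULL" evaluates to NULL (unknown) for every row, so nothing passes the filter.
with_equals_null = conn.execute(base + "WHERE r.value = NULL").fetchall()

print(with_is_null)
print(with_equals_null)
```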
Solution 4
This is based on the difference between INNER JOIN and LEFT JOIN. When you use an INNER JOIN, the result only contains rows that match between the two tables (based on the ON condition).
But when you use a LEFT JOIN, you also get rows in the result for all rows in the first table that don't have a match in the second table. In these cases, all the result columns from the second table are filled with NULL as a placeholder.
The WHERE r.value IS NULL test then matches these rows. So the final result only contains the rows with no match.
Solution 5
SELECT * FROM a LEFT JOIN b ON a.v = b.v ... WHERE b.id IS NULL
It's super simple logic:
- select from table "a"
- join table "b" (if matching rows exist for ON a.v = b.v, use them; otherwise use NULLs as the values)
- WHERE condition: keep the row only if the match is missing (b.id IS NULL).
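The three steps can also be spelled out as a plain nested loop. This is only an illustrative sketch of the logic, not how a database engine actually executes the query; the function left_join and the sample rows are hypothetical, and it assumes the v column itself never contains NULL (Python's None == None is true, unlike SQL's NULL = NULL).

```python
def left_join(a_rows, b_rows):
    """Pair each row of a with its matches in b, or with None when there are none."""
    joined = []
    for a in a_rows:
        # ON a.v = b.v (assumes v is never None, see the note above)
        matches = [b for b in b_rows if b["v"] == a["v"]]
        if matches:
            joined.extend((a, b) for b in matches)
        else:
            joined.append((a, None))  # None plays the role of the NULL placeholder
    return joined


# Hypothetical sample data for tables "a" and "b".
a_rows = [{"id": 1, "v": "x"}, {"id": 2, "v": "y"}, {"id": 3, "v": "z"}]
b_rows = [{"id": 10, "v": "x"}, {"id": 11, "v": "x"}, {"id": 12, "v": "y"}]

joined = left_join(a_rows, b_rows)

# WHERE b.id IS NULL: keep only the rows of a that found no partner in b.
only_in_a = [a for a, b in joined if b is None]

print(len(joined))
print(only_in_a)
```

Every row of a appears at least once in joined, which is the difference from an inner join; the filter then throws away everything except the padded rows.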
StrugglingCoder
Updated on July 13, 2022
Comments
-
StrugglingCoder almost 2 years
I am having a tough time understanding why LEFT JOIN / IS NULL eliminates records which are there in one table and not in the other. Here is an example:
SELECT l.id, l.value FROM t_left l LEFT JOIN t_right r ON r.value = l.value WHERE r.value IS NULL
Why should r.value = NULL eliminate records? I am not understanding. I know I am missing something very basic, but at present I can't figure out even that. I would appreciate it if someone could explain it to me in detail. I want a very basic explanation.
-
Strawberry about 9 years
Just run this: SELECT * FROM t_left l LEFT JOIN t_right r ON r.value = l.value. Now it should be obvious!
-
pala_ about 9 years
this doesn't explain why there are nulls in the first place
-
Peter about 9 years
@pala_ how so? you select from A
-
pala_ about 9 years
yes, but from the question it seems some of the problem is understanding that nulls come from rows that don't match, because it's a left join - as opposed to inner joining
-
paparazzo about 9 years
This statement is flat wrong "join table "b" (if matching rows exists - ON a.v=b.v)". You are describing an inner join.