Nested Query or Joins
Solution 1
I have heard joins should be preferred over nested queries. Is it true in general?
It depends on the requirements, and the data.
Using a JOIN risks duplicating the information in the resultset for the parent table if there are more than one child records related to it, because a JOIN returns the rows that match. Which means if you want unique values from the parent table while using JOINs, you need to look at using either DISTINCT
or a GROUP BY
clause. But none of this is a concern if a subquery is used.
Also, subqueries are not all the same. There's the straight evaluation, like your example:
where emp.id = (select s.id from sap s where s.id = 111)
...and the IN clause:
where emp.id IN (select s.id from sap s where s.id = 111)
...which will match any of the value(s) returned by the subquery when the straight evaluation will throw an error if s.id
returns more than one value. But there's also the EXISTS
clause...
WHERE EXISTS(SELECT NULL
FROM SAP s
WHERE emp.id = s.id
AND s.id = 111)
The EXISTS is different in that:
- the SELECT clause doesn't get evaluated - you can change it to
SELECT 1/0
, which should trigger a divide-by-zero error but won't - it returns true/false; true based on the first instance the criteria is satisfied so it's faster when dealing with duplicates.
- unlike the IN clause, EXISTS supports comparing two or more column comparisons at the same time, but some databases do support tuple comparison with the IN.
- it's more readable
Solution 2
If the queries are logically equivalent, then the query optimizer should be able to make the same (best) execution plan from each one. In that case, query style should support what can be understood the best (that's subqueries for me).
ajay
Updated on June 04, 2022Comments
-
ajay about 2 years
I have heard joins should be preferred over nested queries. Is it true in general? Or there might be scenarios where one would be faster than other:
for e.g. which is more efficient way to write a query?:
Select emp.salary from employee emp where emp.id = (select s.id from sap s where s.id = 111)
OR
Select emp.salary from employee emp INNER JOIN sap s ON emp.id = s.id WHERE s.id = 111
-
Tim over 13 yearsAnd how about if the where condition is made part of the join condition? INNER JOIN sap s on emp.id = s.id and s.id = 111
-
-
Kirk Woll over 13 yearsBut why? What's the difference in execution plan?
-
Explosion Pills over 13 yearsNo difference because MySQL will optimize it to probably use the JOIN and do the same thing, but you will see it running a PRIMARY query and a SUBQUERY. Not sure whether this impacts on a speed. Another thing is you are semantically joining the two tables, so why not write this?
-
Kirk Woll over 13 years@tandu, I think for a great many queries, subqueries offer more clarity. If you only want one piece of data from the other table, it is far more clear for it to be all in one place (with a subquery) than to split it up into the select, from/join, and potentially the where clause of the primary query.
-
Explosion Pills over 13 yearsTotally disagree. When you JOIN you are asking for the relational product of the two tables which is the rows upon which their index (indeces?) match. With a subquery you are saying, eh, give me all of the rows in table 1 that match this piece of data (an ID? probably one). Now, compare them to all of the values on this other (primary) table where the keys match.
-
Kirk Woll over 13 yearsWow. You are a rare bird if you think that subqueries are unilaterally less readable than a join.
-
Daniel Vassallo over 13 yearsDo you think there are situations where the
IN
clause is better than theEXISTS
? -
OMG Ponies over 13 years@Daniel Vassallo: Not at this moment. I wonder the EXISTS approach also circumvents the issue with a high number of values to compare against that IN has on some databases (in the thousands).