Nested Query or Joins

18,162

Solution 1

I have heard joins should be preferred over nested queries. Is it true in general?

It depends on the requirements, and the data.

Using a JOIN risks duplicating the information in the resultset for the parent table if there are more than one child records related to it, because a JOIN returns the rows that match. Which means if you want unique values from the parent table while using JOINs, you need to look at using either DISTINCT or a GROUP BY clause. But none of this is a concern if a subquery is used.

Also, subqueries are not all the same. There's the straight evaluation, like your example:

where emp.id = (select s.id from sap s where s.id = 111)

...and the IN clause:

where emp.id IN (select s.id from sap s where s.id = 111)

...which will match any of the value(s) returned by the subquery when the straight evaluation will throw an error if s.id returns more than one value. But there's also the EXISTS clause...

WHERE EXISTS(SELECT NULL 
               FROM SAP s
              WHERE emp.id = s.id
                AND s.id = 111)

The EXISTS is different in that:

  • the SELECT clause doesn't get evaluated - you can change it to SELECT 1/0, which should trigger a divide-by-zero error but won't
  • it returns true/false; true based on the first instance the criteria is satisfied so it's faster when dealing with duplicates.
  • unlike the IN clause, EXISTS supports comparing two or more column comparisons at the same time, but some databases do support tuple comparison with the IN.
  • it's more readable

Solution 2

If the queries are logically equivalent, then the query optimizer should be able to make the same (best) execution plan from each one. In that case, query style should support what can be understood the best (that's subqueries for me).

Share:
18,162
ajay
Author by

ajay

Updated on June 04, 2022

Comments

  • ajay
    ajay about 2 years

    I have heard joins should be preferred over nested queries. Is it true in general? Or there might be scenarios where one would be faster than other:

    for e.g. which is more efficient way to write a query?:

    Select emp.salary 
    from employee emp    
    where emp.id = (select s.id from sap s where s.id = 111)
    

    OR

    Select emp.salary     
    from employee emp   
    INNER JOIN sap s ON emp.id = s.id   
    WHERE s.id = 111
    
    • Tim
      Tim over 13 years
      And how about if the where condition is made part of the join condition? INNER JOIN sap s on emp.id = s.id and s.id = 111
  • Kirk Woll
    Kirk Woll over 13 years
    But why? What's the difference in execution plan?
  • Explosion Pills
    Explosion Pills over 13 years
    No difference because MySQL will optimize it to probably use the JOIN and do the same thing, but you will see it running a PRIMARY query and a SUBQUERY. Not sure whether this impacts on a speed. Another thing is you are semantically joining the two tables, so why not write this?
  • Kirk Woll
    Kirk Woll over 13 years
    @tandu, I think for a great many queries, subqueries offer more clarity. If you only want one piece of data from the other table, it is far more clear for it to be all in one place (with a subquery) than to split it up into the select, from/join, and potentially the where clause of the primary query.
  • Explosion Pills
    Explosion Pills over 13 years
    Totally disagree. When you JOIN you are asking for the relational product of the two tables which is the rows upon which their index (indeces?) match. With a subquery you are saying, eh, give me all of the rows in table 1 that match this piece of data (an ID? probably one). Now, compare them to all of the values on this other (primary) table where the keys match.
  • Kirk Woll
    Kirk Woll over 13 years
    Wow. You are a rare bird if you think that subqueries are unilaterally less readable than a join.
  • Daniel Vassallo
    Daniel Vassallo over 13 years
    Do you think there are situations where the IN clause is better than the EXISTS?
  • OMG Ponies
    OMG Ponies over 13 years
    @Daniel Vassallo: Not at this moment. I wonder the EXISTS approach also circumvents the issue with a high number of values to compare against that IN has on some databases (in the thousands).