Complex join with nested group-by/having clause?

Solution 1

How's this?

SELECT i.id,
       i.created_at
FROM   imports i
       INNER JOIN (SELECT   a.import_id
                   FROM     albums a
                            INNER JOIN songs s
                              ON a.id = s.album_id
                   GROUP BY a.id, a.import_id
                   HAVING   Count(* ) = 1) AS TEMP
         ON i.id = TEMP.import_id; 

In most database systems, the JOIN works a lot faster than doing a WHERE ... IN.
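One thing to watch when swapping IN for a JOIN: the two can return a different number of rows. Here is a minimal sketch using Python's sqlite3 module and made-up sample data (the rows, and the exact shape of the WHERE ... IN query, are assumptions since the original query is not shown here). An import with several single-song albums comes back once per album from the JOIN, but only once from IN.

```python
import sqlite3

# Hypothetical sample data; table and column names follow the query above.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE imports (id INTEGER PRIMARY KEY, created_at TEXT);
    CREATE TABLE albums  (id INTEGER PRIMARY KEY, import_id INTEGER);
    CREATE TABLE songs   (id INTEGER PRIMARY KEY, album_id INTEGER);
    INSERT INTO imports VALUES (1, '2009-01-01'), (2, '2009-01-02'), (3, '2009-01-03');
    -- import 1: two single-song albums; import 2: one two-song album;
    -- import 3: one single-song album
    INSERT INTO albums VALUES (10, 1), (11, 1), (20, 2), (30, 3);
    INSERT INTO songs VALUES (100, 10), (101, 11), (200, 20), (201, 20), (300, 30);
""")

single_song_albums = """
    SELECT   a.import_id
    FROM     albums a
             INNER JOIN songs s ON a.id = s.album_id
    GROUP BY a.id, a.import_id
    HAVING   COUNT(*) = 1
"""

join_sql = f"""
    SELECT {{distinct}} i.id
    FROM   imports i
           INNER JOIN ({single_song_albums}) AS temp
             ON i.id = temp.import_id
    ORDER BY i.id
"""
in_sql = f"""
    SELECT i.id
    FROM   imports i
    WHERE  i.id IN ({single_song_albums})
    ORDER BY i.id
"""

joined = [r[0] for r in conn.execute(join_sql.format(distinct=""))]
deduped = [r[0] for r in conn.execute(join_sql.format(distinct="DISTINCT"))]
with_in = [r[0] for r in conn.execute(in_sql)]

print(joined)   # [1, 1, 3]  import 1 appears once per qualifying album
print(deduped)  # [1, 3]
print(with_in)  # [1, 3]
```

If duplicates matter, adding DISTINCT (or a GROUP BY on the outer select list) to the JOIN version brings it back in line with the IN version.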

Solution 2

SELECT i.id, i.created_at, COUNT(s.album_id)
FROM imports AS i
    INNER JOIN albums AS a
        ON i.id = a.import_id
    INNER JOIN songs AS s
        ON a.id = s.album_id
GROUP BY i.id, i.created_at
HAVING COUNT(s.album_id) = 1

(You might not need to include the COUNT in the SELECT list itself. SQL Server doesn't require it, but it's possible that a different RDBMS might.)
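It is worth checking what this query counts. Because it groups by import only, COUNT sees every song under every album of that import, so it matches imports with exactly one song in total rather than imports that contain a single-song album. The sketch below (Python's sqlite3, hypothetical sample rows) shows the difference; the second query, which also groups by a.id, is my own assumed variant and not part of the answer above.

```python
import sqlite3

# Hypothetical sample data; table and column names follow the query above.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE imports (id INTEGER PRIMARY KEY, created_at TEXT);
    CREATE TABLE albums  (id INTEGER PRIMARY KEY, import_id INTEGER);
    CREATE TABLE songs   (id INTEGER PRIMARY KEY, album_id INTEGER);
    INSERT INTO imports VALUES (1, '2009-01-01'), (2, '2009-01-02'), (3, '2009-01-03');
    -- import 1: two single-song albums; import 2: one two-song album;
    -- import 3: one single-song album
    INSERT INTO albums VALUES (10, 1), (11, 1), (20, 2), (30, 3);
    INSERT INTO songs VALUES (100, 10), (101, 11), (200, 20), (201, 20), (300, 30);
""")

# The flat join as written: one group per import.
per_import_sql = """
    SELECT i.id
    FROM imports AS i
        INNER JOIN albums AS a ON i.id = a.import_id
        INNER JOIN songs AS s ON a.id = s.album_id
    GROUP BY i.id, i.created_at
    HAVING COUNT(s.album_id) = 1
    ORDER BY i.id
"""

# Assumed variant: one group per (import, album), then DISTINCT to drop
# the repeats for imports with several qualifying albums.
per_album_sql = """
    SELECT DISTINCT i.id
    FROM imports AS i
        INNER JOIN albums AS a ON i.id = a.import_id
        INNER JOIN songs AS s ON a.id = s.album_id
    GROUP BY i.id, i.created_at, a.id
    HAVING COUNT(s.id) = 1
    ORDER BY i.id
"""

per_import = [r[0] for r in conn.execute(per_import_sql)]
per_album = [r[0] for r in conn.execute(per_album_sql)]

print(per_import)  # [3]     import 1 has two songs overall, so it drops out
print(per_album)   # [1, 3]
```

Which of the two results is the intended one depends on the requirement, so treat this as a check to run against your own data.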

Solution 3

Untested:

select
    i.id, i.created_at
from
    imports i
where
    exists (select *
       from
           albums a
           join
           songs s on a.id = s.album_id
       where
           a.import_id = i.id
       group by
           a.id
       having
           count(*) = 1)

OR

select
    i.id, i.created_at
from
    imports i
where
    exists (select *
       from
           albums a
           join
           songs s on a.id = s.album_id
       group by
           a.import_id, a.id
       having
           count(*) = 1 AND a.import_id = i.id)
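A quick way to try the first EXISTS form is an in-memory SQLite database. This is only a sketch with hypothetical sample rows, and it uses SELECT 1 instead of SELECT * inside the subquery (an assumption on my part to keep the grouped select list unambiguous). Since EXISTS only asks whether any row qualifies, each import shows up at most once.

```python
import sqlite3

# Hypothetical sample data; table and column names follow the queries above.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE imports (id INTEGER PRIMARY KEY, created_at TEXT);
    CREATE TABLE albums  (id INTEGER PRIMARY KEY, import_id INTEGER);
    CREATE TABLE songs   (id INTEGER PRIMARY KEY, album_id INTEGER);
    INSERT INTO imports VALUES (1, '2009-01-01'), (2, '2009-01-02'), (3, '2009-01-03');
    -- import 1: two single-song albums; import 2: one two-song album;
    -- import 3: one single-song album
    INSERT INTO albums VALUES (10, 1), (11, 1), (20, 2), (30, 3);
    INSERT INTO songs VALUES (100, 10), (101, 11), (200, 20), (201, 20), (300, 30);
""")

exists_sql = """
    SELECT i.id
    FROM   imports i
    WHERE  EXISTS (SELECT 1
                   FROM   albums a
                          JOIN songs s ON a.id = s.album_id
                   WHERE  a.import_id = i.id   -- correlation to the outer row
                   GROUP BY a.id
                   HAVING COUNT(*) = 1)
    ORDER BY i.id
"""

exists_ids = [r[0] for r in conn.execute(exists_sql)]
print(exists_ids)  # [1, 3]  import 1 is listed once despite two matching albums
```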

Solution 4

All three suggested techniques should be faster than your WHERE IN:

  1. Exists with a correlated subquery (gbn)
  2. Subquery that is inner joined (achinda99)
  3. Inner Joining all three tables (luke)

(All should work, too ..., so +1 for all of them. Please let us know if one of them does not work!)

Which one turns out to be the fastest depends on your data and the execution plan. Either way, it is an interesting example of different ways of expressing the same thing in SQL.
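To see what the optimizer does with each form, ask the database for its plan. The sketch below uses SQLite's EXPLAIN QUERY PLAN through Python's sqlite3 module, purely as an illustration: the sample rows are hypothetical, the plan text differs between SQLite versions, and other engines have their own commands and will choose different plans.

```python
import sqlite3

# Hypothetical sample data; table and column names follow the answers above.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE imports (id INTEGER PRIMARY KEY, created_at TEXT);
    CREATE TABLE albums  (id INTEGER PRIMARY KEY, import_id INTEGER);
    CREATE TABLE songs   (id INTEGER PRIMARY KEY, album_id INTEGER);
    INSERT INTO imports VALUES (1, '2009-01-01'), (2, '2009-01-02'), (3, '2009-01-03');
    INSERT INTO albums VALUES (10, 1), (11, 1), (20, 2), (30, 3);
    INSERT INTO songs VALUES (100, 10), (101, 11), (200, 20), (201, 20), (300, 30);
""")

queries = {
    "join": """
        SELECT DISTINCT i.id
        FROM imports i
             INNER JOIN (SELECT a.import_id
                         FROM albums a INNER JOIN songs s ON a.id = s.album_id
                         GROUP BY a.id, a.import_id
                         HAVING COUNT(*) = 1) AS temp
               ON i.id = temp.import_id
        ORDER BY i.id
    """,
    "exists": """
        SELECT i.id
        FROM imports i
        WHERE EXISTS (SELECT 1
                      FROM albums a JOIN songs s ON a.id = s.album_id
                      WHERE a.import_id = i.id
                      GROUP BY a.id
                      HAVING COUNT(*) = 1)
        ORDER BY i.id
    """,
}

results = {}
plans = {}
for name, sql in queries.items():
    results[name] = [r[0] for r in conn.execute(sql)]
    # The last column of each plan row holds the human-readable step.
    plans[name] = [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]
    print(name, results[name])
    for step in plans[name]:
        print("   ", step)
```

Comparing plans on a realistic copy of your data, with your real indexes, says more than any general rule about which form is faster.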

Solution 5

I tried to make the entire query a single (no nesting) join but ran into problems with the group/having clauses.

You can join the subquery as a CTE (Common Table Expression) if you are using SQL Server 2005/2008.

As far as I know, a CTE is simply an expression that works like a virtual view that exists only for a single SELECT statement, so you will be able to do the following. In my experience, using a CTE often improves query performance as well.

with AlbumSongs as (
    select  a.import_id 
    from    albums a inner join songs s on a.id = s.album_id
    group by a.id, a.import_id
    having 1 = count(s.id)
)
select  i.id, i.created_at 
from    imports i 
        inner join AlbumSongs A on A.import_id = i.id
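The "single statement" scope of a CTE can be checked with a small sketch. This one runs on SQLite through Python's sqlite3 module rather than SQL Server, uses hypothetical sample rows, and joins on imports.id, assuming imports is keyed by id as in the other solutions.

```python
import sqlite3

# Hypothetical sample data; table and column names follow the query above.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE imports (id INTEGER PRIMARY KEY, created_at TEXT);
    CREATE TABLE albums  (id INTEGER PRIMARY KEY, import_id INTEGER);
    CREATE TABLE songs   (id INTEGER PRIMARY KEY, album_id INTEGER);
    INSERT INTO imports VALUES (1, '2009-01-01'), (2, '2009-01-02'), (3, '2009-01-03');
    INSERT INTO albums VALUES (10, 1), (11, 1), (20, 2), (30, 3);
    INSERT INTO songs VALUES (100, 10), (101, 11), (200, 20), (201, 20), (300, 30);
""")

cte_sql = """
    WITH AlbumSongs AS (
        SELECT a.import_id
        FROM albums a INNER JOIN songs s ON a.id = s.album_id
        GROUP BY a.id, a.import_id
        HAVING 1 = COUNT(s.id)
    )
    SELECT i.id
    FROM imports i
         INNER JOIN AlbumSongs als ON als.import_id = i.id
    ORDER BY i.id
"""
cte_rows = [r[0] for r in conn.execute(cte_sql)]
print(cte_rows)  # [1, 1, 3]  one row per single-song album, like the derived-table join

# The CTE name is gone once its statement has finished.
try:
    conn.execute("SELECT * FROM AlbumSongs")
    cte_visible_later = True
except sqlite3.OperationalError:
    cte_visible_later = False
print(cte_visible_later)  # False
```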
Author: balasankar

Updated on February 25, 2020

Comments

  • Teflon Ted
    Teflon Ted about 15 years
    This was close enough. I had to add "group by i.id, i.created_at" in order to achieve the "no dupes" requirement (see original post). Thanks.