Use two aggregate functions in the same query
Solution 1
Firstly you might want to read this article about reasons to use ANSI 92 Joins instead of the aged ANSI 89 as used above. Secondly, SQLLite does support the GROUP_CONCAT function so you can use this.
You just neeed to add your second query as subquery into the first to get the desired result:
SELECT Talks.TalkID,
Talks.Title,
ts.Speakers,
COUNT(*) AS SimilarTalks
FROM Talks
INNER JOIN SimilarTalks
ON Talks.TalkID = SimilarTalks.SimilarTo
INNER JOIN Talks t2
ON SimilarTalks.TalkID = t2.TalkID
AND t2.Starred = 1
INNER JOIN
( SELECT TalkID, GROUP_CONCAT(Speaker, ',') AS Speakers
FROM TalkSpeaker
GROUP BY TalkID
) ts
ON ts.TalkID = Talks.TalkID
WHERE Talks.Starred = 0
GROUP BY Talks.TalkID, Talks.Title, ts.Speakers
ORDER BY COUNT(*) DESC
LIMIT 2;
EDIT
You could also do this without a subquery using DISTINCT
:
SELECT Talks.TalkID,
Talks.Title,
GROUP_CONCAT(DISTINCT ts.Speaker) AS Speakers,
COUNT(DISTINCT t2.TalkID) AS SimilarTalks
FROM Talks
INNER JOIN SimilarTalks
ON Talks.TalkID = SimilarTalks.SimilarTo
INNER JOIN Talks t2
ON SimilarTalks.TalkID = t2.TalkID
AND t2.Starred = 1
INNER JOIN TalkSpeaker ts
ON ts.TalkID = Talks.TalkID
WHERE Talks.Starred = 0
GROUP BY Talks.TalkID, Talks.Title
ORDER BY COUNT(DISTINCT t2.TalkID) DESC
LIMIT 2;
However I see no benefit at all in this method, and it is likely to be less efficient (I have not tested so can't be certain)
Solution 2
First, to get just the IDs of the desired talks, remove the other fields from your first query:
SELECT unstarred.talkID
FROM talks AS starred
JOIN similarTalks AS s ON starred.talkID = s.talkID
JOIN talks AS unstarred ON s.similarTo = unstarred.talkID
WHERE starred.starred
AND NOT unstarred.starred
GROUP BY unstarred.talkID
ORDER BY COUNT(*) DESC
LIMIT 2
Then, use this as a subquery to get the information about the desired talks:
SELECT t.title AS Title,
group_concat(s.speaker, ', ') AS Speakers
FROM talks AS t JOIN talkspeaker AS s ON t.talkID = s.talkID
WHERE t.talkID IN (SELECT unstarred.talkID
FROM talks AS starred
JOIN similarTalks AS s ON starred.talkID = s.talkID
JOIN talks AS unstarred ON s.similarTo = unstarred.talkID
WHERE starred.starred
AND NOT unstarred.starred
GROUP BY unstarred.talkID
ORDER BY COUNT(*) DESC
LIMIT 2)
GROUP BY t.talkID
Samik R
Updated on June 11, 2022Comments
-
Samik R almost 2 years
Consider the following tables:
[Table: talks] talkID | title | starred -------+--------------+-------- 1 | talk1-title | 1 2 | talk2-title | 1 3 | talk3-title | 0 4 | talk4-title | 0 5 | talk5-title | 0 [Table: talkspeaker] talkID | speaker -------+--------- 1 | Speaker1 1 | Speaker2 2 | Speaker3 3 | Speaker4 3 | Speaker5 4 | Speaker6 5 | Speaker7 5 | Speaker8 [Table: similartalks] talkID | similarTo -------+---------- 1 | 3 1 | 4 2 | 3 2 | 4 2 | 5 3 | 2 4 | 5 5 | 3 5 | 4
What I want to do is: Given the set of starred talks, I would like to select the top 2 of the unstarred talks (starred = 0) and their titles and speakers that are most similar to the set of starred talks. The problem is that getting the speakers requires using an aggregate function, and so does getting the most similar talks.
Without the speakers in the fray, I have been able to get the most similar talks using the following query:
select t2.talkID, t2.title, count(*) as count from similarTalks s, talks t1, talks t2 where s.talkID = t1.talkID and t1.Starred = 1 and s.similarTo = t2.TalkID and t2.Starred = 0 group by t2.title, t2.talkID order by count desc limit 2
Generally, I use the following aggregate function for getting the speakers, with appropriate group by columns (assume t = talkspeaker):
group_concat(t.speaker, ', ') as Speakers
as in
select t1.title, group_concat(t2.speaker, ', ') as Speakers from talks t1, talkspeaker t2 where t1.talkID = t2.talkID group by t1.title
But I am not able to combine the two things together. It might matter that I am planning to run this query in a sqlite database (that is where the group_concat function comes from). The answer to the top 2 unstarred talks most similar to starred talks seem to be with talkIDs 3 and 4.
-
Samik R over 11 yearsThanks for pointing me out to the SQL92 syntax. I use SQL on and off, so wasn't really aware of this new syntax. Also thanks for introducing me to SQL fiddle - seems like a useful service. In my tests, the second query seem to consistently run faster (presumably because of no subquery, but not sure), but is returning different (and incorrect) result than the first one.
-
Samik R over 11 yearsThanks for the response. This seem to take around the same time as the first query in @GarethD's response. I like having the count (of similar talks) information outside though.
-
Samik R over 11 yearsSince I did not hear back, I ended up using the first query here. Thanks again - and if you do have some time, I will appreciate if you can comment on the second query.
-
GarethD over 11 yearsHaha, I spent about 20 minutes looking at this yesterday, and just spent another 20, breaking down the second query because I could not for the life of me work out why it wasn't working properly. Both queries were missing
DESC
after theORDER BY
, but the SQL Fiddle wasn't. I could not see the wood for the trees... I've updated the answer. -
Samik R over 11 yearsI so much appreciate taking time to investigate this. I will take a look.