How to use actual row count (COUNT(*)) in WHERE clause without writing the same query as subquery?
Solution 1
In MySQL, you can only do what you tried:
SELECT id, fruit, pip
FROM plant
WHERE (
SELECT COUNT(*)
FROM plant
) = 2;
or this variation:
SELECT id, fruit, pip
FROM plant
JOIN
(
SELECT COUNT(*) AS cnt
FROM plant
) AS c
ON c.cnt = 2;
Whether the 1st or the 2nd is more efficient, depends on the version of MySQL (and the optimizer). I would bet on the 2nd one, on most versions.
In other DBMSs, that have window functions, you can also do the first query that @Andomar suggests.
Here is a suggestion to avoid the bottleneck of calculating the derived table twice, once to get the rows and once more to get the count. If the derived table is expensive to be calculated, and its rows are thousands or millions, calculating them twice only to throw them away, is a problem, indeed. This may improve efficiency as it will limit the intermediately (twice) calculated rows to 3:
SELECT p.*
FROM
( SELECT id, fruit, pip
FROM plant
LIMIT 3
) AS p
JOIN
( SELECT COUNT(*) AS cnt
FROM
( SELECT 1
FROM plant
LIMIT 3
) AS tmp
) AS c
ON c.cnt = 2 ;
Solution 2
After re-reading your question, you're trying to return rows only if there are 2 rows in the entire table. In that case I think your own example query is already the best.
On another DBMS, you could use a Windowing function:
select *
from (
select *
, count(*) over () as cnt
from plant
) as SubQueryAlias
where cnt = 2
But the over
clause is not supported on MySQL.
old wrong anser below
The where
clause works before grouping. It works on single rows, not groups of rows, so you can't use aggregates like count
or max
in the where
clause.
To set filters that work on groups of rows, use the having
clause. It works after grouping and can be used to filter with aggregates:
SELECT id, fruit, pip
FROM plant
GROUP BY
id, fruit, pip
HAVING COUNT(*) = 2;
Solution 3
The other answers do not fulfill the original question which was to filter the results "without using a subquery".
You can actually do this by using a variable in 2 consecutive MySQL statements:
SET @count=0;
SELECT * FROM
(
SELECT id, fruit, pip, @count:=@count+1 AS count
FROM plant
WHERE
) tmp
WHERE @count = 2;
nawfal
You should accept the answer if it helped you by selecting the green tick mark on the top left of the answer. That's the encouraged practice at SO. Not only it helps future visitors, it encourages users to answer your questions. You should post what you have tried and tell us where you're stuck. That gives us more detail, and we can quickly give an answer by copy-pasting the code with minor modifications. Questions like I-need-this,-give-me-code doesnt work in SO. In the current format the question will be closed. You may read http://stackoverflow.com/faq additionally before posting. Do a good search on SO to ensure you're not asking a duplicated question, or else your question will be closed. Quick tag search
Updated on July 29, 2022Comments
-
nawfal almost 2 years
I have something like this:
SELECT id, fruit, pip FROM plant WHERE COUNT(*) = 2;
This weird query is self explanatory I guess.
COUNT(*)
here means the number of rows inplant
table. My requirement is that I need to retrieve values from specified fields only if total number of rows in table = 2. This doesn't work but:invalid use of aggregate function COUNT
.I cannot do this:
SELECT COUNT(*) as cnt, id, fruit, pip FROM plant WHERE cnt = 2;
for one, it limits the number of rows outputted to 1, and two, it gives the same error:
invalid use of aggregate function
.What I can do is instead:
SELECT id, fruit, pip FROM plant WHERE ( SELECT COUNT(*) FROM plant ) = 2;
But then that subquery is the main query re-run. I'm presenting here a small example of the larger part of the problem, though I know an additional
COUNT(*)
subquery in the given example isn't that big an overhead.Edit: I do not know why the question is downvoted. The
COUNT(*)
I'm trying to get is from a view (a temporary table) in the query which is a large query with 5 to 6 joins and additional where clauses. To re-run the query as a subquery to get the count is inefficient, and I can see the bottleneck as well.Here is the actual query:
SELECT U.UserName, E.Title, AE.Mode, AE.AttemptNo, IF(AE.Completed = 1, 'Completed', 'Incomplete'), ( SELECT COUNT(DISTINCT(FK_QId)) FROM attempt_question AS AQ WHERE FK_ExcAttemptId = @excAttemptId ) AS Inst_Count, ( SELECT COUNT(DISTINCT(AQ.FK_QId)) FROM attempt_question AS AQ JOIN `question` AS Q ON Q.PK_Id = AQ.FK_QId LEFT JOIN actions AS A ON A.FK_QId = AQ.FK_QId WHERE AQ.FK_ExcAttemptId = @excAttemptId AND ( Q.Type = @descQtn OR Q.Type = @actQtn AND A.type = 'CTVI.NotImplemented' AND A.IsDelete = @status AND ( SELECT COUNT(*) FROM actions WHERE FK_QId = A.FK_QId AND type != 'CTVI.NotImplemented' AND IsDelete = @status ) = 0 ) ) AS NotEvalInst_Count, ( SELECT COUNT(DISTINCT(FK_QId)) FROM attempt_question AS AQ WHERE FK_ExcAttemptId = @excAttemptId AND Mark = @mark ) AS CorrectAns_Count, E.AllottedTime, AE.TimeTaken FROM attempt_exercise AS AE JOIN ctvi_exercise_tblexercise AS E ON AE.FK_EId = E.PK_EId JOIN ctvi_user_table AS U ON AE.FK_UId = U.PK_Id JOIN ctvi_grade AS G ON AE.FK_GId = G.PK_GId WHERE AE.PK_Id = @excAttemptId -- AND COUNT(AE.*) = @number --the portion in contention.
Kindly ignore the above query and guide me to right direction from the small example query I posted, thanks.
-
nawfal over 11 yearsAndo, yes and hence I canceled my vote. let me see again to your edit
-
nawfal over 11 yearsAs I mentioned, making it a subquery is so inelegant and could be inefficient with 5 to 6 joins and where clauses already running on the main query.
-
Andomar over 11 yearsIn MYSQL, if you use
count
without agroup by
, MySQL groups on the empty set. It will return only one row. So AFAIK, you can't get at the total rowcount and more than one row in the same query. -
Bohemian over 11 years+1 for showing compassion in wading through the copious and largely irrelevant content in the question
-
ypercubeᵀᴹ over 11 yearsThe second query with
HAVING COUNT(*) = 2
is not going to return expected results. Ifid
is the primary key, allCOUNT(*)
values will be1
, whatever the number of rows in the table. -
nawfal over 11 years@Bohemian I was forced to add irrelevant content when I got a negative vote to make things clearer for a few. My initial question was compact and concise I believe :)
-
nawfal over 11 yearswhat's the role of limit 3 here?
-
ypercubeᵀᴹ over 11 yearsTo avoid retrieving twice the thousand (or millions) rows of the derived table. It doesn't matter to you if they are 3 or 3000 or 1000000, right? You don't want them if they are 3 or more.
-
nawfal over 11 yearsHaha, that's seems to be so innovative, I get it :)
-
ypercubeᵀᴹ over 11 years@nawfal: Can you share details about the timings of the different queries?
-
nawfal over 11 yearsI did some testing with few thousand records. They all performed more or less the same! If you want pinpoint accuracy after a number of testing, here is it: The last one you posted (with limits) performed marginally well with each query taking around 300ms. The first and second code blocks were ditto identical, but for a hair-breadth win sql1 beat sql2 by a score of 305 ms to 308 ms!!
-
nawfal over 8 yearsThis I believe will be inefficient since it will scan through the entire rows and at last check the @count against 2. Am I right?
-
Stack Man over 8 yearsI'm not sure about the efficiency costs but I think the best approach is to only be concerned with efficiency when you actually encounter bottlenecks. Until then, in my opinion, readability and maintainability has higher priority and having similar subqueries creates unnecessary complexity.