Count distinct values in MySQL without folding NULL values
Solution 1
Use NOT EXISTS
in a WHERE
clause:
SELECT count(*)
FROM a
WHERE NOT EXISTS (SELECT * FROM a a2 WHERE a2.f = a.f AND a2.id < a.id);
This way you can also get actual rows - if you need more than the bare count:
SELECT *
FROM a
WHERE NOT EXISTS (SELECT * FROM a a2 WHERE a2.f = a.f AND a2.id < a.id)
The =
operator makes sure that all rows with f IS NULL
are included. You had that in your query already.
Neither of these would work:
SELECT DISTINCT f FROM a;
SELECT * FROM a GROUP BY f;
.. because both would also fold NULL
values, and you want
every record WHERE f IS NULL.
Solution 2
There's a pretty simple approach that might work for you:
select count(distinct ifnull(f, id))
from a
Note that this query assumes that f values are never id values, and based on sample data and experience this is reasonable.
Edited:
I thought about it and there's an even simpler approach:
select count(distinct f) + sum(f is null) from a;
which you can see running on sqlfiddle
This works because distinct
throws away nulls, and sum(condition)
counts the number of times condition is true because in mysql true is 1
and false is 0
.
package
Updated on June 09, 2022Comments
-
package almost 2 years
I have table
a
with two fields:id
(PK) andf
.Consider following records:
id | f 1 | NULL 2 | 'foo' 3 | 'bar' 4 | NULL 5 | 'foo' 6 | 'baz'
I want to retrieve and count all the records having distinct
f
values including every recordWHERE f IS NULL
. Given this criteria, the query should return every record except #5, because the same value is already included in the set, and the total count would be 5.The query I'm using to retrieve all records looks like this:
SELECT CASE WHEN EXISTS (SELECT id FROM a a2 WHERE a2.f = a.f AND a.id < a2.id) THEN 1 END AS not_distinct FROM a HAVING not_distinct IS NULL
If this query could be improved, I'd welcome any feedback. Anyway, the main problem is counting. Obviously adding a
COUNT(*)
will not help here and I'm totally lost how to count the records after the filtering.