Conditional SQL count
Solution 1
In Postgres 9.4 or later, use the aggregate FILTER
option. Typically cleanest and fastest:
SELECT category
, count(*) FILTER (WHERE question1 = 0) AS zero
, count(*) FILTER (WHERE question1 = 1) AS one
, count(*) FILTER (WHERE question1 = 2) AS two
FROM reviews
GROUP BY 1;
Details for the FILTER
clause:
If you want it short:
SELECT category
, count(question1 = 0 OR NULL) AS zero
, count(question1 = 1 OR NULL) AS one
, count(question1 = 2 OR NULL) AS two
FROM reviews
GROUP BY 1;
More syntax variants:
Proper crosstab query
crosstab()
yields the best performance and is shorter for long lists of options:
SELECT * FROM crosstab(
'SELECT category, question1, count(*) AS ct
FROM reviews
GROUP BY 1, 2
ORDER BY 1, 2'
, 'VALUES (0), (1), (2)'
) AS ct (category text, zero int, one int, two int);
Detailed explanation:
Solution 2
The "best" way (for me) is to write a query like:
SELECT
category,
question1,
count(*)
FROM reviews
GROUP BY category, question1
Then I use this data to draw a table in application logic.
Other option is to use one JSON column for all grouping results. This will result in something like:
category1 | {"zero": 1, "one": 3, "two": 5}
category2 | {"one": 7, "two": 4}
and so on.
The query for this option you can build from the previous one with json_build_object
and json_agg
. The best thing for this option - you do not need to know number of possible question1
values ahead of time.
Related videos on Youtube
Erwin Brandstetter
Database expert. Warmongers make peace before asking questions.
Updated on October 24, 2021Comments
-
Erwin Brandstetter over 2 years
What is the best way to create columns which count the number of occurrences of data in a table? The table needs to be grouped by one column?
My database is PostgreSQL.I have seen:
SELECT sum(CASE WHEN question1 = 0 THEN 1 ELSE 0 END) AS ZERO, sum(CASE WHEN question1 = 1 THEN 1 ELSE 0 END) AS ONE, sum(CASE WHEN question1 = 2 THEN 1 ELSE 0 END) AS TWO, category FROM reviews GROUP BY category
where
question1
can have a value of either 0, 1 or 2.I have also seen a version of that using
count(CASE WHEN question1 = 0 THEN 1)
However, this becomes more cumbersome to write as the number of possible values for
question1
increases. Is there a convenient way to write this query, possibly optimizing performance?-
Gordon Linoff about 9 yearsYour version is quite reasonable, although you should include the
end
if you want it to be syntactically correct. An alternative is Postgres'scrosstab
functionality. I don't know which is better performance-wise. -
Admin about 9 yearssorry, forgot to type the end
-
Giorgi Nakeuri about 9 yearswhy not to group by category and question1 and count rows in groups?
-
Ram about 9 years@GiorgiNakeuri The OP wants columns names for each value of question1 then you need to transpose the result for the required result.
-
Giorgi Nakeuri about 9 years@Ram, aha, ok I see know. In MSSQL it would be group by category and question and then pivoting grouping results.
-
Ram about 9 years@GiorgiNakeuri True. Pivot is what I thought of first but this is postgresql.
-