Conditional SQL count

40,210

Solution 1

In Postgres 9.4 or later, use the aggregate FILTER option. Typically cleanest and fastest:

SELECT category
     , count(*) FILTER (WHERE question1 = 0) AS zero
     , count(*) FILTER (WHERE question1 = 1) AS one
     , count(*) FILTER (WHERE question1 = 2) AS two
FROM   reviews
GROUP  BY 1;

Details for the FILTER clause:

If you want it short:

SELECT category
     , count(question1 = 0 OR NULL) AS zero
     , count(question1 = 1 OR NULL) AS one
     , count(question1 = 2 OR NULL) AS two
FROM   reviews
GROUP  BY 1;

More syntax variants:

Proper crosstab query

crosstab() yields the best performance and is shorter for long lists of options:

SELECT * FROM crosstab(
     'SELECT category, question1, count(*) AS ct
      FROM   reviews
      GROUP  BY 1, 2
      ORDER  BY 1, 2'
   , 'VALUES (0), (1), (2)'
   ) AS ct (category text, zero int, one int, two int);

Detailed explanation:

Solution 2

The "best" way (for me) is to write a query like:

SELECT
    category,
    question1,
    count(*)
FROM reviews
GROUP BY category, question1

Then I use this data to draw a table in application logic.

Other option is to use one JSON column for all grouping results. This will result in something like:

category1 | {"zero": 1, "one": 3, "two": 5}
category2 | {"one": 7, "two": 4}

and so on.

The query for this option you can build from the previous one with json_build_object and json_agg. The best thing for this option - you do not need to know number of possible question1 values ahead of time.

Share:
40,210

Related videos on Youtube

Erwin Brandstetter
Author by

Erwin Brandstetter

Database expert. Warmongers make peace before asking questions.

Updated on October 24, 2021

Comments

  • Erwin Brandstetter
    Erwin Brandstetter over 2 years

    What is the best way to create columns which count the number of occurrences of data in a table? The table needs to be grouped by one column?
    My database is PostgreSQL.

    I have seen:

    SELECT
        sum(CASE WHEN question1 = 0 THEN 1 ELSE 0 END) AS ZERO,
        sum(CASE WHEN question1 = 1 THEN 1 ELSE 0 END) AS ONE,
        sum(CASE WHEN question1 = 2 THEN 1 ELSE 0 END) AS TWO,
        category
    FROM reviews
        GROUP BY category
    

    where question1 can have a value of either 0, 1 or 2.

    I have also seen a version of that using count(CASE WHEN question1 = 0 THEN 1)

    However, this becomes more cumbersome to write as the number of possible values for question1 increases. Is there a convenient way to write this query, possibly optimizing performance?

    • Gordon Linoff
      Gordon Linoff about 9 years
      Your version is quite reasonable, although you should include the end if you want it to be syntactically correct. An alternative is Postgres's crosstab functionality. I don't know which is better performance-wise.
    • Admin
      Admin about 9 years
      sorry, forgot to type the end
    • Giorgi Nakeuri
      Giorgi Nakeuri about 9 years
      why not to group by category and question1 and count rows in groups?
    • Ram
      Ram about 9 years
      @GiorgiNakeuri The OP wants columns names for each value of question1 then you need to transpose the result for the required result.
    • Giorgi Nakeuri
      Giorgi Nakeuri about 9 years
      @Ram, aha, ok I see know. In MSSQL it would be group by category and question and then pivoting grouping results.
    • Ram
      Ram about 9 years
      @GiorgiNakeuri True. Pivot is what I thought of first but this is postgresql.