PostgreSQL statistical mode value


Solution 1

Since PostgreSQL 9.4 there is a built-in aggregate function, mode(). It is used like this:

SELECT mode() WITHIN GROUP (ORDER BY some_value) AS modal_value FROM tbl;

Read more about ordered-set aggregate functions in the PostgreSQL documentation:

36.10.3. Ordered-Set Aggregates

Built-in Ordered-Set Aggregate Functions

See other answers for dealing with older versions of Postgres.

Solution 2

You can try something like:

SELECT int_value, count(*)
FROM t
GROUP BY int_value
ORDER BY count(*) DESC
LIMIT 1;

The idea: count the rows for every int_value, sort the result so that the largest count comes first, then LIMIT the query to the first row, which holds the int_value with the highest count.
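
A minimal sketch of the count-then-sort idea above, run from Python against an in-memory SQLite database so it can be tried without a PostgreSQL server. The SQLite stand-in, the sample values, and the extra int_value tie-breaker in ORDER BY are assumptions added for illustration, not part of the answer. The second query shows one possible way to return every value tied for the highest count, since LIMIT 1 keeps only one of them.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for PostgreSQL (assumption);
# the table t and column int_value follow the naming used in the answer.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (int_value INTEGER)")
conn.executemany(
    "INSERT INTO t (int_value) VALUES (?)",
    [(v,) for v in [1, 2, 2, 3, 3, 5]],  # 2 and 3 both occur twice
)

# The answer's query, plus a tie-breaker on int_value so the single
# returned row is deterministic when several values share the top count.
top_one = conn.execute(
    """
    SELECT int_value, count(*)
    FROM t
    GROUP BY int_value
    ORDER BY count(*) DESC, int_value
    LIMIT 1
    """
).fetchone()

# Variant: keep every value whose count equals the highest count.
all_modes = conn.execute(
    """
    SELECT int_value, count(*) AS n
    FROM t
    GROUP BY int_value
    HAVING count(*) = (
        SELECT max(c)
        FROM (SELECT count(*) AS c FROM t GROUP BY int_value) AS counts
    )
    ORDER BY int_value
    """
).fetchall()

print(top_one)    # one (value, count) pair
print(all_modes)  # every (value, count) pair tied for the top count
```

Both statements use only GROUP BY, HAVING, and a derived table, so they should carry over to PostgreSQL, though that was not verified here.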

Solution 3

If you want the mode by groups (intervals of values) rather than by single values:

select
    int_value * 10 / (select max(int_value) from t) g,
    min(int_value) "from",
    max(int_value) "to",
    count(*) total
from t
group by 1
order by 4 desc
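
A small sketch of how the interval query above behaves, again using an in-memory SQLite database from Python as a stand-in for PostgreSQL (an assumption, as are the sample values). It relies on integer division, which truncates in both systems for integer columns. Column aliases replace the positional "group by 1" and "order by 4", and g is added as a tie-breaker; otherwise the query is the same.

```python
import sqlite3

# In-memory SQLite stand-in for PostgreSQL (assumption); t and int_value
# follow the answer, the sample values are made up for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (int_value INTEGER)")
sample = [1, 2, 3, 11, 12, 13, 14, 25, 50, 100]
conn.executemany("INSERT INTO t (int_value) VALUES (?)", [(v,) for v in sample])

# g is the interval index: int_value scaled to 0..10 by integer division.
bins = conn.execute(
    """
    SELECT
        int_value * 10 / (SELECT max(int_value) FROM t) AS g,
        min(int_value) AS "from",
        max(int_value) AS "to",
        count(*) AS total
    FROM t
    GROUP BY g
    ORDER BY total DESC, g
    """
).fetchall()

# Each row is (g, from, to, total); the first row is the modal interval.
for row in bins:
    print(row)
```

With this data the most populated interval is the one covering 11 to 14. Note that the maximum value always gets interval index 10 by itself, so the scale has eleven possible intervals, not ten.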


Author: Peter Krauss


Updated on September 15, 2022

Comments

  • Peter Krauss
    Peter Krauss over 1 year

    I am using the SQL query

        SELECT round(avg(int_value)) AS modal_value FROM t;
    

    to obtain a modal value. That is, of course, not correct, but it is a first option to show some result.

    So, my question is, "How to do the thing right?".


    With PostgreSQL 8.3+ we can use this user-defined aggregate to define mode:

    CREATE FUNCTION _final_mode(anyarray) RETURNS anyelement AS $f$
        SELECT a FROM unnest($1) a
        GROUP BY 1  ORDER BY COUNT(1) DESC, 1
        LIMIT 1;
    $f$ LANGUAGE 'sql' IMMUTABLE;
    CREATE AGGREGATE mode(anyelement) (
      SFUNC=array_append,  STYPE=anyarray,
      FINALFUNC=_final_mode, INITCOND='{}'
    );
    

    but, like a user-defined average, it can be slow on big tables (compare sum/count with the built-in AVG function). With PostgreSQL 9+, is there no direct (built-in) function to calculate the statistical mode? Perhaps using pg_stats... How can I do something like

        SELECT (most_common_vals(int_value))[1] AS modal_value FROM t;
    

    Can the pg_stats view be used for this kind of task (even once, by hand)?

  • Peter Krauss
    Peter Krauss almost 11 years
    Thanks @ClonaldoNeto, this is a good solution (!) for scalars and measures, to detect "modal intervals".
  • Peter Krauss
    Peter Krauss almost 11 years
    PS for readers: the first row (what "limit 1" would return) is the mode; if you change 10 to 30 or 100 you get more (and narrower) intervals; to list the intervals in order, use "order by 1".
  • Peter Krauss
    Peter Krauss over 9 years
    Hello Bruno. I posted a comment above to IgorRomanchenko with this same wiki link... Check whether our discussion covers your answer (also the aggregate function I presented above, with _final_mode). The question is not "how to reproduce the mode with SQL", but "where is the PostgreSQL built-in function" that does this kind of operation fast.
  • Peter Krauss
    Peter Krauss about 4 years
    Hi @Luffydude, check your PostgreSQL version (!), and the cited link (you can copy/paste the function for old versions).