PostgreSQL statistical mode value
Solution 1
Since PostgreSQL 9.4 there is a built-in aggregate function, mode(). It is used like this:
SELECT mode() WITHIN GROUP (ORDER BY some_value) AS modal_value FROM tbl;
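To try this without an existing table, here is a minimal self-contained sketch. The inline sample values and the aliases s and v are made up for illustration. Note that when several values are equally frequent, mode() picks one of them arbitrarily.

```sql
-- Hypothetical sample data: 3 occurs three times, 2 twice, 1 once.
SELECT mode() WITHIN GROUP (ORDER BY v) AS modal_value
FROM (VALUES (1), (2), (2), (3), (3), (3)) AS s(v);
-- modal_value is 3
```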
Read more about ordered-set aggregate functions here:
36.10.3. Ordered-Set Aggregates
Built-in Ordered-Set Aggregate Functions
See other answers for dealing with older versions of Postgres.
Solution 2
You can try something like:
SELECT int_value, count(*)
FROM t
GROUP BY int_value
ORDER BY count(*) DESC
LIMIT 1;
The idea behind it: you get the count for every int_value, then order the rows so that the biggest count goes first, then LIMIT the query to the first row only, to get the int_value with the highest count.
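One caveat: LIMIT 1 returns a single row even when several values share the highest count. A possible variation, sketched here assuming the same table t and column int_value and a PostgreSQL version with window functions (8.4 or later), returns every tied value. The aliases cnt, rnk and ranked are illustrative only.

```sql
-- Rank the groups by their count; all groups tied for the top count get rank 1.
SELECT int_value, cnt
FROM (
    SELECT int_value,
           count(*) AS cnt,
           rank() OVER (ORDER BY count(*) DESC) AS rnk
    FROM t
    GROUP BY int_value
) AS ranked
WHERE rnk = 1;
```

If only one row is wanted but the result should be repeatable, adding int_value as a second ORDER BY key to the original query is a simpler option.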
Solution 3
If you want to do it by groups:
select
int_value * 10 / (select max(int_value) from t) g,
min(int_value) "from",
max(int_value) "to",
count(*) total
from t
group by 1
order by 4 desc
Updated on September 15, 2022
Comments
-
Peter Krauss over 1 year
I am using the SQL query
SELECT round(avg(int_value)) AS modal_value FROM t;
to obtain a modal value, which, of course, is not correct, but is a first option to show some result.
So, my question is, "How to do the thing right?".
With PostgreSQL 8.3+ we can use this user-defined aggregate to define mode:
CREATE FUNCTION _final_mode(anyarray) RETURNS anyelement AS $f$
  SELECT a FROM unnest($1) a
  GROUP BY 1
  ORDER BY COUNT(1) DESC, 1
  LIMIT 1;
$f$ LANGUAGE 'sql' IMMUTABLE;
CREATE AGGREGATE mode(anyelement) (
  SFUNC=array_append,
  STYPE=anyarray,
  FINALFUNC=_final_mode,
  INITCOND='{}'
);
but, like a user-defined average, it can be slow on big tables (compare sum/count with the built-in AVG function). With PostgreSQL 9+, is there no direct (built-in) function to calculate the statistical mode value? Perhaps using pg_stats... How can I do something like
SELECT (most_common_vals(int_value))[1] AS modal_value FROM t;
Can the pg_stats VIEW be used for this kind of task (even once, by hand)?
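For reference, a rough sketch of what reading pg_stats by hand could look like. It assumes that t lives in the public schema and that int_value is an integer column. The planner statistics come from a sample taken by ANALYZE, so the result is only an estimate, and most_common_vals can be NULL when no value stands out. The order of the list (most frequent first) is an observed behaviour assumed here, not a documented guarantee.

```sql
-- pg_stats reflects the last ANALYZE, so refresh the statistics first.
ANALYZE t;

-- most_common_vals has the pseudo-type anyarray, so it is cast through text
-- before subscripting; the int[] cast assumes an integer column.
SELECT (most_common_vals::text::int[])[1] AS approx_modal_value,
       most_common_freqs[1]               AS approx_frequency
FROM pg_stats
WHERE schemaname = 'public'   -- assumption: t is in the public schema
  AND tablename  = 't'
  AND attname    = 'int_value';
```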
-
Peter Krauss almost 11 years
Thanks @ClonaldoNeto, this is a good solution (!) for scalars and measures, to detect "modal intervals".
-
Peter Krauss almost 11 years
PS for readers: the first line (limit 1) is the mode; if you change 10 to 30 or to 100 you get more (and smaller) intervals; and to list the intervals use "order by 1".
-
Peter Krauss over 9 years
Hello Bruno. I posted a comment above to IgorRomanchenko with this same wiki link... Check whether our discussion covers your answer (also the aggregate function I presented above as _final_mode). The question is not about "how to reproduce the mode with SQL", but "where is PostgreSQL's built-in function" that does this kind of operation fast.
-
Peter Krauss about 4 years
Hi @Luffydude, check your PostgreSQL version (!), and the cited link (you can copy/paste the function for old versions).