Finding mean, standard deviation ,percentiles for all numeric variables in table

13,435

Solution 1

You can simplify the logic using a lateral join:

select which, min(val), max(val), stddev(val), avg(val)
from t, lateral
     (values ('col1', col1), ('col2', col2), . . . 
     ) v(which, val)
group by which;

You still have to list the columns, but you only need to do so once in the values clause.

Solution 2

Dynamic SQL is a little bit trick in Greenplum.

Here is an example based on the instruction from https://www.pivotalguru.com/?p=266

$ psql postgres -c "create table foo (date date, c1 int, c2 int, c3 int);"
$ cat <<EOT >> /tmp/bar.sql
> select 'select ';
> select ' avg('  || attname || '), stddev(' || attname || '),' from pg_attribute
> where attrelid = 'foo'::regclass::oid and attnum > 0 and attname != 'date';
> select ' date from foo group by date;';
> EOT
$ psql -A -t  -f /tmp/foo.sql postgres | psql -a postgres
select 
 avg(c1), stddev(c1),
 avg(c2), stddev(c2),
 avg(c3), stddev(c3),
 date from foo group by date;
 avg | stddev | avg | stddev | avg | stddev | date 
-----+--------+-----+--------+-----+--------+------
Share:
13,435

Related videos on Youtube

user8545255
Author by

user8545255

Updated on June 04, 2022

Comments

  • user8545255
    user8545255 almost 2 years

    I have 30 numeric numeric columns in a table .I want to find mean,std, percentiles for all the columns in table.I don't want to write all the column names manually like below

    select date,
          avg(col1), stddev(col1),
          avg(col2), stddev(col2), 
    from table name group by date;
    

    Is there any way to find mean, std, percentiles for all the columns at once.

    • Tim Biegeleisen
      Tim Biegeleisen almost 6 years
      This answers most of your question: stackoverflow.com/questions/14316562/… ... not sure about STDEV.
    • Mankind_008
      Mankind_008 almost 6 years
      Others you will find here, [aggregate functions]:(postgresql.org/docs/9.1/static/functions-aggrega‌​te.html)
    • Haleemur Ali
      Haleemur Ali almost 6 years
      you have to write the column names and the aggregate functions explicitly. no way around that in sql.
    • Sung Yu-wei
      Sung Yu-wei almost 6 years
      apache madlib supports postgresql. summary module should cover all statistics you want. madlib.apache.org/docs/latest/group__grp__summary.html
    • a_horse_with_no_name
      a_horse_with_no_name almost 6 years
      Greenplum or Postgres? They are very different (even though they share some common roots)
    • Sung Yu-wei
      Sung Yu-wei almost 6 years
      apache madlib supports both postgresql and greenplum