PostgreSQL: How to SUM all attributes in a JSONB field?

10,079

Solution 1

The query should do the job:

select date, json_object_agg(key, val)
from (
    select date, key, sum(value::numeric) val
    from mytable t, jsonb_each_text(star_pu)
    group by date, key
    ) s
group by date;

The resulting json values will be sorted alphabetically by keys (a side effect of json_object_agg ()). I do not know whether this matters.

Solution 2

I've written a Postgres extension that does exactly that. Once you have it installed you could do:

SELECT jsonb_deep_sum(star_pu) FROM mytable;

Benchmarks are in 4s for 2 million rows, @klin's answer takes 11s

Solution 3

There might be a better way, but at least this works:

WITH
  keys AS (SELECT DISTINCT jsonb_object_keys(star_pu) AS key FROM mytable),
  sums AS (SELECT key, sum((star_pu->>key)::float) AS total FROM keys, mytable GROUP BY key)
  SELECT json_object(array_agg(key), array_agg(total::text))::jsonb FROM sums

Basically it explodes the jsonbs into rows, gets the names from them, sums them up, aggregates into arrays and creates a jsonb structure. Unfortunately there isn't a jsonb_object() function so we have to make it into json and then cast to jsonb.

Share:
10,079
Richard
Author by

Richard

Updated on June 12, 2022

Comments

  • Richard
    Richard almost 2 years

    I am working with Postgres 9.4. I have a JSONB field:

         Column      │         Type         │                             Modifiers
    ─────────────────┼──────────────────────┼────────────────────────────────────────────────────────────────────
     id              │ integer              │ not null default
     practice_id     │ character varying(6) │ not null
     date            │ date                 │ not null
     pct_id          │ character varying(3) │
     astro_pu_items  │ double precision     │ not null
     astro_pu_cost   │ double precision     │ not null
     star_pu         │ jsonb                │
    

    I can query the raw values of the JSONB field just fine:

    SELECT star_pu FROM mytable limit 1;
    star_pu │ {"statins_cost": 16790.692924903742, "hypnotics_adq": 18523.58385328709, "laxatives_cost": 8456.98405165182, "analgesics_cost": 48271.21822239242, "oral_nsaids_cost": 9911.336052088493, "antidepressants_adq": 186715.7, "antidepressants_cost": 26885.54622478343, "bronchodilators_cost": 26646.54899847902, "cox-2_inhibitors_cost": 2063.4652015406728, "antiplatelet_drugs_cost": 4844.798321177439, "drugs_for_dementia_cost": 3390.569564110721, "antiepileptic_drugs_cost": 44990.94756286502, "oral_antibacterials_cost": 21047.048353859234, "oral_antibacterials_item": 5096.6501798218205, "ulcer_healing_drugs_cost": 15999.05326260261, "lipid-regulating_drugs_cost": 24711.589440943662, "proton_pump_inhibitors_cost": 14545.398978447573, "inhaled_corticosteroids_cost": 50759.91062192373, "calcium-channel_blockers_cost": 11571.457036131978, "omega-3_fatty_acid_compounds_adq": 2026.0, "benzodiazepine_caps_and_tabs_cost": 1800.2581325567717, "bisphosphonates_and_other_drugs_cost": 2996.912924744617, "drugs_acting_on_benzodiazepine_receptors_cost": 2993.142806352308, "drugs_affecting_the_renin_angiotensin_system_cost": 20255.500615282508, "drugs_used_in_parkinsonism_and_related_disorders_cost": 9812.457888596877}
    

    Now I want SUM the JSONB values across the entire table, but I don't know how to do this. Ideally I would get back a dictionary, where the keys are as above and the values are summed values.

    I can do the following to SUM one JSONB field explicitly:

        SELECT date, SUM(total_list_size) as total_list_size, 
        SUM((star_pu->>'oral_antibacterials_item')::float) AS star_pu_oral_antibac_items
        FROM mytable GROUP BY date ORDER BY date
    

    But how do I calculate the sums for all the attributes in the JSONB field - and preferably return the entire field as a dictionary? Ideally I'd get back something like:

    star_pu │ {"statins_cost": very-large-number, "hypnotics_adq": very-large-number, ...
    

    I guess I can get each field manually by SUMming each key explicitly, but the whole reason I have the JSONB field is that there are lots of keys and they may change.

    It is safe to assume that the JSONB field only contains keys and values, i.e. has depth 1.