PostgreSQL: How to SUM all attributes in a JSONB field?
Solution 1
The query should do the job:
select date, json_object_agg(key, val)
from (
select date, key, sum(value::numeric) val
from mytable t, jsonb_each_text(star_pu)
group by date, key
) s
group by date;
The resulting json values will be sorted alphabetically by keys (a side effect of json_object_agg ()
). I do not know whether this matters.
Solution 2
I've written a Postgres extension that does exactly that. Once you have it installed you could do:
SELECT jsonb_deep_sum(star_pu) FROM mytable;
Benchmarks are in 4s for 2 million rows, @klin's answer takes 11s
Solution 3
There might be a better way, but at least this works:
WITH
keys AS (SELECT DISTINCT jsonb_object_keys(star_pu) AS key FROM mytable),
sums AS (SELECT key, sum((star_pu->>key)::float) AS total FROM keys, mytable GROUP BY key)
SELECT json_object(array_agg(key), array_agg(total::text))::jsonb FROM sums
Basically it explodes the jsonbs into rows, gets the names from them, sums them up, aggregates into arrays and creates a jsonb structure. Unfortunately there isn't a jsonb_object()
function so we have to make it into json and then cast to jsonb.
Richard
Updated on June 12, 2022Comments
-
Richard almost 2 years
I am working with Postgres 9.4. I have a JSONB field:
Column │ Type │ Modifiers ─────────────────┼──────────────────────┼──────────────────────────────────────────────────────────────────── id │ integer │ not null default practice_id │ character varying(6) │ not null date │ date │ not null pct_id │ character varying(3) │ astro_pu_items │ double precision │ not null astro_pu_cost │ double precision │ not null star_pu │ jsonb │
I can query the raw values of the JSONB field just fine:
SELECT star_pu FROM mytable limit 1; star_pu │ {"statins_cost": 16790.692924903742, "hypnotics_adq": 18523.58385328709, "laxatives_cost": 8456.98405165182, "analgesics_cost": 48271.21822239242, "oral_nsaids_cost": 9911.336052088493, "antidepressants_adq": 186715.7, "antidepressants_cost": 26885.54622478343, "bronchodilators_cost": 26646.54899847902, "cox-2_inhibitors_cost": 2063.4652015406728, "antiplatelet_drugs_cost": 4844.798321177439, "drugs_for_dementia_cost": 3390.569564110721, "antiepileptic_drugs_cost": 44990.94756286502, "oral_antibacterials_cost": 21047.048353859234, "oral_antibacterials_item": 5096.6501798218205, "ulcer_healing_drugs_cost": 15999.05326260261, "lipid-regulating_drugs_cost": 24711.589440943662, "proton_pump_inhibitors_cost": 14545.398978447573, "inhaled_corticosteroids_cost": 50759.91062192373, "calcium-channel_blockers_cost": 11571.457036131978, "omega-3_fatty_acid_compounds_adq": 2026.0, "benzodiazepine_caps_and_tabs_cost": 1800.2581325567717, "bisphosphonates_and_other_drugs_cost": 2996.912924744617, "drugs_acting_on_benzodiazepine_receptors_cost": 2993.142806352308, "drugs_affecting_the_renin_angiotensin_system_cost": 20255.500615282508, "drugs_used_in_parkinsonism_and_related_disorders_cost": 9812.457888596877}
Now I want
SUM
the JSONB values across the entire table, but I don't know how to do this. Ideally I would get back a dictionary, where the keys are as above and the values are summed values.I can do the following to
SUM
one JSONB field explicitly:SELECT date, SUM(total_list_size) as total_list_size, SUM((star_pu->>'oral_antibacterials_item')::float) AS star_pu_oral_antibac_items FROM mytable GROUP BY date ORDER BY date
But how do I calculate the sums for all the attributes in the JSONB field - and preferably return the entire field as a dictionary? Ideally I'd get back something like:
star_pu │ {"statins_cost": very-large-number, "hypnotics_adq": very-large-number, ...
I guess I can get each field manually by SUMming each key explicitly, but the whole reason I have the JSONB field is that there are lots of keys and they may change.
It is safe to assume that the JSONB field only contains keys and values, i.e. has depth 1.