Aggregate by aggregate (ARRAY_AGG)?
Solution 1
I'm not sure if you can group by an array. If you can, here is one approach:
select col1, array_agg(col2), ar
from (SELECT a1.column_1 as col1, a2.column_1 as col2,
             ARRAY_AGG(DISTINCT a1.column_2 ORDER BY a1.column_2) as ar
      FROM agg_test a1 JOIN
           agg_test a2
           ON a1.column_2 = a2.column_2 AND a1.column_1 <> a2.column_1
      WHERE a1.column_1 = 1
      GROUP BY a1.column_1, a2.column_1
     ) t
group by col1, ar
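The alternative is to convert the array values into a string and group by that string, for example with array_to_string (array_dims only returns the array's dimensions, not its values).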
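To make the two-stage idea easier to follow, here is a minimal sketch of the same logic in plain Python rather than SQL. It is only an illustration: the function name group_by_aggregate and the variable names are hypothetical, and the rows are the (column_1, column_2) pairs from the question's sample data.

```python
from collections import defaultdict

# Sample rows (column_1, column_2) copied from the question's agg_test table.
rows = [(1, 1), (1, 2), (1, 3), (1, 4), (2, 1),
        (3, 2), (4, 3), (4, 4), (5, 3), (5, 4)]


def group_by_aggregate(rows, target):
    """Hypothetical helper mirroring the nested query for one column_1 value."""
    # Stage 1 (inner query): self-join on column_2, then collect the distinct
    # shared column_2 values for each other column_1.
    shared = defaultdict(set)
    for c1_a, c2_a in rows:
        if c1_a != target:
            continue
        for c1_b, c2_b in rows:
            if c2_a == c2_b and c1_a != c1_b:
                shared[c1_b].add(c2_a)

    # Stage 2 (outer query): group again, keyed by the sorted aggregate itself.
    # A tuple stands in for the array used as the grouping key.
    grouped = defaultdict(list)
    for other, values in shared.items():
        grouped[tuple(sorted(values))].append(other)

    return sorted((target, sorted(others), list(key))
                  for key, others in grouped.items())


result = group_by_aggregate(rows, 1)
for row in result:
    print(row)
```

For the sample data this yields three rows, with 4 and 5 collected together because they share the same set of column_2 values.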
Solution 2
You could also try something like this:
SELECT DISTINCT
  a1.column_1,
  ARRAY_AGG(a2.column_1) OVER (
    PARTITION BY
      a1.column_1,
      ARRAY_AGG(DISTINCT a1.column_2 ORDER BY a1.column_2)
  ) AS "a2.column_1 agg",
  ARRAY_AGG(DISTINCT a1.column_2 ORDER BY a1.column_2)
FROM agg_test a1
JOIN agg_test a2 ON a1.column_2 = a2.column_2 AND a1.column_1 <> a2.column_1
WHERE a1.column_1 = 1
GROUP BY a1.column_1, a2.column_1
;
(Compared with the query you've posted in your question, the differences are the added DISTINCT and the window ARRAY_AGG(a2.column_1) OVER (...) column that replaces a2.column_1.)
The above uses a window ARRAY_AGG to combine the values of a2.column_1 alongside the other ARRAY_AGG, using the latter's result as one of the partitioning criteria. Without the DISTINCT, it would produce two {4,5} rows for your example, so DISTINCT is needed to eliminate the duplicates.
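As a rough illustration of why the duplicates appear, the following Python sketch emulates the window step and the DISTINCT step on the four rows that the GROUP BY produces for the question's data. The function name window_array_agg is hypothetical, and this is only a model of the behaviour, not how PostgreSQL executes the query.

```python
# Rows produced by the GROUP BY step for column_1 = 1 in the question's data:
# (a1.column_1, a2.column_1, sorted shared column_2 values).
grouped_rows = [(1, 2, (1,)), (1, 3, (2,)), (1, 4, (3, 4)), (1, 5, (3, 4))]


def window_array_agg(rows):
    """Emulate ARRAY_AGG(a2.column_1) OVER (PARTITION BY a1.column_1, <array>)."""
    # A window function keeps every input row; each row receives the full
    # list of a2.column_1 values collected for its partition.
    partitions = {}
    for c1, other, arr in rows:
        partitions.setdefault((c1, arr), []).append(other)
    return [(c1, tuple(partitions[(c1, arr)]), arr) for c1, other, arr in rows]


windowed = window_array_agg(grouped_rows)

# Emulate SELECT DISTINCT: drop repeated output rows, keeping first-seen order.
distinct_rows = list(dict.fromkeys(windowed))

for row in distinct_rows:
    print(row)
```

The windowed list still has four rows, two of them identical, and the DISTINCT step reduces it to three. Unlike the SQL window aggregate, the Python lists here have a fixed order, so the sketch says nothing about the ordering caveat.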
Here's a SQL Fiddle demo: http://sqlfiddle.com/#!1/df5c3/4
Note, though, that the window ARRAY_AGG cannot have an ORDER BY like its "normal" counterpart. That means the order of a2.column_1 values in the list would be indeterminate, although in the linked demo it does happen to match the one in your expected output.
Przemek
Updated on June 17, 2022
Let's say I have a simple table agg_test with 3 columns: id, column_1 and column_2. Dataset, for example:

id|column_1|column_2
--------------------
 1|       1|       1
 2|       1|       2
 3|       1|       3
 4|       1|       4
 5|       2|       1
 6|       3|       2
 7|       4|       3
 8|       4|       4
 9|       5|       3
10|       5|       4
A query like this (with self join):
SELECT a1.column_1, a2.column_1,
       ARRAY_AGG(DISTINCT a1.column_2 ORDER BY a1.column_2)
FROM agg_test a1
JOIN agg_test a2 ON a1.column_2 = a2.column_2 AND a1.column_1 <> a2.column_1
WHERE a1.column_1 = 1
GROUP BY a1.column_1, a2.column_1
Will produce a result like this:
column_1|column_1|array_agg
---------------------------
       1|       2|      {1}
       1|       3|      {2}
       1|       4|    {3,4}
       1|       5|    {3,4}
We can see that for values 4 and 5 from the joined table we have the same result in the last column. So, is it possible to somehow group the results by it, e.g.:

column_1|column_1|array_agg
---------------------------
       1|     {2}|      {1}
       1|     {3}|      {2}
       1|   {4,5}|    {3,4}
Thanks for any answers. If anything isn't clear or can be presented in a better way - tell me in the comments and I'll try to make this question as readable as I can.