Join one to many and retrieve single result

sql postgresql select join one-to-many

30,655

Solution 1

Simpler, shorter, faster with PostgreSQL's DISTINCT ON:

SELECT DISTINCT ON (a.id)
       a.id, a.name, a.date, b.code1, b.code2
FROM   table_a a
LEFT   JOIN table_b b USING (id)
ORDER  BY a.id, b.sort

Details, explanation, benchmark and links in this closely related answer.
I use a LEFT JOIN, so that rows from table_a without any matching row in table_b are not dropped.

Side notes:

While being allowed in PostgreSQL, it's unwise to use date as column name. It's a reserved word in every SQL standard and a type name in PsotgreSQL.

It's also an anti-pattern to name an ID column id. Not descriptive and not helpful. One (of many) possible naming convention would be to name it after the table where it is primary key: table_a_id. Same name for foreign keys referencing it (if no other natural name takes precedence).

Solution 2

PostgreSQL supports window function. Try this,

SELECT d.ID, d.NAME, d.DATE, d.CODE1, d.CODE2
FROM
(
  SELECT  a.ID, a.NAME, a.DATE,
          b.CODE1, b.CODE2,
          ROW_NUMBER() OVER(PARTITION BY a.ID ORDER BY b.SORT ASC, b.CODE2 DESC) AS ROWNUM
  FROM    TableA a
          INNER JOIN TableB b
            ON a.ID = b.ID
) d
WHERE d.RowNum = 1

SQLFiddle Demo

Solution 3

Here's what I'd do on SQL Server.

SELECT a.ID,
    a.NAME,
    a.DATE,
    b.CODE1,
    b.CODE2
FROM TABLE_A a
JOIN TABLE_B b
    on a.ID = b.ID
WHERE b.SORT = (SELECT MIN(SORT) 
    FROM TABLE_B
    WHERE ID = b.ID)

30,655

Author by

thorgilsv

Programmer for the Icelandic Cancer Registry

Updated on September 18, 2020

Comments

thorgilsv over 3 years

I have two tables, in PostgreSQL if that matters, with one to many relations. I need to join them so that for each "one" I only get single result from the "many" table. Not only that but I need to single out specific results from the "many" table.

TABLE_A
ID  | NAME      | DATE          | MORE COLS....
1   | JOHN      | 2012-01-10    | ....
2   | LIZA      | 2012-01-10    | ....
3   | ANNY      | 2012-01-10    | ....
4   | JAMES     | 2012-01-10    | ....
...

TABLE_B
ID  | CODE1     | CODE2     | SORT
1   | 04020     | 85003     | 1
1   | 04030     | 85002     | 4
2   | 81000     | 80703     | 1
3   | 87010     | 80102     | 4
3   | 87010     | 84701     | 5
4   | 04810     | 85003     | 1
4   | 04030     | 85002     | 4
4   | 04020     | 85003     | 1
...

QUERY RESULT
ID  | NAME      | DATE          | CODE1     | CODE2
1   | JOHN      | 2012-01-10    | 04020     | 85003
2   | LIZA      | 2012-01-10    | 81000     | 80703
3   | ANNY      | 2012-01-10    | 87010     | 80102
4   | JAMES     | 2012-01-10    | 04810     | 85003
...

The SORT column in TABLE_B is actually the last char in CODE2 reordered. CODE2 can end with 1-9 but 3 is most important then 5, 7, 4, 2, 1, 0, 6, 8, 9 hence 3-->1, 5-->2, 7-->3 and so forth.

The problem I'm facing is that I need the row from TABLE_B where sort is the lowest number. In some cases there are multiple lowest case (see ID=4 in TABLE_B) then it doesn't matter which of the rows with lowest ID are selected, only that there is single result for that ID.