Join to only the "latest" record with t-sql

42,459

Solution 1

select a.*, bm.MaxRowDate
from (
    select TableAID, max(RowDate) as MaxRowDate
    from TableB
    group by TableAID
) bm
inner join TableA a on bm.TableAID = a.ID

If you need more columns from TableB, do this:

select a.*, b.* --use explicit columns rather than * here
from (
    select TableAID, max(RowDate) as MaxRowDate
    from TableB
    group by TableAID
) bm
inner join TableB b on bm.TableAID = b.TableAID
    and bm.MaxRowDate = b.RowDate
inner join TableA a on bm.TableAID = a.ID

Solution 2

SELECT *
FROM tableA A
OUTER APPLY (SELECT TOP 1 * 
             FROM tableB B
             WHERE A.ID = B.TableAID
             ORDER BY B.RowDate DESC) as B

Solution 3

table B join is optional: it depends if there are other columns you want

SELECT
    *
FROM
    tableA A
    JOIN
    tableB B ON A.ID = B.TableAID
    JOIN
    (
    SELECT Max(RowDate) AS MaxRowDate, TableAID
    FROM tableB
    GROUP BY TableAID
    ) foo ON B.TableAID = foo.TableAID AND B.RowDate= foo.MaxRowDate

Solution 4

With ABDateMap AS (
    SELECT Max(RowDate) AS LastDate, TableAID FROM TableB GROUP BY TableAID
),
LatestBRow As (
    SELECT MAX(ID) AS ID, TableAID FROM ABDateMap INNER JOIN TableB ON b.TableAID=a.ID AND b.RowDate = LastDate GROUP BY TableAID
)
SELECT columns
FROM TableA a
INNER JOIN LatestBRow m ON m.TableAID=a.ID
INNER JOIN TableB b on b.ID = m.ID

Solution 5

Just for the clarity's sake and to benefit those who will stumble upon this ancient question. The accepted answer would return duplicate rows if there are duplicate RowDate in Table B. A safer and more efficient way would be to utilize ROW_NUMBER():

Select a.*, b.* -- Use explicit column list rather than * here
From [Table A] a
Inner Join ( -- Use Left Join if the records missing from Table B are still required
    Select *,
        ROW_NUMBER() OVER (PARTITION BY TableAID ORDER BY RowDate DESC) As _RowNum
    From [Table B]
) b
On b.TableAID = a.ID
Where b._RowNum = 1
Share:
42,459

Related videos on Youtube

Joel Martinez
Author by

Joel Martinez

Software Engineer @ Xamarin (now a part of Microsoft). In the past, I founded the Orlando .NET User Group (ONETUG), wrote a few books, worked at a few startups, made a few games, and was a Microsoft XNA MVP for a few years. You can find me on twitter (@joelmartinez), or on the web at http://codecube.net

Updated on July 09, 2022

Comments

  • Joel Martinez
    Joel Martinez almost 2 years

    I've got two tables. Table "B" has a one to many relationship with Table "A", which means that there will be many records in table "B" for one record in table "A".

    The records in table "B" are mainly differentiated by a date, I need to produce a resultset that includes the record in table "A" joined with only the latest record in table "B". For illustration purpose, here's a sample schema:

    Table A
    -------
    ID
    
    Table B
    -------
    ID
    TableAID
    RowDate
    

    I'm having trouble formulating the query to give me the resultset I'm looking for any help would be greatly appreciated.

    • Paul Hadfield
      Paul Hadfield over 13 years
      In the case of two rows in table B having the same datetime, how will you define the single latest record? It is worth noting that DateTime is only good to 3ms (or something like that) whilst DateTime2 can measure down to nanoseconds (so is more accurate).
    • Lamak
      Lamak over 13 years
      The column to join [Table A] with [Table B] is TableAID?
    • Joel Martinez
      Joel Martinez over 13 years
      Should have included this ... you can assume that there won't be duplicate datetimes, so there will always be a "latest" :-)
  • Paul Hadfield
    Paul Hadfield over 13 years
    This does not quite answer the question fully as it only gets max row date from table b. The question has asked for the most recent row - so rest of table b needs to be returned for that row and also needs to take into account Table B having two entries of the same date/time for a TableA reference
  • D'Arcy Rittich
    D'Arcy Rittich over 13 years
    @Paul: I assumed TableB only has specified fields by OP. Have modified query to handle the case you mention. Duplicates may or may not be an issue for the user's data.
  • Joel Martinez
    Joel Martinez over 13 years
    Just curious ... why the avoidance of "*"? what are the draw backs to using it if I really want all of the columns in the resultset anyways?
  • D'Arcy Rittich
    D'Arcy Rittich over 13 years
    @Joel: avoiding * solves problems of: duplicate columns, assumptions about returned column order, performance penalty of selecting more data than you need, the list goes on.
  • Michael B.
    Michael B. about 9 years
    outer apply is much quicker !
  • Sonny Childs
    Sonny Childs about 8 years
    That outer apply is blazing fast compared to other approaches I tried on my system.
  • Maderas
    Maderas about 7 years
    I just tested the accepted answer vs this answer and I found that the outer apply was ~2 times slower than the accepted answer!
  • Daniel Lorenz
    Daniel Lorenz almost 7 years
    @Maderas You ran them both more than once during the testing, right? Otherwise, the data could have been cached.
  • Maderas
    Maderas almost 7 years
    @DanielLorenz I don't remember. I usually run them a minimum of 5 times each. I also tend to restart my dev servers before I test something new to find out how well it performs, but... i don't remember.