Join to only the "latest" record with t-sql
Solution 1
select a.*, bm.MaxRowDate
from (
select TableAID, max(RowDate) as MaxRowDate
from TableB
group by TableAID
) bm
inner join TableA a on bm.TableAID = a.ID
If you need more columns from TableB, do this:
select a.*, b.* --use explicit columns rather than * here
from (
select TableAID, max(RowDate) as MaxRowDate
from TableB
group by TableAID
) bm
inner join TableB b on bm.TableAID = b.TableAID
and bm.MaxRowDate = b.RowDate
inner join TableA a on bm.TableAID = a.ID
Solution 2
SELECT *
FROM tableA A
OUTER APPLY (SELECT TOP 1 *
FROM tableB B
WHERE A.ID = B.TableAID
ORDER BY B.RowDate DESC) as B
Solution 3
table B join is optional: it depends if there are other columns you want
SELECT
*
FROM
tableA A
JOIN
tableB B ON A.ID = B.TableAID
JOIN
(
SELECT Max(RowDate) AS MaxRowDate, TableAID
FROM tableB
GROUP BY TableAID
) foo ON B.TableAID = foo.TableAID AND B.RowDate= foo.MaxRowDate
Solution 4
With ABDateMap AS (
SELECT Max(RowDate) AS LastDate, TableAID FROM TableB GROUP BY TableAID
),
LatestBRow As (
SELECT MAX(ID) AS ID, TableAID FROM ABDateMap INNER JOIN TableB ON b.TableAID=a.ID AND b.RowDate = LastDate GROUP BY TableAID
)
SELECT columns
FROM TableA a
INNER JOIN LatestBRow m ON m.TableAID=a.ID
INNER JOIN TableB b on b.ID = m.ID
Solution 5
Just for the clarity's sake and to benefit those who will stumble upon this ancient question. The accepted answer would return duplicate rows if there are duplicate RowDate
in Table B
. A safer and more efficient way would be to utilize ROW_NUMBER()
:
Select a.*, b.* -- Use explicit column list rather than * here
From [Table A] a
Inner Join ( -- Use Left Join if the records missing from Table B are still required
Select *,
ROW_NUMBER() OVER (PARTITION BY TableAID ORDER BY RowDate DESC) As _RowNum
From [Table B]
) b
On b.TableAID = a.ID
Where b._RowNum = 1
Related videos on Youtube
Joel Martinez
Software Engineer @ Xamarin (now a part of Microsoft). In the past, I founded the Orlando .NET User Group (ONETUG), wrote a few books, worked at a few startups, made a few games, and was a Microsoft XNA MVP for a few years. You can find me on twitter (@joelmartinez), or on the web at http://codecube.net
Updated on July 09, 2022Comments
-
Joel Martinez almost 2 years
I've got two tables. Table "B" has a one to many relationship with Table "A", which means that there will be many records in table "B" for one record in table "A".
The records in table "B" are mainly differentiated by a date, I need to produce a resultset that includes the record in table "A" joined with only the latest record in table "B". For illustration purpose, here's a sample schema:
Table A ------- ID Table B ------- ID TableAID RowDate
I'm having trouble formulating the query to give me the resultset I'm looking for any help would be greatly appreciated.
-
Paul Hadfield over 13 yearsIn the case of two rows in table B having the same datetime, how will you define the single latest record? It is worth noting that DateTime is only good to 3ms (or something like that) whilst DateTime2 can measure down to nanoseconds (so is more accurate).
-
Lamak over 13 yearsThe column to join [Table A] with [Table B] is TableAID?
-
Joel Martinez over 13 yearsShould have included this ... you can assume that there won't be duplicate datetimes, so there will always be a "latest" :-)
-
-
Paul Hadfield over 13 yearsThis does not quite answer the question fully as it only gets max row date from table b. The question has asked for the most recent row - so rest of table b needs to be returned for that row and also needs to take into account Table B having two entries of the same date/time for a TableA reference
-
D'Arcy Rittich over 13 years@Paul: I assumed TableB only has specified fields by OP. Have modified query to handle the case you mention. Duplicates may or may not be an issue for the user's data.
-
Joel Martinez over 13 yearsJust curious ... why the avoidance of "*"? what are the draw backs to using it if I really want all of the columns in the resultset anyways?
-
D'Arcy Rittich over 13 years@Joel: avoiding * solves problems of: duplicate columns, assumptions about returned column order, performance penalty of selecting more data than you need, the list goes on.
-
Michael B. about 9 yearsouter apply is much quicker !
-
Sonny Childs about 8 yearsThat outer apply is blazing fast compared to other approaches I tried on my system.
-
Maderas about 7 yearsI just tested the accepted answer vs this answer and I found that the outer apply was ~2 times slower than the accepted answer!
-
Daniel Lorenz almost 7 years@Maderas You ran them both more than once during the testing, right? Otherwise, the data could have been cached.
-
Maderas almost 7 years@DanielLorenz I don't remember. I usually run them a minimum of 5 times each. I also tend to restart my dev servers before I test something new to find out how well it performs, but... i don't remember.