T-SQL - SELECT by nearest date and GROUPED BY ID

sql sql-server sql-server-2005 tsql

12,617

Solution 1

You can try this:

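DECLARE @Date DATETIME;   -- the DATE type and inline initialization need SQL Server 2008+
SET @Date = '20101001';   -- 1 October 2010 in the unambiguous yyyymmdd form

WITH cte AS
    (
    -- Number the rows in each LinkedID group, closest date first
    SELECT ID, LinkedID, ABS(DATEDIFF(DD, @Date, [Date])) diff,
        ROW_NUMBER() OVER (PARTITION BY LinkedID ORDER BY ABS(DATEDIFF(DD, @Date, [Date]))) AS SEQUENCE
    FROM MyTable
    )

-- Keep only the closest row per group
SELECT *
FROM cte
WHERE SEQUENCE = 1
ORDER BY ID
;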

You didn't indicate how you want to handle the case where multiple rows in a LinkedID group are equally close to the target date. This solution returns only one row per group, and in that case you can't guarantee which of the tied rows is included.
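
If a predictable result matters, one option is to add a tie-breaker to the ORDER BY. The sketch below assumes the names used in the query above (MyTable, ID, LinkedID and a date column called [Date]) and arbitrarily prefers the lowest ID among equally close rows; any other unique column would serve equally well.

```sql
-- Sketch only: MyTable, ID, LinkedID and [Date] are assumed from the query above.
DECLARE @Date DATETIME;
SET @Date = '20101001';  -- unambiguous yyyymmdd literal

WITH cte AS
    (
    SELECT ID, LinkedID, [Date],
        ROW_NUMBER() OVER (
            PARTITION BY LinkedID
            -- Closest date first; ties broken by lowest ID (an arbitrary choice)
            ORDER BY ABS(DATEDIFF(DD, @Date, [Date])), ID
        ) AS RowSeq
    FROM MyTable
    )
SELECT ID, [Date], LinkedID
FROM cte
WHERE RowSeq = 1
ORDER BY ID;
```

Because ID is unique, the ordering inside each group is now total, so repeated runs pick the same row.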

You can replace ROW_NUMBER() with RANK() in the query if you want to include all rows that share the closest value.
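
As a rough sketch of that change, under the same assumed names (MyTable, ID, LinkedID, [Date]): RANK() gives every row tied for the smallest difference the rank 1, so all of them pass the filter.

```sql
-- Sketch only: table and column names are assumed, not confirmed by the question.
DECLARE @Date DATETIME;
SET @Date = '20101001';

WITH cte AS
    (
    SELECT ID, LinkedID, [Date],
        -- Tied rows share the same rank, unlike ROW_NUMBER()
        RANK() OVER (PARTITION BY LinkedID
                     ORDER BY ABS(DATEDIFF(DD, @Date, [Date]))) AS DateRank
    FROM MyTable
    )
SELECT ID, [Date], LinkedID
FROM cte
WHERE DateRank = 1
ORDER BY LinkedID, ID;
```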

Solution 2

You want to look at the absolute value of the DATEDIFF function (http://msdn.microsoft.com/en-us/library/ms189794.aspx) by days.

The query can look something like this (not tested):

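with absDates as 
(
   -- distance in days between each row and the target date
   select *, abs(DATEDIFF(day, Date_Column, '20101001')) as days
   from MyTable
), mdays as
( 
   -- smallest distance for each linkedid
   select min(days) as mdays, linkedid
   from absDates
   group by linkedid
)
-- rows whose distance equals their group's minimum (ties are all returned)
select * 
from absDates
inner join mdays on absDates.linkedid = mdays.linkedid and absDates.days = mdays.mdays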
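
The comments below suggest that the same two-step idea also works without CTEs, by materializing each step in a temp table. A minimal sketch, assuming the MyTable, ID, LinkedID and [Date] names from Solution 1; the temp table names #AbsDates and #MinDays are hypothetical:

```sql
-- Step 1: distance in days from the target date for every row
SELECT ID, [Date], LinkedID,
       ABS(DATEDIFF(day, [Date], '20101001')) AS DaysAway
INTO #AbsDates
FROM MyTable;

-- Step 2: smallest distance per LinkedID
SELECT LinkedID, MIN(DaysAway) AS MinDays
INTO #MinDays
FROM #AbsDates
GROUP BY LinkedID;

-- Step 3: keep the rows that match their group's minimum (ties are all returned)
SELECT a.ID, a.[Date], a.LinkedID
FROM #AbsDates AS a
INNER JOIN #MinDays AS m
    ON a.LinkedID = m.LinkedID AND a.DaysAway = m.MinDays
ORDER BY a.ID;

-- Clean up the temp tables
DROP TABLE #AbsDates;
DROP TABLE #MinDays;
```

The DATEDIFF call is still SQL Server specific, so another database would need its own date-difference function in step 1.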


Author by

Iain Ward

Software Developer currently working in Solihull, Birmingham developing web systems in C# and ASP.NET

Updated on June 12, 2022

Comments

Iain Ward about 2 years
From the data below I need to select the record nearest to a specified date for each Linked ID using SQL Server 2005:
```
ID     Date      Linked ID
...........................
1    2010-09-02     25
2    2010-09-01     25
3    2010-09-08     39
4    2010-09-09     39
5    2010-09-10     39
6    2010-09-10     34
7    2010-09-29     34
8    2010-10-01     37
9    2010-10-02     36
10   2010-10-03     36
```
So selecting them using 01/10/2010 should return:
```
1    2010-09-02     25 
5    2010-09-10     39
7    2010-09-29     34 
8    2010-10-01     37
9    2010-10-02     36
```
I know this must be possible, but can't seem to get my head round it (must be too near the end of the day :P). If anyone can help or give me a gentle shove in the right direction it would be greatly appreciated!

EDIT: Also I have come across this sql to get the closest date:
```
abs(DATEDIFF(minute, Date_Column, '2010/10/01'))
```
but couldn't figure out how to incorporate into the query properly...

Thanks
Iain Ward over 13 years

@Hogan Ah yes I have come across it already, but forgot to mention that in the question, so I've updated it. Thanks for mentioning that
Hogan over 13 years

@w69rdy : example query added.
Hogan over 13 years

this is nicer than my query (only one select) but mine may be clearer to a beginner...
Sean Reilly over 13 years

The example query might not be right -- it should be min(days), otherwise you would return the largest difference, right? Also, I don't think that the performance on this will be very good at all. In general I would recommend using the ROW_NUMBER() solution. It should be more straightforward and more performant.
bobs over 13 years

@Hogan, your query fails right now. Where is table absdays defined?
Sean Reilly over 13 years

More information on ROW_NUMBER() is available here: msdn.microsoft.com/en-us/library/ms186734.aspx
Hogan over 13 years

@bobs : It wasn't :D I fixed the typo.
Hogan over 13 years

@Sean : I changed the max to min -- and I agree that Row_Number() will be more performant. I don't agree that it is clearer since it is a MS only solution and not part of standard SQL. Of course that does ring hollow since I'm using CTEs.
Hogan over 13 years

Wow, this is def. least performant -- a select for every linkedid.
Iain Ward over 13 years

+1 @Hogan Thanks for your answer, it works but I think I prefer bobs' as it's a little bit slicker and removes duplicates :)
pcofre over 13 years

Yes, it's just for small tables
Thomas over 13 years

@Hogan - I'm not sure I agree. If you are going to use a CTE anyway, then you might as well take advantage of Row_Number().
Iain Ward over 13 years

Thanks for your answer. Duplicates are irrelevant; as long as it returns one, that's all I need.
Hogan over 13 years

@Thomas : Yeah, if you see my comment on my answer to @Sean I basically said as much (and voted for your answer.)
Hogan over 13 years

@w69rdy : yep, it is a better TSQL answer. This is the way to do it if you don't have TSQL (and use temp tables instead of CTEs).
Sean Reilly over 13 years

@Hogan: it's worth noting that if temp tables were used, performance probably wouldn't be a problem. I would say this might be the preferred solution for databases other than mssql 2005 and later.
Hogan over 13 years

@Sean : I agree unless that SQL includes functionality that allows for ranking.