How can I select the nearest value less-than and greater-than a given value efficiently?
For simplicity you may at least use MAX()
and MIN()
functions for querying timestamp
field instead of TOP 1
and ORDER BY
.
Full query will be
SELECT
V.Timestamp,
V.Value,
(SELECT MAX(Timestamp) FROM dbo.location WHERE Timestamp <= V.Timestamp) as LowerTime,
(SELECT TOP 1 Position FROM dbo.location WHERE Timestamp <= V.Timestamp ORDER BY timestamp DESC) as LowerLocation,
(SELECT MIN(Timestamp) FROM dbo.location WHERE Timestamp >= V.Timestamp) as UpperTime,
(SELECT TOP 1 Position FROM dbo.location WHERE Timestamp >= V.Timestamp ORDER BY timestamp ASC) as UpperLocation
FROM
dbo.value V
Comments
-
brianestey almost 2 years
I have two tables, one for values one for location and am trying to interpolate location. The tables have been simplified to the following:
CREATE TABLE value( Timestamp DATETIME2, Value float NOT NULL, PRIMARY KEY(Timestamp) ); CREATE TABLE location( Timestamp DATETIME2, Position INT NOT NULL, PRIMARY KEY(Timestamp) ); INSERT INTO value VALUES ('2011/12/1 16:55:01', 1), ('2011/12/1 16:55:02', 5), ('2011/12/1 16:55:05', 10), ('2011/12/1 16:55:08', 6); INSERT INTO location VALUES ('2011/12/1 16:55:00', 0), ('2011/12/1 16:55:05', 10), ('2011/12/1 16:55:10', 5)
The expected results would be
TimeStamp, Value, LowerTime, LowerLocation, UpperTime, UpperLocation 2011-12-01 16:55:01, 1, 2011-12-01 16:55:00, 0, 2011-12-01 16:55:05, 10 2011-12-01 16:55:02, 5, 2011-12-01 16:55:00, 0, 2011-12-01 16:55:05, 10 2011-12-01 16:55:05, 10, 2011-12-01 16:55:05, 10, 2011-12-01 16:55:05, 10 2011-12-01 16:55:08, 6, 2011-12-01 16:55:05, 10, 2011-12-01 16:55:10, 5
(Keep in mind this is simplified sample data to get the idea of the query I am trying to perform across.)
To do the interpolation, I need to figure out the time and locations before and after a given values time. I am currently doing this with a query that looks like:
SELECT V.Timestamp, V.Value, (SELECT MAX(Timestamp) FROM dbo.location WHERE Timestamp <= V.Timestamp) as LowerTime, (SELECT TOP 1 Position FROM dbo.location WHERE Timestamp <= V.Timestamp ORDER BY timestamp DESC) as LowerLocation, (SELECT MIN(Timestamp) FROM dbo.location WHERE Timestamp >= V.Timestamp) as UpperTime, (SELECT TOP 1 Position FROM dbo.location WHERE Timestamp >= V.Timestamp ORDER BY timestamp ASC) as UpperLocation FROM dbo.value V
Now this works, but this obviously is doing a lot of work. I'm thinking there must be a query simplification that I'm missing but I've been playing with it all morning and haven't come up with anything concrete. Hoping someone here has a better idea.
I am currently exploring if there is a way to figure out the LowerTime and UpperTime and use those in determining the Locations. Something like:
SELECT V.Timestamp, V.Value, (SELECT MAX(Timestamp) FROM dbo.location WHERE Timestamp <= V.Timestamp) as LowerTime, (SELECT Position FROM dbo.location WHERE Timestamp = LowerTime) as LowerLocation, (SELECT MIN(Timestamp) FROM dbo.location WHERE Timestamp >= V.Timestamp) as UpperTime, (SELECT Position FROM dbo.location WHERE Timestamp = UpperTime) as UpperLocation FROM dbo.value V
but this doesn't work.
EDIT1: Updated query as suggested. However no visible change in execution time.
EDIT2: Added my thoughts of the approach I am currently trying.
-
brianestey over 12 yearsThanks for the tip! That's a good point - I will try that. I am hoping to somehow reduce or get rid of those 4 queries though.
-
brianestey over 12 yearsThanks for the input - I don't know what this is doing exactly. Running it on a simple set of data yielded no results. I will study your answer and see if helps.
-
Damien_The_Unbeliever over 12 years@brianestey - well, if you have some sample data and expected results, please add them to your question - it's a lot easier for us to write working queries if we're able to test them ourselves before posting them as answers.
-
brianestey over 12 yearsgood idea, I've updated the question again with some data points I am using for testing and also the expected output I am looking for.
-
Damien_The_Unbeliever over 12 years@brianestey - I've now partitioned by
v.Timestamp
in my CTE, and get the expected results you've posted. (I'd initially tested with only a single row in thevalue
table, oops) -
brianestey over 12 yearsThanks for the update. I've tested this and it does give the expected results but I am having some performance issues with this. For example, I inserted 5000 rows into the value table, 1250 rows into the location table and the original query completes in 0~1 second, but this query takes minutes with the same data (~5.5 minutes). Does that sound reasonable to you?
-
brianestey about 12 yearsThanks. I've marked your answer as the answer as it was the best of the options given. This did help a bit.