How can I select the nearest value less-than and greater-than a given value efficiently?

28,210

For simplicity you may at least use MAX() and MIN() functions for querying timestamp field instead of TOP 1 and ORDER BY.

Full query will be

SELECT 
    V.Timestamp, 
    V.Value, 
    (SELECT MAX(Timestamp) FROM dbo.location WHERE Timestamp <= V.Timestamp) as LowerTime,
    (SELECT TOP 1 Position FROM dbo.location WHERE Timestamp <= V.Timestamp ORDER BY timestamp DESC) as LowerLocation,
    (SELECT MIN(Timestamp) FROM dbo.location WHERE Timestamp >= V.Timestamp) as UpperTime,
    (SELECT TOP 1 Position FROM dbo.location WHERE Timestamp >= V.Timestamp ORDER BY timestamp ASC) as UpperLocation
 FROM 
    dbo.value V 
Share:
28,210
brianestey
Author by

brianestey

Software developer in Tokyo, Japan.

Updated on July 09, 2022

Comments

  • brianestey
    brianestey almost 2 years

    I have two tables, one for values one for location and am trying to interpolate location. The tables have been simplified to the following:

    CREATE TABLE value(
        Timestamp DATETIME2,
        Value float NOT NULL,
        PRIMARY KEY(Timestamp)
    );
    
    CREATE TABLE location(
        Timestamp DATETIME2,
        Position INT NOT NULL,
        PRIMARY KEY(Timestamp)
    ); 
    
    INSERT INTO value VALUES 
        ('2011/12/1 16:55:01', 1),
        ('2011/12/1 16:55:02', 5),
        ('2011/12/1 16:55:05', 10),
        ('2011/12/1 16:55:08', 6);
    
    INSERT INTO location VALUES 
        ('2011/12/1 16:55:00', 0),
        ('2011/12/1 16:55:05', 10),
        ('2011/12/1 16:55:10', 5)
    

    The expected results would be

    TimeStamp, Value, LowerTime, LowerLocation, UpperTime, UpperLocation
    2011-12-01 16:55:01,  1, 2011-12-01 16:55:00,  0, 2011-12-01 16:55:05, 10
    2011-12-01 16:55:02,  5, 2011-12-01 16:55:00,  0, 2011-12-01 16:55:05, 10
    2011-12-01 16:55:05, 10, 2011-12-01 16:55:05, 10, 2011-12-01 16:55:05, 10
    2011-12-01 16:55:08,  6, 2011-12-01 16:55:05, 10, 2011-12-01 16:55:10,  5
    

    (Keep in mind this is simplified sample data to get the idea of the query I am trying to perform across.)

    To do the interpolation, I need to figure out the time and locations before and after a given values time. I am currently doing this with a query that looks like:

    SELECT 
        V.Timestamp, 
        V.Value, 
        (SELECT MAX(Timestamp) FROM dbo.location WHERE Timestamp <= V.Timestamp) as LowerTime,
        (SELECT TOP 1 Position FROM dbo.location WHERE Timestamp <= V.Timestamp ORDER BY timestamp DESC) as LowerLocation,
        (SELECT MIN(Timestamp) FROM dbo.location WHERE Timestamp >= V.Timestamp) as UpperTime,
        (SELECT TOP 1 Position FROM dbo.location WHERE Timestamp >= V.Timestamp ORDER BY timestamp ASC) as UpperLocation
     FROM 
        dbo.value V 
    

    Now this works, but this obviously is doing a lot of work. I'm thinking there must be a query simplification that I'm missing but I've been playing with it all morning and haven't come up with anything concrete. Hoping someone here has a better idea.

    I am currently exploring if there is a way to figure out the LowerTime and UpperTime and use those in determining the Locations. Something like:

    SELECT 
        V.Timestamp, 
        V.Value, 
        (SELECT MAX(Timestamp) FROM dbo.location WHERE Timestamp <= V.Timestamp) as LowerTime,
        (SELECT Position FROM dbo.location WHERE Timestamp = LowerTime) as LowerLocation,
        (SELECT MIN(Timestamp) FROM dbo.location WHERE Timestamp >= V.Timestamp) as UpperTime,
        (SELECT Position FROM dbo.location WHERE Timestamp = UpperTime) as UpperLocation
     FROM 
        dbo.value V 
    

    but this doesn't work.

    EDIT1: Updated query as suggested. However no visible change in execution time.

    EDIT2: Added my thoughts of the approach I am currently trying.

  • brianestey
    brianestey over 12 years
    Thanks for the tip! That's a good point - I will try that. I am hoping to somehow reduce or get rid of those 4 queries though.
  • brianestey
    brianestey over 12 years
    Thanks for the input - I don't know what this is doing exactly. Running it on a simple set of data yielded no results. I will study your answer and see if helps.
  • Damien_The_Unbeliever
    Damien_The_Unbeliever over 12 years
    @brianestey - well, if you have some sample data and expected results, please add them to your question - it's a lot easier for us to write working queries if we're able to test them ourselves before posting them as answers.
  • brianestey
    brianestey over 12 years
    good idea, I've updated the question again with some data points I am using for testing and also the expected output I am looking for.
  • Damien_The_Unbeliever
    Damien_The_Unbeliever over 12 years
    @brianestey - I've now partitioned by v.Timestamp in my CTE, and get the expected results you've posted. (I'd initially tested with only a single row in the value table, oops)
  • brianestey
    brianestey over 12 years
    Thanks for the update. I've tested this and it does give the expected results but I am having some performance issues with this. For example, I inserted 5000 rows into the value table, 1250 rows into the location table and the original query completes in 0~1 second, but this query takes minutes with the same data (~5.5 minutes). Does that sound reasonable to you?
  • brianestey
    brianestey about 12 years
    Thanks. I've marked your answer as the answer as it was the best of the options given. This did help a bit.