SQL Find the first non null value with arbitrary order by

13,730

Solution 1

Just do :

SELECT TOP 1 Value
FROM mytable
WHERE Value IS NOT NULL
ORDER BY ID

To get the last not null value switch the ordering:

ORDER BY ID DESC

To expand the same logic to a greater number of columns you can use the following query:

SELECT (SELECT TOP 1 Value1 FROM mytable 
        WHERE Value1 IS NOT NULL ORDER BY ID) AS min_Value1,
       (SELECT TOP 1 Value2 FROM mytable 
        WHERE Value2 IS NOT NULL ORDER BY ID) AS min_Value2

Demo here

Solution 2

Original answer is definitely the best for the provided example.

For those out there who need to do this within particular partitions and at a large volume, this becomes problematic and intensive, as it can translate into a huge number of individual table hits on the back end.

I have a solution using window functions that solves the problem for large data / partitions.

Expanding the original example data as follows - let's have two sets in the data, based on GroupID (1 and 2):

GroupID | ID | Value1 | Value2
--------------------------
1       |  1 |  NULL  |   AA
1       |  2 |   B    |  NULL
1       |  3 |   C    |   CC
1       |  4 |  NULL  |  NULL
2       |  5 |   E    |   EE
2       |  6 |   F    |  NULL
2       |  7 |  NULL  |   GG
2       |  8 |  NULL  |  NULL

And I want to provide these first/last non-null values for each GroupID value, like this:

GroupID | FirstValue1 | FirstValue2 | LastValue1 | LastValue2
-------------------------------------------------------------
1       | B           | AA          | C          | CC
2       | E           | EE          | F          | GG

If I apply the top 1 logic to scale, and I have lots of rows, it's reaching back to perform tons of individual table queries. However, if I use window functions, I can have it evaluate data it has in memory to more efficiently get the data, esp when partitions are needed and lots of data.

Here is the solution:

SELECT
  GroupID,
  FirstValue1,
  FirstValue2,
  LastValue1,
  LastValue2
FROM
  (
  SELECT
    GroupID,
    ID, Value1, Value2,
    -- Sets a rank so we can reduce to 1 row per GroupID in outer query
    DENSE_RANK() OVER (
      PARTITION BY GroupID
      ORDER BY ID
    ) IDRank,
    FIRST_VALUE(Value1) OVER (
      PARTITION BY GroupID
      ORDER BY
        CASE WHEN Value1 IS NULL THEN 2 ELSE 1 END, -- Prioritize non-null Value1
        ID                                          -- And ascending ID
    ) FirstValue1,
    FIRST_VALUE(Value2) OVER (
      PARTITION BY GroupID
      ORDER BY
        CASE WHEN Value2 IS NULL THEN 2 ELSE 1 END, -- Prioritize non-null Value2
        ID                                          -- And ascending ID
    ) FirstValue2,
    FIRST_VALUE(Value1) OVER (
      PARTITION BY GroupID
      ORDER BY
        CASE WHEN Value1 IS NULL THEN 2 ELSE 1 END, -- Prioritize non-null Value1
        ID DESC                                     -- And descending ID
    ) LastValue1,
    FIRST_VALUE(Value2) OVER (
      PARTITION BY GroupID
      ORDER BY
        CASE WHEN Value2 IS NULL THEN 2 ELSE 1 END, -- Prioritize non-null Value2
        ID DESC                                     -- And descending ID
    ) LastValue2
  FROM MY_TABLE
  ) BestValues
WHERE IDRank = 1  --Ensures we get only one row per GroupID
ORDER BY GroupID

Explained - FIRST_VALUE evaluates within each distinct GroupID (since we set PARTITION BY GroupID). Within that window, it evaluates, of the rows for that GroupID, cases with non-null values, and then ordering by ID (ascending or descending). As it is a window function, this puts the value on the level of the records in their original granularity, so we also add a DENSE_RANK to rank and give us the ability to, in the outer query, get down to one row per GroupID.


p.s., if you would like to run it yourself, include the following above the query to generate the needed sample data into "MY_TABLE" in a CTE.

WITH MY_TABLE AS (
  SELECT 0 as GroupID, 0 AS ID, CAST(NULL AS VARCHAR(10)) as Value1, CAST(NULL AS VARCHAR(10)) as Value2 WHERE 0=1 UNION ALL --Dummy row to set types

  SELECT 1 AS GroupID, 1 AS ID, NULL AS Value1, 'AA' AS Value2 UNION ALL
  SELECT 1 AS GroupID, 2 AS ID, 'B'  AS Value1, NULL AS Value2 UNION ALL
  SELECT 1 AS GroupID, 3 AS ID, 'C'  AS Value1, 'CC' AS Value2 UNION ALL
  SELECT 1 AS GroupID, 4 AS ID, NULL AS Value1, NULL AS Value2 UNION ALL
  SELECT 2 AS GroupID, 5 AS ID, 'E'  AS Value1, 'EE' AS Value2 UNION ALL
  SELECT 2 AS GroupID, 6 AS ID, 'F'  AS Value1, NULL AS Value2 UNION ALL
  SELECT 2 AS GroupID, 7 AS ID, NULL AS Value1, NULL AS Value2 UNION ALL
  SELECT 2 AS GroupID, 8 AS ID, NULL AS Value1, 'GG' AS Value2
)

Solution 3

In MySQL and I believe most versions you can use the Coalesce function, it simply returns the first non-null value in the list.

Your query would end up being

SELECT COALESCE(Value ORDER BY ID) FROM MY_TABLE

Example here

Share:
13,730
user1589188
Author by

user1589188

Updated on June 04, 2022

Comments

  • user1589188
    user1589188 about 2 years

    Given the following table:

    ID | Value
    ----------
     1 | NULL
     2 |  B
     3 |  C
     4 | NULL
    

    I want to get the first non-null value from the Value column according to different ordering, e.g.:

    SELECT FIRST_NON_NULL(Value ORDER BY ID) FROM MY_TABLE
    

    This will return B

    SELECT FIRST_NON_NULL(Value ORDER BY ID DESC) FROM MY_TABLE
    

    This will return C

    P.S. Doesn't have to be a value function, just the SELECT statement that is needed. Thank you.


    Edited: Want to see if there can be a scalable version.

    ID | Value1 | Value2
    --------------------
     1 |  NULL  |   AA
     2 |   B    |  NULL
     3 |   C    |   CC
     4 |  NULL  |  NULL
    
    SELECT FIRST_NON_NULL(Value1, Value2 ORDER BY ID) FROM MY_TABLE
    

    This will return B, AA

    SELECT FIRST_NON_NULL(Value1, Value2 ORDER BY ID DESC) FROM MY_TABLE
    

    This will return C, CC