How to Join to first row

699,631

Solution 1

SELECT   Orders.OrderNumber, LineItems.Quantity, LineItems.Description
FROM     Orders
JOIN     LineItems
ON       LineItems.LineItemGUID =
         (
         SELECT  TOP 1 LineItemGUID 
         FROM    LineItems
         WHERE   OrderID = Orders.OrderID
         )

In SQL Server 2005 and above, you could just replace INNER JOIN with CROSS APPLY:

SELECT  Orders.OrderNumber, LineItems2.Quantity, LineItems2.Description
FROM    Orders
CROSS APPLY
        (
        SELECT  TOP 1 LineItems.Quantity, LineItems.Description
        FROM    LineItems
        WHERE   LineItems.OrderID = Orders.OrderID
        ) LineItems2

Please note that TOP 1 without ORDER BY is not deterministic: this query you will get you one line item per order, but it is not defined which one will it be.

Multiple invocations of the query can give you different line items for the same order, even if the underlying did not change.

If you want deterministic order, you should add an ORDER BY clause to the innermost query.

Example sqlfiddle

Solution 2

I know this question was answered a while ago, but when dealing with large data sets, nested queries can be costly. Here is a different solution where the nested query will only be ran once, instead of for each row returned.

SELECT 
  Orders.OrderNumber,
  LineItems.Quantity, 
  LineItems.Description
FROM 
  Orders
  INNER JOIN (
    SELECT
      Orders.OrderNumber,
      Max(LineItem.LineItemID) AS LineItemID
    FROM
      Orders INNER JOIN LineItems
      ON Orders.OrderNumber = LineItems.OrderNumber
    GROUP BY Orders.OrderNumber
  ) AS Items ON Orders.OrderNumber = Items.OrderNumber
  INNER JOIN LineItems 
  ON Items.LineItemID = LineItems.LineItemID

Solution 3

@Quassnoi answer is good, in some cases (especially if the outer table is big), a more efficient query might be with using windowed functions, like this:

SELECT  Orders.OrderNumber, LineItems2.Quantity, LineItems2.Description
FROM    Orders
LEFT JOIN 
        (
        SELECT  LineItems.Quantity, LineItems.Description, OrderId, ROW_NUMBER()
                OVER (PARTITION BY OrderId ORDER BY (SELECT NULL)) AS RowNum
        FROM    LineItems

        ) LineItems2 ON LineItems2.OrderId = Orders.OrderID And RowNum = 1

Sometimes you just need to test which query gives better performance.

Solution 4

You could do:

SELECT 
  Orders.OrderNumber, 
  LineItems.Quantity, 
  LineItems.Description
FROM 
  Orders INNER JOIN LineItems 
  ON Orders.OrderID = LineItems.OrderID
WHERE
  LineItems.LineItemID = (
    SELECT MIN(LineItemID) 
    FROM   LineItems
    WHERE  OrderID = Orders.OrderID
  )

This requires an index (or primary key) on LineItems.LineItemID and an index on LineItems.OrderID or it will be slow.

Solution 5

From SQL Server 2012 and onwards I think this will do the trick:

SELECT DISTINCT
    o.OrderNumber ,
    FIRST_VALUE(li.Quantity) OVER ( PARTITION BY o.OrderNumber ORDER BY li.Description ) AS Quantity ,
    FIRST_VALUE(li.Description) OVER ( PARTITION BY o.OrderNumber ORDER BY li.Description ) AS Description
FROM    Orders AS o
    INNER JOIN LineItems AS li ON o.OrderID = li.OrderID
Share:
699,631
mistertodd
Author by

mistertodd

Any code is public domain. No attribution required. జ్ఞా <sup>🕗</sup>🕗 Yes, i do write i with a lowercase i. The Meta Stackexchange answer that I am most proud of

Updated on January 12, 2021

Comments

  • mistertodd
    mistertodd over 3 years

    I'll use a concrete, but hypothetical, example.

    Each Order normally has only one line item:

    Orders:

    OrderGUID   OrderNumber
    =========   ============
    {FFB2...}   STL-7442-1      
    {3EC6...}   MPT-9931-8A
    

    LineItems:

    LineItemGUID   Order ID Quantity   Description
    ============   ======== ========   =================================
    {098FBE3...}   1        7          prefabulated amulite
    {1609B09...}   2        32         spurving bearing
    

    But occasionally there will be an order with two line items:

    LineItemID   Order ID    Quantity   Description
    ==========   ========    ========   =================================
    {A58A1...}   6,784,329   5          pentametric fan
    {0E9BC...}   6,784,329   5          differential girdlespring 
    

    Normally when showing the orders to the user:

    SELECT Orders.OrderNumber, LineItems.Quantity, LineItems.Description
    FROM Orders
        INNER JOIN LineItems 
        ON Orders.OrderID = LineItems.OrderID
    

    I want to show the single item on the order. But with this occasional order containing two (or more) items, the orders would appear be duplicated:

    OrderNumber   Quantity   Description
    ===========   ========   ====================
    STL-7442-1    7          prefabulated amulite
    MPT-9931-8A   32         spurving bearing
    KSG-0619-81   5          panametric fan
    KSG-0619-81   5          differential girdlespring
    

    What I really want is to have SQL Server just pick one, as it will be good enough:

    OrderNumber   Quantity   Description
    ===========   ========   ====================
    STL-7442-1    7          prefabulated amulite
    MPT-9931-8A   32         differential girdlespring
    KSG-0619-81   5          panametric fan
    

    If I get adventurous, I might show the user, an ellipsis to indicate that there's more than one:

    OrderNumber   Quantity   Description
    ===========   ========   ====================
    STL-7442-1    7          prefabulated amulite
    MPT-9931-8A   32         differential girdlespring
    KSG-0619-81   5          panametric fan, ...
    

    So the question is how to either

    • eliminate "duplicate" rows
    • only join to one of the rows, to avoid duplication

    First attempt

    My first naive attempt was to only join to the "TOP 1" line items:

    SELECT Orders.OrderNumber, LineItems.Quantity, LineItems.Description
    FROM Orders
        INNER JOIN (
           SELECT TOP 1 LineItems.Quantity, LineItems.Description
           FROM LineItems
           WHERE LineItems.OrderID = Orders.OrderID) LineItems2
        ON 1=1
    

    But that gives the error:

    The column or prefix 'Orders' does not
    match with a table name or alias name
    used in the query.

    Presumably because the inner select doesn't see the outer table.