Efficiency of joining subqueries in SQL Server

58,082

Solution 1

You can remove one of your subqueries to make it a little more efficient:

SELECT c.id
        ,c.country      
        ,First_orders.product
        ,First_orders.order_id
FROM customers c
   LEFT JOIN  (SELECT id
                    ,product
                    ,order_id
                    ,ROW_NUMBER() OVER (PARTITION BY id ORDER BY Order_Date asc) as order_No 
               FROM orders) First_Orders
     ON c.id = First_orders.id AND First_Orders.order_No = 1

In your above query, you need to be careful where you place your parentheses as I don't think it will work. Also, you're returning product in your results, but not including in your nested subquery.

Solution 2

For someone who is just learning SQL, your query looks pretty good.

The index on customers may or may not be used for the query -- you would need to look at the execution plan. An index on orders(id, order_date) could be used quite effectively for the row_number function.

One comment is on the naming of fields. The field orders.id should not be the customer id. That should be something like 'orders.Customer_Id`. Keeping the naming system consistent across tables will help you in the future.

Solution 3

Try this...its easy to understand

;WITH cte
AS (
    SELECT id
        ,product
        ,order_id
        ,ROW_NUMBER() OVER (
            PARTITION BY id ORDER BY Order_Date ASC
            ) AS order_No
    FROM orders
    )
SELECT c.id
    ,c.country
    ,c1.Product
    ,c1.order_id
FROM customers c
INNER JOIN cte c1 ON c.id = c1.id
WHERE c1.order_No = 1
Share:
58,082
user2112153
Author by

user2112153

Updated on December 27, 2020

Comments

  • user2112153
    user2112153 over 3 years

    I have a customers and orders table in SQL Server 2008 R2. Both have indexes on the customer id (called id). I need to return details about all customers in the customers table and information from the orders table, such as details of the first order.

    I currently left join my customers table on a subquery of the orders table, with the subquery returning the information I need about the orders. For example:

    SELECT c.id
            ,c.country      
            ,First_orders.product
            ,First_orders.order_id
    FROM customers c
    
    LEFT JOIN   SELECT( id, 
                        product 
                FROM (SELECT    id
                                ,product
                                ,order_id
                                ,ROW_NUMBER() OVER (PARTITION BY id ORDER BY Order_Date asc) as order_No 
                            FROM orders) orders
                WHERE Order_no = 1) First_Orders
    ON c.id = First_orders.id
    

    I'm quite new to SQL and want to understand if I'm doing this efficiently. I end up left joining quite a few subqueries like this onto the customers table in one select query and it can take tens of minutes to run.

    So am I doing this efficiently or can it be improved? For example, I'm not sure if my index on id in the orders table is of any use and maybe I could speed up the query by creating a temporary table of what is in the subquery first and creating a unique index on id in the temporary table so SQL Server knows id is now a unique column and then joining my customers table to this temporary table? I typically have one or two million rows in the customers and orders tables.

    Many thanks in advance!

  • Mahesh Bongani
    Mahesh Bongani over 2 years
    If the orders table is too huge and without filtering with customer ids it has to perform lot of operations inside the CTE.
  • Mahesh Bongani
    Mahesh Bongani over 2 years
    If the orders table is too huge and without filtering with customer ids it has to perform lot of operations inside the subquery as it is not correlated.