Efficiency of joining subqueries in SQL Server
Solution 1
You can remove one of your subqueries to make it a little more efficient:
SELECT c.id
,c.country
,First_orders.product
,First_orders.order_id
FROM customers c
LEFT JOIN (SELECT id
,product
,order_id
,ROW_NUMBER() OVER (PARTITION BY id ORDER BY Order_Date asc) as order_No
FROM orders) First_Orders
ON c.id = First_orders.id AND First_Orders.order_No = 1
In your above query, you need to be careful where you place your parentheses as I don't think it will work. Also, you're returning product in your results, but not including in your nested subquery.
Solution 2
For someone who is just learning SQL, your query looks pretty good.
The index on customers may or may not be used for the query -- you would need to look at the execution plan. An index on orders(id, order_date)
could be used quite effectively for the row_number
function.
One comment is on the naming of fields. The field orders.id
should not be the customer id. That should be something like 'orders.Customer_Id`. Keeping the naming system consistent across tables will help you in the future.
Solution 3
Try this...its easy to understand
;WITH cte
AS (
SELECT id
,product
,order_id
,ROW_NUMBER() OVER (
PARTITION BY id ORDER BY Order_Date ASC
) AS order_No
FROM orders
)
SELECT c.id
,c.country
,c1.Product
,c1.order_id
FROM customers c
INNER JOIN cte c1 ON c.id = c1.id
WHERE c1.order_No = 1
user2112153
Updated on December 27, 2020Comments
-
user2112153 over 3 years
I have a customers and orders table in SQL Server 2008 R2. Both have indexes on the customer id (called
id
). I need to return details about all customers in the customers table and information from the orders table, such as details of the first order.I currently left join my customers table on a subquery of the orders table, with the subquery returning the information I need about the orders. For example:
SELECT c.id ,c.country ,First_orders.product ,First_orders.order_id FROM customers c LEFT JOIN SELECT( id, product FROM (SELECT id ,product ,order_id ,ROW_NUMBER() OVER (PARTITION BY id ORDER BY Order_Date asc) as order_No FROM orders) orders WHERE Order_no = 1) First_Orders ON c.id = First_orders.id
I'm quite new to SQL and want to understand if I'm doing this efficiently. I end up left joining quite a few subqueries like this onto the customers table in one select query and it can take tens of minutes to run.
So am I doing this efficiently or can it be improved? For example, I'm not sure if my index on id in the orders table is of any use and maybe I could speed up the query by creating a temporary table of what is in the subquery first and creating a unique index on id in the temporary table so SQL Server knows
id
is now a unique column and then joining my customers table to this temporary table? I typically have one or two million rows in the customers and orders tables.Many thanks in advance!
-
Mahesh Bongani over 2 yearsIf the orders table is too huge and without filtering with customer ids it has to perform lot of operations inside the CTE.
-
Mahesh Bongani over 2 yearsIf the orders table is too huge and without filtering with customer ids it has to perform lot of operations inside the subquery as it is not correlated.