SQL LEFT JOIN first row only

sql sql-server join

15,463

Solution 1

Wrapping it in another query does the trick?

SELECT RequiredId, <all_the_other_fields> from (
SELECT t2.SomeId AS RequiredId
-- ...other data mainly from t2
FROM DataTable1 AS t1
   LEFT JOIN DataTable2 AS t2 
   ON t2.OtherId = t1.ExperienceId 
   AND t2.LanguageId = 
      (SELECT TOP 1 t1.LanguageId
      ORDER BY t1.LanguageId)
   ) group by RequiredId, <all_the_other_fields>

or even not extracting the column in the first place?

SELECT distinct t2.SomeId AS RequiredId
-- ...other data mainly from t2 BUT not the Language id
FROM DataTable1 AS t1
   LEFT JOIN DataTable2 AS t2 
   ON t2.OtherId = t1.ExperienceId 
   AND t2.LanguageId = 
      (SELECT TOP 1 t1.LanguageId
      ORDER BY t1.LanguageId)

Solution 2

Try this:

;with cte as 
(select *, row_number() over (partition by someid order by languageid) rn
 from datatable2)

select * 
from datatable1 dt
left join cte c on dt.experienceid = c.otherid and c.rn = 1

Solution 3

For such things when you need to select top in the subquery CROSS APPLY OR 'OUTER APPLY' is very handy

t2.SomeId AS RequiredId
    -- ...other data mainly from t2
FROM DataTable1 AS t1
OUTER APPLY ( SELECT TOP 1 t1.LanguageId
               FROM DataTable2 
               WHERE DataTable2 .OtherId = t1.ExperienceId 
                AND  t2.LanguageId = t1.LanguageId
             ORDER BY t1.LanguageId
             ) AS t2

15,463

Author by

YOhan

Updated on June 04, 2022

Comments

YOhan almost 2 years

Let's assume we have such data set:

Table: DataTable1

ID      ExperienceId   LanguageId    ...
-------------------------------------------
1       1              1
2       1              2
3       1              3
4       2              1
5       2              2
6       2              3
7       3              1
8       3              2
9       3              3
...

Table: DataTable2

ID      SomeId OtherId LanguageId    ...
-------------------------------------------
1       459    1       1
2       459    1       2
3       459    1       3
4       245    2       1
5       245    2       2
6       245    2       3
7       295    3       1
8       295    3       2
9       295    3       3
...

I want to join those tables and get only SomeId column ignoring the LanguageId column. To make it clearer:

SELECT
    t2.SomeId AS RequiredId
    -- ...other data mainly from t2
FROM DataTable1 AS t1
LEFT JOIN DataTable2 AS t2 
    ON t2.OtherId = t1.ExperienceId 
    AND t2.LanguageId = 
        (SELECT TOP 1 t1.LanguageId
         ORDER BY t1.LanguageId)

This query should return (if it wasn't wrong, clearly) rows:

SomeId    ...
----------------
459       ...
245       ...
295       ...
...

Now it returns three times of identical data (with only LanguageId different).

I would try to filter it with WHERE t1.LanguageId = 1 if I was sure it always exists, but I'm not sure. Rows can be with LanguageId from 1 to 3, also they can be only with ID 2, etc. Rows surely will have at least one LanguageId.

Now my question is: how can I join tables with unique values with one column completely ignored?

YOhan almost 10 years

Thank you, first approach works just fine :) Had problems with EF until I saw that my GROUP BY clause was inside inner query, not outside of it. Changed that and now it works.