Left outer Join query returns duplicates in SQL Server
I'd suggest reading on cartesian product.
If you have 50 rows in the first table and 70 in the second that makes 3500 rows. The join condition tbl1.SSN = tbl2.SSN
will filter out rows but you may well end up with more than 50 rows.
Back to your problem you can see what is happening by trying the following :
SELECT
tbl1.*,
(SELECT COUNT(*) FROM tbl2 WHERE tbl1.SSN = tbl2.SSN) AS NbResultTbl2
FROM
tbl1
This will tell which rows of tbl1
has multiple match in tbl2
. If you have a number higher than 1 in the NbResultTbl2
column then you are going to end up with duplicates.
To eliminate those duplicates you can try this :
SELECT
tbl1.*,
(SELECT TOP 1 StateCode FROM tbl2 WHERE tbl1.SSN = tbl2.SSN)
FROM
tbl1
This will get the first StateCode
found for a matching SNN in tbl2.
sandr
Updated on June 04, 2022Comments
-
sandr almost 2 years
I have a table 1 (
MID, SSN, ...
)MID
is primary key and table 2 (ID, SSN, StateCode..
) whereID
andSSN
make up the primary key. I'm trying to display all columns from table 1 along withStateCode
from table 2 matching it againstSSN
. Tbl 1 has 50 rows and some have sameSSN
values.If no
SSN
match is found from table 2, displaying a NULL inStateCode
is acceptable, so I chose left join. Here is my querySelect tbl1.*, tbl2.StateCode from tbl1 left outer join tbl2 on tbl1.SSN = tbl2.SSN
I'm looking to retrieve 50 records, but I get 70, rows that contain the same ssn value in tbl1 ends up duplicated in the final output. What is going wrong?
-
sandr almost 10 yearsI tried this, and I still get 70 rows on doing the left outer join in this way, all the rows with same ssn gets duplicated more. I'm not sure why this is
-
sandr almost 10 yearsWhen i do distinct im actually losing data its returning fewer rows.
-
sandr almost 10 yearsquery didn't work, grouping wasn't possible as the other columns didn't comply under this condition, i'm trying to select all of table 1 and one column from table 2
-
sandr almost 10 yearsi want all the columns from table 1 along with statecode from table 2, when i do distinct i'm losing data, fewer than 50 rows are being returne
-
sandr almost 10 yearsi tried reading through although i couldn't infer why a left join would retrieve more rows than what exists in tbl1, since ssn in table 2 is distinct
-
user3841709 almost 10 yearsI'm a little confused from your original question where you say displaying a null is acceptable. have you tried adding a Where condition such as WHERE ColumnName IS NOT NULL
-
sandr almost 10 yearsNULL is acceptable as a value in StateCode if an ssn from table 1 is not found in table 2. I'm trying to show all 50 rows along with StateCode that match the SSN for that row
-
user3841709 almost 10 yearsAlso if you are looking to retrieve 50 records, do your tables each have more than 50 records or does one have exactly 50 records? If you have exactly 50 in one table then it should be simple just figure out the correct joining
-
sandr almost 10 yearsYes table 1 has exactly 50 rows, and table 2 contains distinct SSN and the Statecode value. I'm not sure why my left join is retrieving duplicates and final result more than what is present in table 1
-
Gordon Linoff almost 10 years@sandr . . . Maybe
select distinct
will remove the rows that you think are duplicates. -
Chris Latta almost 10 yearsOkay, try a nested select that returns only one state code for each SSN
-
ForguesR almost 10 yearsSee updated answer to check if
ssn
in table is really distinct. -
sandr almost 10 yearsThank you, it was not distinct, I came up with another query but yours is short and clean! Very helpful