SQL query to return one single record for each unique value in a column
Solution 1
Well, this will give you pretty bad performance, but I think it'll work:
-- DISTINCT collapses rows that are exact duplicates of each other
SELECT DISTINCT t.Name, t.Street, t.City, t.State
FROM table t
INNER JOIN (
    SELECT m.Name, MIN(m.Street + ';' + m.City + ';' + m.State) AS comb
    FROM table m
    GROUP BY m.Name
) x
    ON x.Name = t.Name
    AND x.comb = t.Street + ';' + t.City + ';' + t.State
Solution 2
If you can use a temp table:
-- Create and populate temp table
select *
into #Addresses
from Addresses

alter table #Addresses add PK int identity(1, 1) primary key

-- Explicitly name columns here to not return the PK
select Name, Street, City, State
from #Addresses A
where not exists
    (select *
     from #Addresses B
     where B.Name = A.Name
       and A.PK > B.PK)
This solution would not be advisable for much larger tables.
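One of the comments suggests getting the same effect with a table variable declared with an IDENTITY column instead of altering a temp table. A minimal sketch of that variant follows; the column types and lengths are assumptions, since the real table definition is not shown, and the name @data is taken from the comment.

```sql
-- Sketch only: column types and sizes below are guesses, not the real schema.
DECLARE @data TABLE (
    id     int IDENTITY(1,1) PRIMARY KEY,  -- surrogate key the source table lacks
    Name   varchar(50),
    Street varchar(100),
    City   varchar(50),
    State  varchar(10)
)

-- Copy every row; the identity column numbers them in insertion order.
INSERT INTO @data (Name, Street, City, State)
SELECT Name, Street, City, State
FROM Addresses

-- Keep a row only if no other row with the same Name has a smaller id.
SELECT A.Name, A.Street, A.City, A.State
FROM @data A
WHERE NOT EXISTS (
    SELECT *
    FROM @data B
    WHERE B.Name = A.Name
      AND B.id < A.id
)
```

Rows whose Name is NULL never match each other in the comparison above, so each of them would be returned; handle that case separately if it can occur.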
Solution 3
Use a temp table or table variable and select a distinct list of names into it. Then use that structure to select the top 1 record from the original table for each distinct name.
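The description above could be sketched roughly as follows, here with a cursor over the distinct names (as one of the comments anticipates). This is an illustration under assumptions, not the answerer's own code: the table name Addresses comes from the question, while the column sizes and the names @Result, @Name and name_cursor are made up for the example.

```sql
-- Sketch only: column sizes are guesses; adjust to the real table.
DECLARE @Result TABLE (
    Name   varchar(50),
    Street varchar(100),
    City   varchar(50),
    State  varchar(10)
)
DECLARE @Name varchar(50)

-- Walk the distinct list of names once.
DECLARE name_cursor CURSOR LOCAL FAST_FORWARD FOR
    SELECT DISTINCT Name FROM Addresses

OPEN name_cursor
FETCH NEXT FROM name_cursor INTO @Name

WHILE @@FETCH_STATUS = 0
BEGIN
    -- TOP 1 copies one whole row, so columns are never mixed across rows.
    INSERT INTO @Result (Name, Street, City, State)
    SELECT TOP 1 Name, Street, City, State
    FROM Addresses
    WHERE Name = @Name

    FETCH NEXT FROM name_cursor INTO @Name
END

CLOSE name_cursor
DEALLOCATE name_cursor

SELECT Name, Street, City, State
FROM @Result
```

Without an ORDER BY, which row TOP 1 picks is not guaranteed, which is acceptable here only because any one address per name will do. A NULL name would be skipped by the equality test in the WHERE clause.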
Solution 4
select distinct Name, street, city, state
from table t1
where street =
    (select min(street) from table t2 where t2.name = t1.name)
Solution 5
-- ROW_NUMBER() requires SQL Server 2005 or later; it is not available in SQL Server 2000
select Name, street, city, state
from (
    select Name, street, city, state,
           ROW_NUMBER() OVER (PARTITION BY Name ORDER BY Name) AS rn
    from table
) AS t
where rn = 1
Comments
-
B Bulfin almost 2 years
I have a table in SQL Server 2000 that I am trying to query in a specific way. The best way to show this is with example data.
Behold,
[Addresses]
Name  Street            City      State
--------------------------------------------------------
Bob   123 Fake Street   Peoria    IL
Bob   234 Other Street  Fargo     ND
Jim   345 Main Street   St Louis  MO
This is actually a simplified example of the structure of the actual table. The structure of the table is completely beyond my control. I need a query that will return a single address per name. It doesn't matter which address, just that there is only one. The result could be this:
Name  Street            City      State
--------------------------------------------------------
Bob   123 Fake Street   Peoria    IL
Jim   345 Main Street   St Louis  MO
I found a similar question here, but none of the solutions given work in my case because I do not have access to CROSS APPLY, and calling MIN() on each column will mix different addresses together. Although I don't care which record is returned, it must be one intact row, not a mix of different rows.
Recommendations to change the table structure will not help me. I agree that this table is terrible (it's worse than shown here), but this is part of a major ERP database that I cannot change.
There are about 3000 records in this table. There is no primary key.
Any ideas?
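To make the MIN() problem described in the question concrete, here is a small sketch that uses the sample rows above. The temp table name #Demo and the column sizes are assumptions made for the illustration.

```sql
-- Sample rows copied from the question; column sizes are guesses.
CREATE TABLE #Demo (
    Name   varchar(50),
    Street varchar(100),
    City   varchar(50),
    State  varchar(10)
)

INSERT INTO #Demo VALUES ('Bob', '123 Fake Street', 'Peoria', 'IL')
INSERT INTO #Demo VALUES ('Bob', '234 Other Street', 'Fargo', 'ND')
INSERT INTO #Demo VALUES ('Jim', '345 Main Street', 'St Louis', 'MO')

-- Each MIN() is evaluated independently per column.
SELECT Name,
       MIN(Street) AS Street,
       MIN(City)   AS City,
       MIN(State)  AS State
FROM #Demo
GROUP BY Name
-- Bob comes back as: 123 Fake Street, Fargo, IL
-- (street and state from one row, city from the other).

DROP TABLE #Demo
```

The row returned for Bob does not exist in the source data, which is exactly the mixing the question needs to avoid.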
-
Tadmas about 15 years
That won't work: you can have the same street address in multiple cities.
-
B Bulfin about 15 years
There are actually examples of multiple rows that contain the same address, so in those cases, I'd still get duplicates.
-
B Bulfin about 15 years
Unfortunately, there is no unique id field for this table. Yes, I know. This sucks.
-
Orion Adrian about 15 years
I think those are part of the street name. It doesn't appear to have any keys.
-
Tadmas about 15 years
You can create a unique id by using a temp table / table variable, e.g. DECLARE @data TABLE (id int IDENTITY(1,1), ...). Then insert all the data from the table into the temp table / table variable, and use this script.
-
B Bulfin about 15 years
Yes, there are some examples of the same address occurring twice for the same name.
-
Orion Adrian about 15 years
See my version below on a way to change this into something that might work without keys.
-
shruti tiwari about 15 years
Doh, keep forgetting you have the street numbers FIRST in the street address :-)
-
B Bulfin about 15 years
I checked this. Unfortunately, there are some examples of records that have the same street.
-
Orion Adrian about 15 years
@Tadmas: the problem here is that we've gone past simple queries, and now we have to do this in a procedure where before we could do it in a query or view.
-
Tadmas about 15 years
@Orion Adrian: Correctness is more important than convenience.
-
Orion Adrian about 15 years
@Tadmas: Really, to get what you want you need to either create a primary key, get the row index, or create a hash of all the fields together to prevent collisions. Then again, I don't think the OP is really sorting on just name, as name/name collisions are very likely.
-
Orion Adrian about 15 years
It would have to be name and street for there to be a problem. You can add more columns to reduce the likelihood of collision, but overall it's not a great solution to the problem. If at all possible, getting a primary key on the records is better.
-
shruti tiwari about 15 years
Changed the query to accommodate the missing id field.
-
Orion Adrian about 15 years
@Tadmas: Except when you don't have permissions to change what you need to change for correctness.
-
B Bulfin about 15 years
There really isn't a "name" field. This is an analogy for the structure of the table. And I do not have the ability to add any indexes or columns.
-
mbeckish about 15 years
Close, except "table" x doesn't have a field named Street. You would need something like "SELECT Name, MIN(Street) a" and "ON x.Name = t.Name AND x.a = t.Street"
-
B Bulfin about 15 years
With DISTINCT, this looks promising. I think this will probably work.
-
Orion Adrian about 15 years
The performance for this would be very poor as you would have to use a cursor for this, unless you have code for it.
-
Gratzy about 15 years
Looking at the source table I don't think performance is the issue.
-
B Bulfin about 15 years
This doesn't work because address components don't always increase or decrease together. For example, 123 < 234, but Peoria > Fargo.
-
RAmPE about 15 years
I was writing this solution in my head as I was reading through the previous answers. It isn't a slick or pretty answer but it should work. Another option here would be to change the FROM subquery to a GROUP BY to improve performance slightly over DISTINCT.
-
B Bulfin about 15 years
SQL 2000 does not support DENSE_RANK.
-
shruti tiwari about 15 years
Needs an ORDER BY, I think, to be safe.
-
A-K about 15 years
This is wrong: you may end up with city and street from different addresses.
-
Jamie Ide about 15 years
I should have read all the answers first because this is the same as my solution. The table isn't indexed and there are only 3000 records, so I don't think a cursor will be significantly slower than a strictly SQL solution. This also doesn't strike me as a query that will be run frequently.
-
Joe Davis about 15 years
Yes, Alex and recursive, the chance that it could mix up is there. However, selecting TOP 1 on the same table using the same index should return the same row almost without fail. Depending on the size of the table, only a high frequency of updates and deletes should cause a mix-up. So if this is not a high-volume OLTP table that the query is being run on, you could apply a lock and virtually guarantee yourself the same row every time.
-
A-K about 15 years
Joe, to be safe you need to add an ORDER BY clause on something unique. Even without concurrency, different subselects can go against different indexes.
-
Peter Radocchia about 15 years
NULLs will make this solution much longer.
-
Peter Radocchia about 15 years
++1. This is an excellent answer. Short and sweet. It doesn't require you to list out all the field names individually, and it completely avoids the issue of nullable field comparison. One correlated subquery, no aggregates, no joins. For 3000 rows, this is a) the least coding, b) good performance, and c) fool-proof results. Beautiful!
-
Peter Radocchia about 15 years
Without candidate keys, you would have to order by the whole table to assure consistency.