Select all but ignore duplicates


Solution 1

@AlexW - it's just the column 'url' where there could be a duplicate – Ruf1 9 mins ago

Then your first query will work if you correct the syntax - GROUP BY must follow WHERE (per the docs):

SELECT *
FROM directory_listings
WHERE status = 'approved'
GROUP BY url
ORDER BY site_name ASC

Here's an example of a working query in SQL Fiddle.
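For readers who want to try the corrected clause order without a MySQL server, here is a minimal sketch. It assumes an in-memory SQLite database as a stand-in for MySQL, and the `id` column and the sample rows are hypothetical; only `url`, `status` and `site_name` come from the question.

```python
import sqlite3

# In-memory SQLite stands in for MySQL here (an assumption for illustration).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE directory_listings ("
    "id INTEGER PRIMARY KEY, site_name TEXT, url TEXT, status TEXT)"
)
# Hypothetical sample data: three approved rows share one url.
rows = [
    ("Alpha", "http://a.example", "approved"),
    ("Alpha mirror", "http://a.example", "approved"),
    ("Alpha copy", "http://a.example", "approved"),
    ("Beta", "http://b.example", "approved"),
    ("Gamma", "http://c.example", "pending"),
]
conn.executemany(
    "INSERT INTO directory_listings (site_name, url, status) VALUES (?, ?, ?)",
    rows,
)

# Clause order: WHERE first, then GROUP BY, then ORDER BY.
grouped = conn.execute(
    """
    SELECT *
    FROM directory_listings
    WHERE status = 'approved'
    GROUP BY url
    ORDER BY site_name ASC
    """
).fetchall()

# Column index 2 is url; each approved url appears exactly once.
urls = [row[2] for row in grouped]
print(urls)
```

Which of the three duplicate rows is returned for a given url is not guaranteed. Also note that newer MySQL versions with the ONLY_FULL_GROUP_BY SQL mode enabled may reject `SELECT *` combined with `GROUP BY url`, so this form may depend on the server configuration.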

Solution 2

Your syntax for SELECT DISTINCT is wrong:

http://dev.mysql.com/doc/refman/5.6/en/select.html

Also, the only reason GROUP BY wouldn't work to eliminate duplicates is if the WHERE clause is disqualifying some of the rows (i.e. they are not duplicates in terms of both status and url).
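As a rough illustration of the two points above, the sketch below shows where DISTINCT belongs (directly after SELECT) and that the WHERE filter is applied before duplicates are removed. It assumes an in-memory SQLite database in place of MySQL, and the sample rows are hypothetical.

```python
import sqlite3

# In-memory SQLite stands in for MySQL here (an assumption for illustration).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE directory_listings (site_name TEXT, url TEXT, status TEXT)"
)
# Hypothetical rows: a.example has one approved and one pending row.
conn.executemany(
    "INSERT INTO directory_listings VALUES (?, ?, ?)",
    [
        ("Alpha", "http://a.example", "approved"),
        ("Alpha old", "http://a.example", "pending"),
        ("Beta", "http://b.example", "approved"),
        ("Beta copy", "http://b.example", "approved"),
    ],
)

# DISTINCT goes directly after SELECT and applies to the whole select list.
distinct_urls = [
    row[0]
    for row in conn.execute(
        "SELECT DISTINCT url FROM directory_listings "
        "WHERE status = 'approved' ORDER BY url"
    )
]

# The form used in the question is rejected by the SQL parser.
try:
    conn.execute("SELECT * DISTINCT url FROM directory_listings")
    bad_syntax_rejected = False
except sqlite3.OperationalError:
    bad_syntax_rejected = True

print(distinct_urls, bad_syntax_rejected)
```

Because DISTINCT covers every selected column, `SELECT DISTINCT *` would only collapse rows that are identical in all columns, which is why GROUP BY url is the better fit when only `url` repeats.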

Author by

Ruf1

Updated on June 04, 2022

Comments

  • Ruf1
    Ruf1 almost 2 years

    I have a MySQL table in which one field can hold duplicate values. I am trying to select all rows but return only one row for each duplicated value in this field.

    For example, if I have 10 rows in total and 3 of them share the same value, I'd like to return 8 rows: the 7 that are unique and 1 of the 3 duplicates.

    I have tried DISTINCT and GROUP BY without success. They ignore all 3 duplicates.

    Here's what I tried:

    SELECT *
    FROM directory_listings
    GROUP BY url
    WHERE status = 'approved'
    ORDER BY site_name ASC
    LIMIT $start, $per_page
    

    and

    SELECT * DISTINCT url
    FROM directory_listings
    WHERE status = 'approved'
    ORDER BY site_name ASC
    LIMIT $start, $per_page
    
  • Ruf1
    Ruf1 about 10 years
    Works perfectly, thank you so much.
  • Air
    Air about 10 years
    +1 for a good point about the WHERE clause. IMO this answer would be much improved by giving the correct syntax explicitly instead of only linking the docs. I know when I was starting out with SQL, I found the syntax descriptions in the docs very confusing; and external links can be unstable.
  • Ruf1
    Ruf1 about 10 years
    I also need a similar query to find the number of unique rows. Would this be correct, or should I use GROUP BY: "SELECT COUNT(DISTINCT url) FROM directory_listings WHERE status = 'approved'"
  • Air
    Air about 10 years
    That should work fine; see also stackoverflow.com/q/4131937/2359271. If you used GROUP BY you would be getting separate counts for each group (which can be useful to find duplicates, as in this example!)
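The difference between the two counting approaches in the last comments can be sketched as follows. This assumes an in-memory SQLite database standing in for MySQL, and the sample rows are hypothetical.

```python
import sqlite3

# In-memory SQLite stands in for MySQL here (an assumption for illustration).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE directory_listings (url TEXT, status TEXT)")
# Hypothetical rows: a.example is approved three times.
conn.executemany(
    "INSERT INTO directory_listings VALUES (?, ?)",
    [
        ("http://a.example", "approved"),
        ("http://a.example", "approved"),
        ("http://a.example", "approved"),
        ("http://b.example", "approved"),
        ("http://c.example", "pending"),
    ],
)

# COUNT(DISTINCT url) returns a single number: how many different
# urls exist among the approved rows.
(unique_count,) = conn.execute(
    "SELECT COUNT(DISTINCT url) FROM directory_listings "
    "WHERE status = 'approved'"
).fetchone()

# GROUP BY returns one count per url; HAVING keeps only the urls
# that occur more than once, which is a way to list the duplicates.
duplicates = conn.execute(
    """
    SELECT url, COUNT(*) AS n
    FROM directory_listings
    WHERE status = 'approved'
    GROUP BY url
    HAVING COUNT(*) > 1
    """
).fetchall()

print(unique_count)  # 2
print(duplicates)    # [('http://a.example', 3)]
```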