Delete all Duplicate Rows except for One in MySQL?
Solution 1
Editor warning: This solution is computationally inefficient and may bring down your connection for a large table.
NB - You need to do this first on a test copy of your table!
When I did it, I found that unless I also included AND n1.id <> n2.id
, it deleted every row in the table.
If you want to keep the row with the lowest
id
value:DELETE n1 FROM names n1, names n2 WHERE n1.id > n2.id AND n1.name = n2.name
If you want to keep the row with the highest
id
value:DELETE n1 FROM names n1, names n2 WHERE n1.id < n2.id AND n1.name = n2.name
I used this method in MySQL 5.1
Not sure about other versions.
Update: Since people Googling for removing duplicates end up here
Although the OP's question is about DELETE
, please be advised that using INSERT
and DISTINCT
is much faster. For a database with 8 million rows, the below query took 13 minutes, while using DELETE
, it took more than 2 hours and yet didn't complete.
INSERT INTO tempTableName(cellId,attributeId,entityRowId,value)
SELECT DISTINCT cellId,attributeId,entityRowId,value
FROM tableName;
Solution 2
If you want to keep the row with the lowest id
value:
DELETE FROM NAMES
WHERE id NOT IN (SELECT *
FROM (SELECT MIN(n.id)
FROM NAMES n
GROUP BY n.name) x)
If you want the id
value that is the highest:
DELETE FROM NAMES
WHERE id NOT IN (SELECT *
FROM (SELECT MAX(n.id)
FROM NAMES n
GROUP BY n.name) x)
The subquery in a subquery is necessary for MySQL, or you'll get a 1093 error.
Highway of Life
David has worked in Software Development for over 20 years, from every phase of development, build, deployment, systems engineering, operations, and Site Reliability Engineering (DevOps). David is currently the Lead Site Reliability Engineer for the SRE Product team at Starbucks Technology where the focus is on developing applications and shared platforms to enable rapid development by engineering teams across Starbucks.
Updated on July 31, 2022Comments
-
Highway of Life almost 2 years
How would I delete all duplicate data from a MySQL Table?
For example, with the following data:
SELECT * FROM names; +----+--------+ | id | name | +----+--------+ | 1 | google | | 2 | yahoo | | 3 | msn | | 4 | google | | 5 | google | | 6 | yahoo | +----+--------+
I would use
SELECT DISTINCT name FROM names;
if it were aSELECT
query.How would I do this with
DELETE
to only remove duplicates and keep just one record of each?