A query to delete duplicates with GROUP BY
16,443
Solution 1
If you're looking for a solution for MySQL then you can use a proper multi table DELETE
syntax along with a JOIN
like this
DELETE p
FROM ps_specific_price p JOIN
(
SELECT id_product, MAX(id_specific_price) id_specific_price
FROM ps_specific_price
GROUP BY id_product
) d
ON p.id_product = d.id_product
AND p.id_specific_price <> d.id_specific_price;
Outcome:
| ID_SPECIFIC_PRICE | ID_PRODUCT | |-------------------|------------| | 3 | 2 | | 7 | 3 |
Here is SQLFiddle demo
Solution 2
Try this:
CREATE TABLE ps_specific_price (
id_specific_price NUMBER,
id_product NUMBER
);
INSERT INTO ps_specific_price (id_specific_price, id_product) VALUES (1, 2);
INSERT INTO ps_specific_price (id_specific_price, id_product) VALUES (2, 2);
INSERT INTO ps_specific_price (id_specific_price, id_product) VALUES (3, 2);
INSERT INTO ps_specific_price (id_specific_price, id_product) VALUES (4, 3);
INSERT INTO ps_specific_price (id_specific_price, id_product) VALUES (5, 3);
INSERT INTO ps_specific_price (id_specific_price, id_product) VALUES (6, 3);
INSERT INTO ps_specific_price (id_specific_price, id_product) VALUES (7, 3);
COMMIT;
DELETE FROM ps_specific_price ps
WHERE ps.id_specific_price NOT IN (
SELECT MAX(id_specific_price)
FROM ps_specific_price ps_in
WHERE ps_in.id_product = ps.id_product
);
SELECT * FROM ps_specific_price;
ID_SPECIFIC_PRICE ID_PRODUCT
---------------------- ----------------------
3 2
7 3
You must connect the table from the inner query with the table from the outer one.
I'm using Oracle 11g R2. I checked this on SQLFiddle and my DELETE statement is invalid for MySQL - don't have that one installed and not much experience there, but you didn't say what database you are using.
Author by
popkutt
Updated on June 15, 2022Comments
-
popkutt almost 2 years
id_specific_price id_product ------------------------------- 1 2 2 2 3 2 4 3 5 3 6 3 7 3
Need to delete the duplicates, expected outcome:
id_specific_price id_product ------------------------------- 3 2 7 3
SELECT * FROM ps_specific_price WHERE id_specific_price NOT IN (SELECT MAX(id_specific_price) FROM ps_specific_price GROUP BY id_product)
works but
DELETE FROM ps_specific_price WHERE id_specific_price NOT IN (SELECT MAX(id_specific_price) FROM ps_specific_price GROUP BY id_product)
does not. There are plenty of examples to get around this but for some reason I am not able to adapt it. I believe it is GROUP BY. For example:
DELETE FROM ps_specific_price WHERE id_specific_price NOT IN (SELECT MAX(p.id_specific_price) FROM (SELECT * FROM ps_specific_price ) as p) GROUP BY id_product
Where did I go wrong here?
-
popkutt over 10 yearsThank you. I forgot to mention that there are thousands of rows, these are just first 7. So I need to find the duplicates first and then delete them
-
popkutt over 10 yearsThank you. It doesn't work on my MySQL database. What is Teradata database?
-
Przemyslaw Kruglej over 10 yearsYes, but you are "looking" for them using the subquery, no matter how many there are. Please tell us what database are you using. Also, you should check this link: stackoverflow.com/questions/18932/…
-
kishore krv over 10 yearsits a databases. If its on My Sql you could join id_product in the sub query with id_product from outer delete table.
-
wildplasser over 10 years@popkutt: you did not mention mysql (which is very restricted in references to the table being updated or deleted)
-
kishore krv over 10 yearsDELETE FROM ps_specific_price outT WHERE id_specific_price NOT IN (SELECT MAX(Subq.id_specific_price) FROM ps_specific_price subq WHERE outT.id_product=subq.id_product);Hope this clarifies.