What is the best way to delete millions of records in T-SQL?
Solution 1
Do it in batches of 5000 or 10000 rows if you need to delete less than about 40% of the data. If you need to delete more than that, dump the rows you want to keep into another table (or bcp them out), truncate this table, and then insert the dumped rows back (or bcp them in).
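-- Track the row count in a variable so the loop does not depend on
-- whatever statement happened to run before it.
DECLARE @rows INT = 1;
WHILE @rows > 0
BEGIN
    -- T-SQL needs the target alias after DELETE when joins are used
    DELETE TOP (5000) A
    FROM Table1 A
    LEFT JOIN Table2 B
        ON A.Name = 'XYZ' AND B.sId = A.sId
    LEFT JOIN Table3 C
        ON A.Name = 'XYZ' AND C.sId = A.sId
    -- Without this filter the LEFT JOINs keep every row of Table1
    WHERE B.sId IS NULL
      AND C.sId IS NULL;
    SET @rows = @@ROWCOUNT;
END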
Small example you can run to see what happens
CREATE TABLE #test(id INT)
INSERT #test VALUES(1)
INSERT #test VALUES(1)
INSERT #test VALUES(1)
INSERT #test VALUES(1)
INSERT #test VALUES(1)
INSERT #test VALUES(1)
INSERT #test VALUES(1)
WHILE @@rowcount > 0
BEGIN
DELETE TOP (2) FROM #test
END
Solution 2
One way to remove millions of records is to select the rows you want to keep into new tables, then drop the old tables and rename the new ones. Which variant is best for you depends on the foreign keys: you can either drop and recreate the foreign keys, or truncate the data in the old tables and copy the selected data back.
If you need to delete just a few records, disregard this answer. This is for when you actually want to DELETE millions of records.
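The truncate-and-copy-back variant could look roughly like the sketch below. It is only an illustration under assumptions: the staging table name Table1_keep is made up, the keep condition is my reading of the question (a row stays if its sId appears in Table2 or Table3), and Table1 is assumed to have no identity or computed columns. Note that TRUNCATE TABLE fails if another table references Table1 through a foreign key, so those constraints would have to be dropped first.

```sql
-- Table1_keep is a hypothetical staging table name.
-- Keep condition (assumed from the question): sId exists in Table2 or Table3.
SELECT A.*
INTO Table1_keep
FROM Table1 A
WHERE EXISTS (SELECT 1 FROM Table2 B WHERE B.sId = A.sId)
   OR EXISTS (SELECT 1 FROM Table3 C WHERE C.sId = A.sId);

BEGIN TRANSACTION;

-- TRUNCATE is transactional in SQL Server, so this can still be rolled back.
TRUNCATE TABLE Table1;

-- Copy the kept rows back; TABLOCK takes a table lock for the bulk insert.
INSERT INTO Table1 WITH (TABLOCK)
SELECT * FROM Table1_keep;

COMMIT TRANSACTION;

-- Drop the staging table once the copy has been checked.
DROP TABLE Table1_keep;
```

Because Table1 itself is never dropped, its indexes, constraints, and permissions stay in place.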
Solution 3
One other method is to insert the data that you want to keep into another table, say Table1_good. Once that is completed and verified, drop Table1 and then rename Table1_good to Table1.
It's a dirty way to do it, but it works.
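As a rough sketch of the drop-and-rename approach, assuming the keep condition from the question means "sId exists in Table2 or Table3" and that nothing else references Table1 through a foreign key:

```sql
-- Copy the rows to keep into the new table (keep condition is an assumption).
SELECT A.*
INTO Table1_good
FROM Table1 A
WHERE EXISTS (SELECT 1 FROM Table2 B WHERE B.sId = A.sId)
   OR EXISTS (SELECT 1 FROM Table3 C WHERE C.sId = A.sId);

-- Verify before dropping anything: compare the row counts.
SELECT (SELECT COUNT(*) FROM Table1)      AS original_rows,
       (SELECT COUNT(*) FROM Table1_good) AS kept_rows;

BEGIN TRANSACTION;

DROP TABLE Table1;

-- sp_rename takes the current object name and the new name.
EXEC sp_rename 'Table1_good', 'Table1';

COMMIT TRANSACTION;
```

SELECT ... INTO does not copy indexes, constraints, or triggers, so those would need to be recreated on the renamed table afterwards.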
Solution 4
Using the TOP clause is more about improving concurrency, and it may actually make the code run slower.
One suggestion is to delete the data from a derived table: http://sqlblogcasts.com/blogs/simons/archive/2009/05/22/DELETE-TOP-x-rows-avoiding-a-table-scan.aspx
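I have not reproduced the linked post here, but the general idea of deleting through a derived table can be sketched on a small temp table. The table #test and its primary key are assumptions for the demo; the point is that the TOP ... ORDER BY inside the derived table lets each batch follow the index order.

```sql
-- Demo table; the primary key gives the ORDER BY an index to follow.
CREATE TABLE #test (id INT PRIMARY KEY);
INSERT #test VALUES (1), (2), (3), (4), (5), (6), (7);

DECLARE @rows INT = 1;
WHILE @rows > 0
BEGIN
    -- Delete through the derived table: only the first 2 rows in id order.
    DELETE d
    FROM (SELECT TOP (2) id FROM #test ORDER BY id) AS d;

    SET @rows = @@ROWCOUNT;
END

-- The loop ends once a batch deletes nothing, so no rows remain.
SELECT COUNT(*) AS remaining FROM #test;

DROP TABLE #test;
```

On a real table the batch size would be in the thousands, and a WHERE clause inside the derived table would pick the rows to remove.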
David
Updated on June 19, 2022
I have the following table structure:
Table1    Table2    Table3
--------------------------------
sId       sId       sId
name      x         y
x1        x2        x3
I want to remove all records from Table1 that do not have a matching record in Table3 based on sId; if the sId is present in Table2, the record should not be deleted from Table1. There are about 20, 15, and 10 million records in Table1, Table2, and Table3 respectively. I have done something like this:
Delete Top (3000000)
From Table1 A
Left Join Table2 B
on A.Name ='XYZ' and
B.sId = A.sId
Left Join Table3 C
on A.Name = 'XYZ' and
C.sId = A.sId
(I have added an index on sId, but not on Name.) This takes a long time to remove the records, though. Is there a better way to delete millions of records? Thanks in advance.