How to efficiently delete rows while NOT using Truncate Table in a 500,000+ rows table
Solution 1
Calling DELETE FROM TableName
will do the entire delete in one large transaction. This is expensive.
Here is another option which will delete rows in batches :
deleteMore:
DELETE TOP(10000) Sales WHERE toDelete='1'
IF @@ROWCOUNT != 0
goto deleteMore
Solution 2
I'll leave my answer here, since I was able to test different approaches for mass delete and update (I had to update and then delete 125+mio rows, server has 16GB of RAM, Xeon E5-2680 @2.7GHz, SQL Server 2012).
TL;DR: always update/delete by primary key, never by any other condition. If you can't use PK directly, create a temp table and fill it with PK values and update/delete your table using that table. Use indexes for this.
I started with solution from above (by @Kevin Aenmey), but this approach turned out to be inappropriate, since my database was live and it handles a couple of hundred transactions per second and there was some blocking involved (there was an index for all there fields from condition, using WITH(ROWLOCK)
didn't change anything).
So, I added a WAITFOR
statement, which allowed database to process other transactions.
deleteMore:
WAITFOR DELAY '00:00:01'
DELETE TOP(1000) FROM MyTable WHERE Column1 = @Criteria1 AND Column2 = @Criteria2 AND Column3 = @Criteria3
IF @@ROWCOUNT != 0
goto deleteMore
This approach was able to process ~1.6mio rows/hour for updating and ~0,2mio rows/hour for deleting.
Turning to temp tables changed things quite a lot.
deleteMore:
SELECT TOP 10000 Id /* Id is the PK */
INTO #Temp
FROM MyTable WHERE Column1 = @Criteria1 AND Column2 = @Criteria2 AND Column3 = @Criteria3
DELETE MT
FROM MyTable MT
JOIN #Temp T ON T.Id = MT.Id
/* you can use IN operator, it doesn't change anything
DELETE FROM MyTable WHERE Id IN (SELECT Id FROM #Temp)
*/
IF @@ROWCOUNT > 0 BEGIN
DROP TABLE #Temp
WAITFOR DELAY '00:00:01'
goto deleteMore
END ELSE BEGIN
DROP TABLE #Temp
PRINT 'This is the end, my friend'
END
This solution processed ~25mio rows/hour for updating (15x faster) and ~2.2mio rows/hour for deleting (11x faster).
Solution 3
What you want is batch processing.
While (select Count(*) from sales where toDelete =1) >0
BEGIN
Delete from sales where SalesID in
(select top 1000 salesId from sales where toDelete = 1)
END
Of course you can experiment which is the best value to use for the batch, I've used from 500 - 50000 depending on the table. If you use cascade delete, you will probably need a smaller number as you have those child records to delete.
Solution 4
One way I have had to do this in the past is to have a stored procedure or script that deletes n records. Repeat until done.
DELETE TOP 1000 FROM Sales WHERE toDelete='1'
Solution 5
You should try to give it a ROWLOCK
hint so it will not lock the entire table. However, if you delete a lot of rows lock escalation will occur.
Also, make sure you have a non-clustered filtered index (only for 1 values) on the toDelete
column. If possible make it a bit column, not varchar (or what it is now).
DELETE FROM Sales WITH(ROWLOCK) WHERE toDelete='1'
Ultimately, you can try to iterate over the table and delete in chunks.
Updated
Since while loops and chunk deletes are the new pink here, I'll throw in my version too (combined with my previous answer):
SET ROWCOUNT 100
DELETE FROM Sales WITH(ROWLOCK) WHERE toDelete='1'
WHILE @@rowcount > 0
BEGIN
SET ROWCOUNT 100
DELETE FROM Sales WITH(ROWLOCK) WHERE toDelete='1'
END
Skippy Fastol
Programming enthusiast. Engineer since by first heartbeats.
Updated on September 14, 2021Comments
-
Skippy Fastol over 2 years
Let's say we have table Sales with 30 columns and 500,000 rows. I would like to delete 400,000 in the table (those where
"toDelete='1'"
).But I have a few constraints :
- the table is read / written "often" and I would not like a long "delete" to take a long time and lock the table for too long
- I need to skip the transaction log (like with a
TRUNCATE
) but while doing a"DELETE ... WHERE..."
(I need to put a condition), but haven't found any way to do this...
Any advice would be welcome to transform a
DELETE FROM Sales WHERE toDelete='1'
to something more partitioned & possibly transaction log free.