Delete all rows except 100 most recent ones

10,382

Solution 1

You can use one of the following:

-- offset clause
WITH goners AS (
    SELECT *
    FROM Logs
    ORDER BY DateTime DESC
    OFFSET 100 ROWS 
)
DELETE FROM goners

-- numbered rows
WITH goners AS (
    SELECT ROW_NUMBER() OVER(ORDER BY DateTime DESC) AS rn, Logs.*
    FROM Logs
)
DELETE FROM goners
WHERE rn > 100

-- nth value
-- note that this "keeps" all rows that tie for last place
DELETE FROM Logs
WHERE DateTime < (
    SELECT MIN(DateTime)
    FROM (
        SELECT TOP 100 DateTime
        FROM Logs
        ORDER BY DateTime DESC
    ) AS x
)

Solution 2

While I agree with others that this is probably not the way to go, here's a way to do it anyway:

;WITH keepers AS
(   SELECT TOP 100 [DateTime]
    FROM dbo.Logs
    ORDER BY [DateTime] DESC )
DELETE FROM dbo.Logs a
WHERE NOT EXISTS ( SELECT 1 FROM keepers b WHERE b.[DateTime] = a.[DateTime] )

Solution 3

Instead of using NOT EXISTS, just use >=:

WITH keepers AS (
    SELECT TOP 100 [DateTime]
    FROM dbo.Logs
    ORDER BY [DateTime] DESC
   )
DELETE FROM dbo.Logs a
    WHERE l.DateTime < (SELECT MIN([DateTime]) FROM keepers);

I'm not sure if there are lock settings where new rows could be added in while the delete is running. If so, this would still be safe for that.

You can actually simplify this in SQL Server 2012+:

DELETE FROM dbo.Logs a
    WHERE l.DateTime < (SELECT [DateTime] 
                        FROM dbo.logs
                        ORDER BY [DateTime]
                        OFFSET 99 FETCH FIRST 1 ROW ONLY
                       );

Solution 4

This works for me:

;with cte as(select top(select count(*) - 100 from table) * from table order by dt)
delete from cte
Share:
10,382
סטנלי גרונן
Author by

סטנלי גרונן

Embedded C/C++ programmer. I recently started to use Qt5, very nice I think! I also program bare metals like Atmel AVR, ARM CortexM3, M4 and similar, with or without OS.

Updated on June 05, 2022

Comments

  • סטנלי גרונן
    סטנלי גרונן almost 2 years

    I am using MS SQL database and I have a table named "Logs" that looks like this:

    enter image description here

    Records are added here one after another so the table becomes quite big after few days/weeks.

    What I need to do periodically is a little bit of cleanup. I.e. I need an SQL query that would delete older rows and keep only the most recent 100 records in this table.

    I agree it would be better to have it "delete records older than some_date"... It was like this before, but the client wants it different :( So... here we are.

    BTW: I'm a little bit disappointed about people throwing negatives on this question. Is there something wrong with it or what?... Imagine: this question produced already 4 answers!!! ... and one guy decided to throw negative on that! Now I really don't know what to think... Strange people around here :(

  • Matt
    Matt almost 9 years
    Couldn't the latter work with the [DateTime] column as is too?
  • סטנלי גרונן
    סטנלי גרונן almost 9 years
    It doesn't work : it says "Msg 156, Level 15, State 1, Line 1 Incorrect syntax near the keyword 'NOT'."
  • סטנלי גרונן
    סטנלי גרונן almost 9 years
    Ok, it works after some minor corrections, yet I keep counting 101 records each time I execute the query...
  • סטנלי גרונן
    סטנלי גרונן almost 9 years
    Ok, it works after some minor corrections (missing parenthesis, etc.)
  • סטנלי גרונן
    סטנלי גרונן almost 9 years
    @SalmanA : What if there are more records with DateTime = MIN(DateTime) ...? I guess it is possible to have less than 100 rows after executing your query (the first one I'm talking about). Right?
  • Matt
    Matt almost 9 years
    Any duplicate [DateTime] values?
  • סטנלי גרונן
    סטנלי גרונן almost 9 years
    Hmmm, possible! Now it;s too late to check... I have run lots of queries on that table... it's messed up, but I suppose you're possibly right :)
  • Salman A
    Salman A almost 9 years
    @groenhen: yes, the first approach does not guarantee that exactly 100 records will be kept. It is possible to have more than 100 rows (since we are using <) when e.g. row #100, 101 and 102 have same datetime.
  • Salman A
    Salman A almost 9 years
    @groenhen see revised answer
  • סטנלי גרונן
    סטנלי גרונן almost 9 years
    Yet I think it wont work if at some moment count(*) is less than 100, i.e. selecting a NEGATIVE number of rows/records...
  • NitinSingh
    NitinSingh over 4 years
    Would this work when the amount of data is about 40GB, I inherited a huge data where maximum is a single log table never cleaned for 2yr. Now have to clean this and save only a month data, moving rest to archive (via another copy)
  • Gordon Linoff
    Gordon Linoff over 4 years
    @NitinSingh . . . Yes. You would want indexes to speed the queries and if many rows are being deleted, you might want to take a different approach.