Are soft deletes a good idea?

107,577

Solution 1

I say it's a bad idea, generally (with some exceptions, perhaps).

First, your database should be backed up regularly, so you should never be in a situation where you would lose data permanently because of a DELETE (unless it's a deletion of just-added data, of course).

Second, a soft delete like this means you now have to include a WHERE IsDeleted = false clause in every query on this table (and so much worse if you're JOINing these tables). A mistake here would be caught as soon as a user or tester noticed a deleted record showing up again, which might take some time. Also, it would be easy for a developer to omit the WHERE clause from COUNT(*) queries, which might take even longer to discover (I worked on one project where this had been happening for years; not many records were ever "deleted", so the totals were close to what was expected and no one noticed).

Finally, a soft delete will work on a table with artificial keys, but potentially won't work on a table with a natural primary key (e.g. you "delete" someone from a table keyed by Social Security Number - what do you do when you need to add him back? Please don't say "include IsDeleted in a compound primary key".).

In a design review, I would expect the developer to demonstrate an awareness of the costs and benefits and to present an excellent reason for doing soft deletes in this manner. "Why not do it?" is not an excellent reason.

Solution 2

It's never a bad idea to avoid potential data loss.

I always soft-delete. In cases where the database needs to be scrubbed of one or more records, I generally employ either a two-step process of soft deletion and then emptying a "recycle bin" of records, or a document-management-style approach where document records can be aged away, and then go through an approval process prior to hard deletion.

Solution 3

It depends on the circumstances. I could see situations where you are legally required to truly delete something. Maybe someone has requested that their social security number be permanently removed from your system. Or maybe you have a duplicate record that you want to consolidate into a single record. Keeping the duplicate hanging around with a deleted flag might not be advantageous.

There is also one technical disadvantage: You can't do cascading deletions, which automatically clear out any references to the deleted data to prevent foreign key violations. This isn't necessarily a big issue, but it's something to keep in mind.

Otherwise, I think it's a good idea.

Solution 4

If you're going to use soft deletion, it's a good idea to have a deleted_date field, instead of an is_deleted field. You get a nice piece of extra data instead of just the bit field.

Solution 5

One of the major problem for soft delete is those unwanted data will potentially affects the db performance. Several years ago one of my Client requested me to do soft delete on all database items, my solution to that is to move all "deleted" items to a backup table, instead of leaving it to the current running tables.

Share:
107,577
001
Author by

001

Only questions with complete answers are accepted as solutions.

Updated on July 08, 2022

Comments

  • 001
    001 almost 2 years

    Are soft deletes a good idea or a bad idea?

    Instead of actually deleting a record in your database, you would just flag it as IsDeleted = true, and upon recovery of the record you could just flag it as False.

    Is this a good idea?

    Is it a better idea to physically delete the record, then move it to an archive database, and if the user wants the record back, then software will look for the record in the archive and recreate it?

  • hansvb
    hansvb about 14 years
    And some data MUST be deleted due to legal requirements.
  • Paul Kohler
    Paul Kohler about 14 years
    The system often still needs the data for integrity, auditing or history of change... soft delete away! Use cleanup processes for real deletions etc... PK :-)
  • scunliffe
    scunliffe about 14 years
    +1 for moving the data to another table!
  • armandino
    armandino about 14 years
    Good point. Haven't thought of that one.
  • MusiGenesis
    MusiGenesis about 14 years
    Good point re: cascading deletes.
  • Michael Petito
    Michael Petito about 14 years
    You might instead partition the table by "IsDeleted" to mitigate performance implications if its possible to do so in your db. Then you don't have to worry about two different tables.
  • KMån
    KMån about 14 years
    +1: I always soft-delete.
  • aehiilrs
    aehiilrs about 14 years
    sigh A database account used by an application? That would be lovely.
  • Andreas Bonini
    Andreas Bonini about 14 years
    Not that it would be useful for anything except human mistakes.. If I'm a malicious user and gain access to it I can still "DELETE" them by doing UPDATE table SET Field1 = 0, Field2 = 0, ... WHERE 1
  • Josh Smeaton
    Josh Smeaton about 14 years
    It's fairly trivial to create a view and materialize it if need be, that excludes all is_deleted records. All queries can then be run against the view. As far as recovering from a backup - say for something like re-enabling a user account - you really don't want to go running to the archives to restore their data. You'd like to set is_deleted = false. This is a fairly primitive (and unlikely) scenario as you'd probably use a disabled field - but the point is the same. For some cases, to be able to recover the deleted data instantly could be worthwhile.
  • MusiGenesis
    MusiGenesis about 14 years
    @Josh: everything in SQL programming is fairly trivial; it's the accumulation of lots of trivial things that starts to cause problems. As I said in my last sentence, I would expect a developer to have a good reason for increasing the complexity of the system in this way. I'm disappointed but not at all surprised to see so many developers say they do this as a matter of course, regardless of the actual needs of their application.
  • MusiGenesis
    MusiGenesis about 14 years
    And some data MUST be deleted due to illegal requirements. :)
  • Stephanie Page
    Stephanie Page about 14 years
    @MG, "unless it's a deletion of just-added data, of course", Your RDBMS doesn't have a transaction log? you can't do point in time recovery?
  • Stephanie Page
    Stephanie Page about 14 years
    Let me support MG. You have a timestamp field. You frequently range scan it via an index... everything between sysdate -2 and sysdate. If you soft delete frequently that range scan could return a high percentage of rows that will have to be filtered out. Filtering = expensive, range scan = cheap.
  • Stephanie Page
    Stephanie Page about 14 years
    @Josh, if you're going to materialize the view, why not just make it a table from the state. Mviews are a good way to fix something that's broken - I'd hesitate to design that way from the start.
  • Josh Smeaton
    Josh Smeaton about 14 years
    @steph I was explicitly referring to coding against the exclusion of is-deleted rows. It'd be much nicer to not have to remember to exclude the column in every where clause. In general I agree with you though.
  • MusiGenesis
    MusiGenesis about 14 years
    Wouldn't it be easier to prevent cascading deletes by un-checking the "CASCADE DELETES" box (or however it's done in your database of choice)?
  • Keith Williams
    Keith Williams about 14 years
    +1 for using DateTime fields instead of Bits... the number of old systems I have to manage with an "IsSomething" and "DateSomethinged" is annoyingly high...
  • 001
    001 about 14 years
    Thats a very interesting feature, thanks for that.
  • MusiGenesis
    MusiGenesis over 13 years
    @UpTheCreek: I think DanM's point was that if you're doing soft-deletes, you have to handle the cascading deletions yourself. If you're using hard-deletes (what I would call "normal") you can let the DB automatically do the cascading deletes itself.
  • Vincent
    Vincent almost 13 years
    How do you deal with Report when you are using the hard delete. E.g. a common eCommerce site, once product has been deleted, how are you going to display the historical order report?
  • Neil McGuigan
    Neil McGuigan over 12 years
    so you don't have to check for where deleted=0 on every query, you can make a view for the table that does that for you. you can then make a view called recycle_bin or similar that shows only deleted records (doing a union on tables that do soft deletes on their common fields)
  • Travis J
    Travis J about 12 years
    @JoshSmeaton - This is why most systems partition their tables on the deletion flag.
  • Travis J
    Travis J about 12 years
    @MusiGenesis - And what about cascade restores?
  • Josh Smeaton
    Josh Smeaton about 12 years
    @TravisJ I no longer agree with the concept of soft deletes in the majority of cases. 'Just incase' isn't a good reason for soft deletes I think.
  • Paul Fleming
    Paul Fleming about 11 years
    Isn't this more of a "suspend" than a "delete"? So I would argue that "suspended" records is appropriate when you need historic data and "deleting" records is appropriate when you do not need historic data (or means to easily reverse a delete).
  • StartupGuy
    StartupGuy about 11 years
    That is a VERY presumptive answer. Without having all details for every situation how can you conclude that there is absolutely never any reason for using soft-deletes? Never say never. It shows you lack imagination.
  • StartupGuy
    StartupGuy about 11 years
    Very true. I sell the idea in terms of it's actual implementation and may name the fields "hidden_on" (timestamp) and "hidden_by" (user id). This is because the record is just supposed to be hidden from the end-user but not the administrators. For administration all records must still be visible as per normal except that records that are "hidden" (deleted) for users would be marked as such to the admins. If I had an archive table for every standard table in my DB my schema would be twice as large. Yuk.
  • StartupGuy
    StartupGuy about 11 years
    +1 for starting with "It depends.."
  • StartupGuy
    StartupGuy about 11 years
    +1 for using a safe, hybrid, 2-step process
  • rjha94
    rjha94 about 11 years
    And without sharing your details - are you not making assumptions? soft-deletes are not good and I stand by my statement. share your case and I will point out your flaws.
  • StartupGuy
    StartupGuy almost 11 years
    Mmmm, I would say the popular consensus on this page, shows that you stand alone. Not only do people here document different scenarios for where it may be employed, but also practical ways to make it useful. If you choose to suggest that all these people require a "brain checkup", that is your prerogative and it reflects on you.
  • rjha94
    rjha94 almost 11 years
    That is fine. Life is about taking stand and validating your stand with evidence. It is not always about going with the majority. Sorry, I do not want to take any flame baits. I have seen people writing queries doing FTS and even that works! I guess all depends on where you set your bar in life. If the bar is not high, anything works.
  • Danny Beckett
    Danny Beckett almost 11 years
    Also worth mentioning: a UNIQUE index in combination with soft deletes can make for a mess. E.g. trying to register a deleted username will fail, even though it is not in use.
  • Brandon Wittwer
    Brandon Wittwer over 10 years
    We have recently been asked to create the isDeleted bit as a column directly in the table... requiring it's value to be managed. We previously had this bit in the view, calculated based on the business rule of deleteDate IS NOT NULL and deleteDate < Getdate(). Now I'm begrudgingly complying.
  • Josh Smeaton
    Josh Smeaton over 10 years
    @BrandonWittwer, why? If you had a first_name column, and a last_name column, would you also have a full_name column just because the application wanted had a full_name attribute, or would you compute it in the app? By the way, the deleteDate < Getdate() should be redundant. Also, without knowing your backend, perhaps you could implement a computed column so you don't have to manage maintaining the integrity of two separate columns yourself. technet.microsoft.com/en-us/library/ms191250(v=sql.105).aspx
  • Nicolai Shestakov
    Nicolai Shestakov over 10 years
    +1 for not using soft deletes unless you really have reasons to do that. What is not mentioned here is that referential integrity stops working the right way with soft deleted rows. If you mark row as deleted you should mark every child rows as deleted as well and if you don't do that DBMS will not prevent such logical deletion. So "not having problems with cascade delete" is not a case here. Why don't you remove the whole FK then? It would also allow you to delete rows without checking for the referenced records.
  • Nicolai Shestakov
    Nicolai Shestakov over 10 years
    @Vincent, if you need to use deleted data in your app, you're not dealing with delete operation, it has another business meaning. In case "IsDeleted" is used only for archiving purpose, I would use separate table. And if you are worried about possible data loss on wrong deletes, what about wrong updates? Maybe you should consider using fully functional audit in this case?
  • Stefan Steiger
    Stefan Steiger about 10 years
    And that's exactly why it doesn't work. You're assuming every action is an insert. But what about updates ? They won't be audited, and you can't see who performed either the update or the insert this way. Soft deletes are a cheap (and disfunctional) way to not have to put in the work to build a working revision control.
  • Erran Morad
    Erran Morad about 10 years
    @MusiGenesis - are there any examples of where soft-delete can be disastrous ?
  • Jon Davis
    Jon Davis almost 10 years
    I agree with the notion that "it depends" and "by default, don't soft-delete". There are other strategies to consider, one of which is replication to a soft-deleted archive table updated via triggers. The biggest reasons NOT to soft-delete are performance (filtering), storage space (grow-only behavior), noise (out of 2,423,023 records, only two records are legit), and hassle (setting up views, always being wary to avoid hard deletes, etc). Hard deleting makes life easier and simpler. Not to mention cascade delete feature is built into SQL Server which can make life even easier.
  • Brandon Wittwer
    Brandon Wittwer almost 10 years
    @JoshSmeaton, you've exactly framed my frustration. Data which can be constituted entirely from other data in the same record should DEFINITELY be part of a view. If materialization matters (as a performance consideration), a computed column may be considered. Of course, in this case, a computed isn't possible, because the getDate() in there...non-deterministic.
  • Brandon Wittwer
    Brandon Wittwer almost 10 years
    and technically SomeStatusesDateField < GetDate() is necessary when you can postdate the effect... like PublishDate = three weeks from now... isPublished is false until then.
  • Wanny Miarelli
    Wanny Miarelli about 9 years
    Hard delete is not the only way to go. There are scenarios where you cant just hard-delete the records. Sometimes i need a reference of the deleted records ( linked inside another table ) even if it has been deleted, this is done by soft-deleting it. Moving the records inside " Archive table " is not as simple as you describe it, it will not stop at a simple move from a to b.
  • malhal
    malhal almost 9 years
    4A is a very, very good point
  • rjha94
    rjha94 almost 9 years
    At one point, popular consensus was that Sun goes around Earth and people were burnt on stake for saying otherwise. if you really think soft deletes are a good idea then you have not managed any complex databases over a period of time. whatever you are trying to achieve by doing soft-deletes can be achieved w/o soft delete too in a clean way. So far people have just whined about being presumptive but no one has s upplied any data. Soft deletes are Lazy solution to a problem that should be done with archiving. why is the data hanging around in this table if you don't need it?
  • Zano
    Zano over 8 years
    I really didn't expect the Galileo gambit on Stack Overflow, but hey, here it is.
  • rjha94
    rjha94 over 8 years
    @Zano if people count popular votes and don;t supply any rational argument then it is a Galileo-isque situation only. However I don;t have to recant because puny stack overflow users don;t have Pope;s influence. soft-deletes are a bad idea. The right way is archiving.
  • Primoz Rome
    Primoz Rome over 7 years
    How do you handle relations of the soft-deleted data? For example you delete a record in company_departments table. Then you have users table and there is a FK department_id in there. Would you in this case update users table and set department_id FK to null where it matches the soft deleted department?
  • Yinda Yin
    Yinda Yin over 7 years
    @PrimozRome: Soft deletes do not require that. They only require that you set the flag. If you want to purge older records, you do it in the usual way.
  • Primoz Rome
    Primoz Rome over 7 years
    @RobertHarvey yeah I understand that, but if I have soft deleted a record ant that record's ID is a foreign key to some other table, that table needs to update that foreign key right?
  • BCA
    BCA over 7 years
    @rjha94: perhaps I'm missing something, but aren't hard deletes impossible in some (if not many) scenarios? You stated that "whatever you are trying to achieve by doing soft-deletes can be achieved w/o soft delete too in a clean way", but have not given any example on how to deal with the classic case of deleting a Customer when there is a Transactions table with many records pointing back to it. You can't just cascade delete all the transactions that customer made. Nor can you hard delete invoices because a product is hard-deleted. So, what is the clean solution here?