Drop table or truncate table in Cassandra, which is better

10,968

Solution 1

I think for almost all cases Truncate is a safer operation than a drop recreate. There have been several issues with dropping/recreating in the past with ghost data, schema disagreement, ect... Although there have been a number of fixes to try to make drop/recreate more stable, if its an operation you are performing every day Truncate should be much cheaper and more stable.

Solution 2

Drop table drops the table and all data. Truncate clears all data in the table, and by default creates a snapshot of the data (but not the schema). Efficiency wise, they're close - though truncate will create the snapshot. You can disable this by setting auto_snapshot to false in cassandra yaml config, but it is server wide. If it's not too much trouble, I'd drop and recreate table - but I've seen issues if you don't wait a while after drop before recreating.

Solution 3

Source : https://support.datastax.com/hc/en-us/articles/204226339-FAQ-How-to-drop-and-recreate-a-table-in-Cassandra-versions-older-than-2-1

NOTE: By default, snapshots are created when tables are dropped or truncated. This will need to be cleaned out manually to reclaim disk space.

Tested manually as well.

Truncate will keep the schema though, drop will not.

Solution 4

Beware!

From datastax documentation: https://docs.datastax.com/en/archived/cql/3.3/cql/cql_reference/cqlTruncate.html

Note: TRUNCATE sends a JMX command to all nodes, telling them to delete SSTables that hold the data from the specified table. If any of these nodes is down or doesn't respond, the command fails and outputs a message like the following: truncate cycling.user_activity; Unable to complete request: one or more nodes were unavailable.

Unfortunately, there is nothing on the documentation saying if DROP behaves differently

Share:
10,968
Ankur
Author by

Ankur

Updated on June 05, 2022

Comments

  • Ankur
    Ankur almost 2 years

    We have a use case where we need to re-create a table every day with current data in Cassandra. For this should we use drop table or truncate table, which would be efficient? We do not want the data to be backed up etc?

    Thanks Ankur

  • Ankur
    Ankur over 9 years
    Thanks a lot RussS! One question I have is if we use truncate will it create tombstones or remove the data immediately?
  • Ankur
    Ankur over 9 years
    Thanks a lot Ashic! As you mentioned we also face this issue that even after dropping the table when we try to re-create it will not let us do that for significant time ~5 mins or so.
  • ashic
    ashic over 9 years
    It's immediate. No tombstones.
  • ashic
    ashic over 9 years
    I recommended dropping as it being a one a day op, I imagined you might be able to afford a long delay between drop and create. If not, do a truncate, but remember to clear the snapshot, unless you disable auto snapshots.
  • Ankur
    Ankur over 9 years
    Thank you Ashic. I tried using truncate table but even after the cql completed successfully, I was able to see records in the table. Then I tried drop table and as you mentioned even that was not effective even after an hour. I am willing to use truncate but it doesn't seeem to be immediate.
  • ashic
    ashic over 9 years
    How much data are you dropping? Truncate should be immediate. Are there records, say five minutes after truncating? When u say drop wasn't effective, what do you mean?
  • Ankur
    Ankur over 9 years
    We have close to 4 billion rows but the last run was not successful so I am not really sure how many rows were in the file. I had waited for 15-20 mins before running the drop table. Also with drop table, I meant it was not immediate too. I waited for 30 mins after running drop table and still was not able to create the table, so I ran nodetool repair.
  • ashic
    ashic over 9 years
    What happens if you drop the table, run a repair on the cluster, and then create the table?
  • Ankur
    Ankur over 9 years
    That works but when I ran repair on my cluster for some reason some of my nodes went down. I had to re-start them.
  • ashic
    ashic over 9 years
    Is your OS forcing the process to die? It's likely the OOM killer.