Cassandra nodes disk space usage above 90%

Solution 1

I think he said he deleted data from his one and only keyspace, so he now has tombstones instead of free space; the deleted data stays on disk until compaction purges it. Try to see here, and be wary of running a manual (major) compaction, since it may need free headroom on disk. If you've filled your disks too far for compaction to handle this, you could remove one node, wipe it, and bootstrap it again, then repeat this node by node around your cluster. Always leave headroom for compaction and for handling node failures, i.e. don't fill your nodes too much (maybe keep disk usage below 50-75%).
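The headroom advice can be checked on a node before starting a compaction. This is a minimal shell sketch, not a definitive procedure. It assumes the data directory is /var/lib/cassandra/data (a common default; yours may differ) and uses a 50% limit purely as an illustration. The names `parse_used_pct`, `used_pct`, `over_limit`, `DATA_DIR`, and `LIMIT` are hypothetical and introduced only for this example.

```shell
#!/bin/sh
# Hypothetical helper: read `df -P` output on stdin and print the
# used-space percentage of the first filesystem as a bare integer.
parse_used_pct() {
    awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Hypothetical helper: used-space percentage of the filesystem that
# holds the given directory.
used_pct() {
    df -P "$1" | parse_used_pct
}

# Hypothetical helper: succeed when the percentage exceeds the limit.
over_limit() {
    [ "$1" -gt "$2" ]
}

# Assumed data directory and limit; adjust both for your installation.
DATA_DIR="${DATA_DIR:-/var/lib/cassandra/data}"
LIMIT="${LIMIT:-50}"

if [ -d "$DATA_DIR" ]; then
    pct=$(used_pct "$DATA_DIR")
    if over_limit "$pct" "$LIMIT"; then
        echo "$DATA_DIR: ${pct}% used, above the ${LIMIT}% limit"
    else
        echo "$DATA_DIR: ${pct}% used, within the ${LIMIT}% limit"
    fi
fi
```

If the node reports usage above the limit, a major compaction may not have enough room to finish, which is the situation the node-by-node wipe and bootstrap is meant to work around.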

Solution 2

Warning: If you have any legitimate snapshots of keyspaces, this will clear those out as well (so you'll want to back those up).

On each of the nodes, run the following in a terminal:

nodetool clearsnapshot

After a keyspace is dropped, Cassandra still keeps its data around on disk as a snapshot until you explicitly tell it to clear that out.

http://docs.datastax.com/en/cassandra/2.1/cassandra/tools/toolsClearSnapShot.html
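Before clearing anything, it may help to see which snapshots exist and roughly how much space their directories occupy. The sketch below is illustrative only. It assumes the data directory is /var/lib/cassandra/data and that snapshots live in directories named `snapshots` under each table directory; `snapshot_kb` and `DATA_DIR` are hypothetical names introduced for this example.

```shell
#!/bin/sh
# Hypothetical helper: total size in KiB of every "snapshots" directory
# found under the given data directory (prints 0 if there are none).
snapshot_kb() {
    find "$1" -type d -name snapshots -prune -exec du -sk {} + 2>/dev/null |
        awk '{ total += $1 } END { print total + 0 }'
}

# Assumed data directory; adjust for your installation.
DATA_DIR="${DATA_DIR:-/var/lib/cassandra/data}"

# List snapshots as Cassandra sees them, if nodetool is on the PATH.
if command -v nodetool >/dev/null 2>&1; then
    nodetool listsnapshots
fi

# Rough on-disk total of the snapshot directories.
if [ -d "$DATA_DIR" ]; then
    echo "snapshot directories: $(snapshot_kb "$DATA_DIR") KiB"
fi
```

Snapshot files are hard links to SSTables, so the total is an upper bound: space is only freed for files whose live SSTable no longer exists, as is the case after a keyspace has been dropped.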

Author by

Sachin PK

Updated on September 18, 2022

Comments

  • Sachin PK over 1 year

    Hi, I'm running a 5-node DSE Cassandra cluster. Every node is at about 90% disk usage, so I deleted data from my keyspace (I have only one keyspace), but my disk usage is still 90%. Is there any way to regain the disk space of the deleted data?

    • LHWizard almost 8 years
      You shouldn't go over 50% disk usage, since some streaming operations, including compactions, can potentially rewrite all of the data. Of course, this all depends on your particular setup and compaction strategy. You can try to do a compaction on the smaller keyspaces. Check out this StackOverflow question.
  • Alexis Wilke over 5 years
    You may want to clearly specify that after the wipe and re-add you need to wait for the node to be 100% up to date. You wouldn't want someone to wipe their other nodes too soon and lose data! It also assumes that you have a replication factor of 2 or more (and I would hope people never use less than 3 as a replication factor!).
  • Alexis Wilke over 5 years
    This is true by default, although you can turn off that feature by editing the Cassandra settings (cassandra.yaml) and setting auto_snapshot: false. I personally always do that because the replication is what ensures my data is safe. If I delete by accident on a production server, I probably need heavy medical attention...
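To see how a node is currently configured, you can read the setting straight from cassandra.yaml, where the key is spelled `auto_snapshot`. This is a small sketch under assumptions: the config path /etc/cassandra/cassandra.yaml is only a guess at a typical package install (DSE and tarball installs keep the file elsewhere), and `auto_snapshot_value` and `CONF` are hypothetical names introduced for this example.

```shell
#!/bin/sh
# Hypothetical helper: print the value of the top-level auto_snapshot
# key in the given cassandra.yaml, or nothing if the key is not set.
auto_snapshot_value() {
    sed -n 's/^auto_snapshot:[[:space:]]*\([^[:space:]#]*\).*/\1/p' "$1"
}

# Assumed config location; adjust for your installation.
CONF="${CONF:-/etc/cassandra/cassandra.yaml}"

if [ -f "$CONF" ]; then
    echo "auto_snapshot is: $(auto_snapshot_value "$CONF")"
fi
```

An edit to cassandra.yaml typically only takes effect after the node is restarted, and snapshots taken before the change stay on disk until they are cleared.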