How does rm work? What does rm do?


Solution 1

My understanding is that 'files' are effectively just pointers to the memory location corresponding to the file's content.

Be careful with your terminology. The files (and pointers) are on disk, not in memory (RAM).

If you 'rm' a file, you certainly must be deleting that pointer.

Yes. What happens is heavily filesystem dependent. Some filesystems keep a bitmap of which blocks are free or busy, so rm would have to flip the bit for each block it frees. Other filesystems use more sophisticated methods of tracking free space.
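To make the bitmap idea concrete, here is a minimal toy sketch in Python. The `ToyDisk` class and its methods are hypothetical names invented for illustration; real filesystems keep this bookkeeping on disk and are far more involved.

```python
# Toy model only: not how any particular filesystem is implemented.
class ToyDisk:
    def __init__(self, num_blocks):
        self.blocks = [b""] * num_blocks      # block contents
        self.in_use = [False] * num_blocks    # free-space bitmap, one flag per block

    def allocate(self, data):
        """Store data in the first free block and return its index."""
        index = self.in_use.index(False)      # raises ValueError if the disk is full
        self.in_use[index] = True
        self.blocks[index] = data
        return index

    def free(self, index):
        """Mark a block as free. Only the bitmap changes; the data is not wiped."""
        self.in_use[index] = False


disk = ToyDisk(num_blocks=4)
block = disk.allocate(b"secret")
disk.free(block)

print(disk.in_use[block])   # False: the block may be reused
print(disk.blocks[block])   # b'secret': the old contents are still there
```

In this sketch the old bytes only disappear when a later `allocate` call happens to reuse the same block.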

which suggests that nothing is actually being overwritten...

Correct. You can find various "undelete" utilities, but depending on the filesystem, recovery can get rather complex. Stuff you saved years ago could still be sitting around, or it could have been overwritten. It all depends on minute details. For example, see e2fsprogs.

So, is deleting the pointer to a memory address the only thing rm does?

Well, it also has to remove the "directory entry" that gives metadata about the file. (On some filesystems, such as FAT, deletion just overwrites the first byte of the filename in that entry.)

Is the data still sitting there in a contiguous block like it was before?

Yes, the data is still there. But don't assume it is a contiguous block. Files can be fragmented all over the disk, with lots of pointers that tell the filesystem how to reassemble them. And if you are using RAID, things get really complex.
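As a rough illustration of fragmentation, the toy sketch below stores one file in three scattered blocks and reads it back by following a list of block pointers. The block size, the layout, and all names are made up for the example.

```python
# Toy model: a "disk" of fixed-size blocks and a file described by block pointers.
BLOCK_SIZE = 4

disk = [b""] * 8
content = b"hello, world"

# Hypothetical layout: the file's three blocks are scattered, not adjacent.
block_pointers = [5, 1, 6]
for i, block_index in enumerate(block_pointers):
    disk[block_index] = content[i * BLOCK_SIZE:(i + 1) * BLOCK_SIZE]

# Reading the file means following the pointers in order.
reassembled = b"".join(disk[i] for i in block_pointers)
print(reassembled)   # b'hello, world'
```

Without the pointer list, the three blocks are just unrelated fragments, which is part of why undelete tools can struggle.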

Solution 2

Yes. rm simply deletes the pointer. If you have multiple pointers to the file (hard links), then deleting one of those pointers with rm leaves the others completely untouched and the data still available.
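A small Python sketch of the hard-link behaviour, assuming a POSIX-like system whose temporary directory supports hard links. The file names are arbitrary, and `os.unlink` stands in for what rm does to a single name.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as workdir:
    original = os.path.join(workdir, "testfile")
    second = os.path.join(workdir, "anothertest")

    with open(original, "w") as f:
        f.write("thing\n")

    os.link(original, second)                   # create a second hard link
    links_before = os.stat(original).st_nlink   # number of names for this file

    os.unlink(original)                         # remove one name, as rm would
    links_after = os.stat(second).st_nlink

    with open(second) as f:
        surviving = f.read()                    # data is still reachable via the other link

print(links_before, links_after, repr(surviving))
```

The link count drops by one, but the content is untouched as long as at least one name remains.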

Deleting all of those links still does not touch the data; however, the OS is now free to reuse the blocks that were previously reserved for storing that data.

It's worth noting that any process which opens a file creates a file handle for it. This adds to the overall count of references to the file. If you delete all of the pointers from your filesystem, but the operating system still has a process running with an open file handle for your file, then the reference count will not be zero and the file will not really be deleted. Only when that final handle is closed will the filesystem register the disk space as released, and only at that point will the OS be free to overwrite the blocks previously reserved for storing that data.
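The open-handle case can be sketched the same way. This assumes POSIX semantics, where a file with an open handle can be unlinked; on Windows the `os.unlink` call would typically fail instead.

```python
import os
import tempfile

workdir = tempfile.mkdtemp()
path = os.path.join(workdir, "testfile")

with open(path, "w") as f:
    f.write("still here\n")

handle = open(path)     # keep an open handle on the file
os.unlink(path)         # remove the only name

name_gone = not os.path.exists(path)
data = handle.read()    # the data is still readable through the handle

handle.close()          # only now can the filesystem release the blocks
os.rmdir(workdir)

print(name_gone, repr(data))
```

The name disappears immediately, yet the process holding the handle can keep reading until it closes the file.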

You may or may not be able to recover that data at any point in the future depending on whether any reuse of the blocks in question has occurred.

Incidentally, you have no guarantee that your data is sitting there in a contiguous block in the first place.

Author by

DilithiumMatrix

Updated on June 06, 2022

Comments

  • DilithiumMatrix
    DilithiumMatrix almost 2 years

    My understanding is that 'files' are effectively just pointers to the memory location corresponding to the file's content. If you 'rm' a file, you certainly must be deleting that pointer. If rm actually 'erases' the data, I would guess each bit is written over (set to 0 or something). But I know that there are special procedures/programs (e.g. srm) to make sure that data isn't 'recoverable', which suggests that nothing is actually being overwritten...

    So, is deleting the pointer to a memory address the only thing rm does? Is the data still sitting there in a contiguous block like it was before?

  • DilithiumMatrix
    DilithiumMatrix about 10 years
    Interesting. How does the OS track if there are any active pointers to the block?
  • Jon
    Jon about 10 years
    Using inodes. This is all handled by the filesystem. If you create a file (echo 'thing' > testfile) and use ls -i testfile you'll see the inode responsible for pointing to the data. If you then create a hard link with another name (ln testfile anothertest) you'll see that they share the same inode (ls -i anothertest).
  • jlliagre
    jlliagre about 10 years
    You should add that the OS is free to reuse the blocks only if no process still has an open file pointer on the now fully unlinked file.
  • Jon
    Jon about 10 years
    Good point, that's probably worth clarifying since it's not obvious at first.
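The inode check described in the comments can also be done from Python. This is a sketch that assumes a POSIX-like filesystem; `st_ino` is the number that `ls -i` prints, and the file names mirror the ones used in the comment.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as workdir:
    first = os.path.join(workdir, "testfile")
    second = os.path.join(workdir, "anothertest")

    with open(first, "w") as f:
        f.write("thing\n")
    os.link(first, second)                  # same effect as: ln testfile anothertest

    inode_first = os.stat(first).st_ino     # inode number of the first name
    inode_second = os.stat(second).st_ino   # inode number of the second name
    same = os.path.samefile(first, second)  # compares device and inode

print(inode_first == inode_second, same)
```

Both names resolve to the same inode, which is how the filesystem knows they refer to one file.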