What are inodes good for?

Solution 1

Hard links are beside the point. They are not the reason to have inodes; they're a byproduct: basically, any reasonable unix-like filesystem design (and even NTFS is close enough on this point) gets hard links for free.

The inode is where all the metadata of a file is stored: its modification time, its permissions, and so on. It is also where the location of the file data on the disk is stored. This data has to be stored somewhere.
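
All of this is visible through the stat(2) interface. A minimal sketch in C (standard POSIX calls, nothing specific to any one filesystem):

```c
/* Print a few pieces of inode metadata via stat(2). */
#include <stdio.h>
#include <stdint.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <time.h>

int main(int argc, char **argv)
{
    struct stat st;

    if (argc != 2) {
        fprintf(stderr, "usage: %s file\n", argv[0]);
        return EXIT_FAILURE;
    }
    if (stat(argv[1], &st) == -1) {
        perror("stat");
        return EXIT_FAILURE;
    }

    printf("inode number: %ju\n", (uintmax_t)st.st_ino);
    printf("permissions:  %04o\n", (unsigned)(st.st_mode & 07777));
    printf("link count:   %ju\n", (uintmax_t)st.st_nlink);
    printf("size (bytes): %jd\n", (intmax_t)st.st_size);
    printf("modified:     %s", ctime(&st.st_mtime));
    return EXIT_SUCCESS;
}
```

Note that the file name serves only as the lookup key; everything printed comes out of the inode.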

Storing the inode data inside the directory carries its own overhead. It makes the directory larger, so that obtaining a directory listing is slower. You save a seek for each file access, but each directory traversal (of which several are needed to access a file, one per directory on the file path) costs a little more. Most importantly, it makes it a lot more difficult to move a file from one directory to another: instead of moving only a pointer to the inode, you need to move all the metadata around.
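
For a sense of how little the directory itself needs to hold, the Seventh Edition directory entry was essentially the following 16 bytes (struct and field names paraphrased from the historical layout):

```c
/* A Seventh Edition style directory entry: the directory stores
 * nothing but this pair; all other metadata lives in the inode. */
struct direntry {
    unsigned short d_ino;      /* inode number; 0 marks a free slot */
    char           d_name[14]; /* file name, NUL-padded             */
};
```

Moving a file to another directory writes one such record in the destination and frees one in the source; the inode, and the file data it points to, never move.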

Unix systems always allow you to rename or delete a file, even if a process has it open. (On some unix variants, make that “almost always”.) This is a very important property in practice: it means that an application cannot “hijack” a file. Renaming or removing the file doesn't affect the application: it can continue reading and writing to the file. If the file is deleted, the data simply remains around until no process has the file open anymore. This is facilitated by associating the process with the inode rather than the file name: the process cannot be associated with the file name, since that may change or even disappear at any time.
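
This is easy to demonstrate: unlink a file while a descriptor to it is still open, and the data remains fully usable. A minimal sketch (error handling mostly omitted for brevity):

```c
#include <stdio.h>
#include <fcntl.h>
#include <unistd.h>

int main(void)
{
    char buf[32];
    int fd = open("scratch.txt", O_RDWR | O_CREAT | O_TRUNC, 0644);

    if (fd == -1)
        return 1;

    write(fd, "still here\n", 11);
    unlink("scratch.txt");                      /* the name is gone ... */

    lseek(fd, 0, SEEK_SET);
    ssize_t n = read(fd, buf, sizeof buf - 1);  /* ... the data is not  */
    buf[n > 0 ? n : 0] = '\0';
    fputs(buf, stdout);                         /* prints "still here"  */

    close(fd);                       /* last reference dropped: only now
                                        can the kernel free the inode   */
    return 0;
}
```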

See also What is a Superblock, Inode, Dentry and a File?

Solution 2

You might have missed out on a few other factors. The separation of inode from file name in a directory enables hard links, but I doubt that hard links constitute the only, or even the original, motivation for that separation.

I have a copy of "The Bell System Technical Journal", July–August 1978. This is one of their special issues, titled "The UNIX Time-Sharing System". That's the place where Thompson, Ritchie and company published their description of version 6 and 7 Unix, and what you could do with it.

I see descriptions of inodes and the file system, but no motivations for the design. Ritchie and Thompson note that a create (their spelling in the BSTJ) system call makes the inode and sets its values, while the open system call fills in OS tables that hold the inode data for subsequent file accesses.
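
That call survives today as creat(2). A minimal sketch of the sequence they describe (the file name is just an example):

```c
/* creat(2) allocates a fresh inode and sets its initial values;
 * fstat(2) then reads the metadata back through the kernel's
 * in-memory inode table rather than through the directory. */
#include <stdio.h>
#include <stdint.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/stat.h>

int main(void)
{
    struct stat st;
    int fd = creat("newfile", 0644);   /* a new inode comes into existence here */

    if (fd == -1)
        return 1;
    fstat(fd, &st);
    printf("allocated inode %ju\n", (uintmax_t)st.st_ino);
    close(fd);
    return 0;
}
```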

One of the key paragraphs talks about the chunk of disk holding inodes, which they called the i-list:

The notion of the i-list is an unusual feature of UNIX. In practice, this method of organizing the file system has proved quite reliable and easy to deal with. ... It also permits a quite simple and rapid algorithm for checking the consistency of a file system, ... This algorithm is independent of the directory hierarchy, because it need only scan the linearly organized i-list.

        — "The UNIX Time-Sharing System", BSTJ, July–August 1978
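
The check they allude to can be sketched as a single linear pass over the i-list that flags any data block claimed by more than one inode. The structures below are toy stand-ins, not a real on-disk format:

```c
#include <stdbool.h>
#include <stdio.h>

#define NINODES  64   /* size of the toy i-list            */
#define NADDRS    8   /* block pointers per inode          */
#define NBLOCKS 128   /* total data blocks on the toy disk */

struct inode {
    int nlink;          /* 0 means the inode is unallocated */
    int addr[NADDRS];   /* block numbers; -1 = unused slot  */
};

/* The scan never touches a directory: it is linear over the
 * i-list, exactly as the quoted passage describes. */
static bool ilist_consistent(const struct inode ilist[])
{
    bool claimed[NBLOCKS] = { false };
    bool ok = true;

    for (int i = 0; i < NINODES; i++) {
        if (ilist[i].nlink == 0)
            continue;                       /* free inode, skip it */
        for (int j = 0; j < NADDRS; j++) {
            int b = ilist[i].addr[j];
            if (b < 0)
                continue;
            if (claimed[b]) {
                printf("block %d claimed twice (inode %d)\n", b, i);
                ok = false;
            }
            claimed[b] = true;
        }
    }
    return ok;
}

int main(void)
{
    struct inode ilist[NINODES] = { { 0 } };

    for (int i = 0; i < NINODES; i++)
        for (int j = 0; j < NADDRS; j++)
            ilist[i].addr[j] = -1;

    /* Two inodes both claiming block 11: a deliberate inconsistency. */
    ilist[1] = (struct inode){ 1, { 10, 11, -1, -1, -1, -1, -1, -1 } };
    ilist[2] = (struct inode){ 2, { 11, -1, -1, -1, -1, -1, -1, -1 } };

    puts(ilist_consistent(ilist) ? "consistent" : "inconsistent");
    return 0;
}
```

A real fsck also cross-checks the free list and the link counts, but the shape is the same: the i-list alone tells you which blocks should be in use, independent of the directory tree.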

The original designers found that keeping inodes separate from data made things more reliable.

I believe we can point to years of experience making that true. MS-DOS and other filesystems where the directory hierarchy is all mixed up with the allocation (FAT) are pretty fragile. If a sector in the FAT goes bad, things are really difficult to recover. But if a sector in a Unix directory goes bad, then the inode, somewhere else on the disk, is still around, with a link-count that indicates it belongs to some directory, and thus can be recovered.

It looks like the obvious overhead of looking in a directory for an inode number, then looking in the inode for permissions or data, is compensated for by simpler OS handling of files, by greater reliability, or by both.

Comments

  • maaartinus over 1 year

    I wonder if storing the information about files in inodes, instead of directly in the directory, is worth the additional overhead. It may well be that I'm overestimating the overhead or overlooking some important thing, but that's why I'm asking.

    I see that something like "inodes" is necessary for hardlinks, but if the overhead really is as big as I think, I wonder whether any of the following reasons justifies it:

    • using hardlinks for backups is clever, but the efficiency of backups is not important enough compared to the efficiency of normal operations
    • having neither a speed nor a size penalty for hardlinks can't really matter, as this advantage holds only for the few files making use of hardlinks, while access to every file suffers the overhead
    • saving some space for a couple of binaries that are the same program under different names, like bunzip2 and bzcat, is negligible

    I'm not saying that inodes/hardlinks are bad or useless, but can that justify the cost of the extra indirection (caching surely helps a lot, but it's no silver bullet)?

    • Kevin over 11 years
      Please explain, just how much do you think the overhead is?
  • djf over 11 years
    You could delete such files simply by quoting the name, e.g. rm '~*'; you don't need inodes for that.
  • user1146332 over 11 years
    I think that is what single quotes are there for. You don't need to escape anything within single quotes.
  • maaartinus over 11 years
    I meant "file" in the sense of "file or directory or whatever". What you write is true, but I'd rather call it a nuisance than a usage.
  • G-Man Says 'Reinstate Monica' about 4 years
    I’m not sure your penultimate paragraph really makes much sense. If a sector in the i-list goes bad, you’ve lost several files.  Sure, you might know the names of the files that have been lost; how helpful is that?
  • Admin about 4 years
    @G-ManSays'ReinstateMonica' - You've got a good point. But the original FAT filesystem was prone to getting trashed. Got a better explanation?
  • G-Man Says 'Reinstate Monica' about 4 years
    Not per se, but the early Unix systems (e.g., version 6, in the mid-late 1970s) were notorious for suffering filesystem damage whenever the system crashed (whether because of an internal error (panic) or losing power without being shut down), and yet the Unix v6 filesystem was fundamentally similar to later versions — specifically, it had directories that were just lists of filenames and inode numbers, and inodes that were separate from data blocks.   … (Cont’d)
  • G-Man Says 'Reinstate Monica' about 4 years
    (Cont’d) …  Things got better when they changed the filesystem code to do disk writes in a more intelligent order — e.g., remove a block from the free list before allocating it to an inode; allocate and write an inode before writing a directory entry that points to it — and made incremental changes to the filesystem structure — e.g., storing multiple copies of the superblock.   Maybe the FAT filesystem was just coded sloppily, and had more subtle design flaws.