What is the "directory order" of files in a directory (used by `ls -U`)?

Solution 1

It depends on the filesystem. For some filesystems (ext3 among them), a directory is actually a file with a well-known format and the 'd' bit set in its permissions or mode. In that case, the history of which filenames, and of what lengths, have been created and deleted can matter: the kernel fills in the first entry in the directory file that has enough room to hold the new file's name. See http://e2fsprogs.sourceforge.net/ext2intro.html for more detail, in the section titled "Physical Description".
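To make the "first entry with enough room" idea concrete, here is a toy model in Python. It is only a sketch: `ToyDirectory` and `entry_size` are hypothetical names, the 8-byte header and 4-byte name padding are modelled on the ext2 entry layout, and real filesystems also merge free space, split blocks, or use hashed indexes, none of which is simulated here.

```python
def entry_size(name: str) -> int:
    """Bytes a toy entry needs: 8-byte header plus the name padded to 4 bytes."""
    return 8 + (len(name.encode()) + 3) // 4 * 4


class ToyDirectory:
    """Linear directory file: a list of slots filled first-fit (simplified)."""

    def __init__(self):
        self.slots = []  # each slot is [capacity_in_bytes, name or None]

    def create(self, name: str) -> None:
        need = entry_size(name)
        # Reuse the first free slot that is large enough for the new name.
        for slot in self.slots:
            if slot[1] is None and slot[0] >= need:
                slot[1] = name
                return
        # Otherwise grow the directory file at the end.
        self.slots.append([need, name])

    def delete(self, name: str) -> None:
        for slot in self.slots:
            if slot[1] == name:
                slot[1] = None  # the space stays behind as a free slot
                return
        raise FileNotFoundError(name)

    def listing(self) -> list:
        """Names in slot order, i.e. the toy equivalent of directory order."""
        return [name for _, name in self.slots if name is not None]


if __name__ == "__main__":
    d = ToyDirectory()
    for name in ["a", "long_file_name", "b"]:
        d.create(name)
    d.delete("long_file_name")
    d.create("c")            # fits in the hole left by long_file_name
    print(d.listing())       # ['a', 'c', 'b']
    d.delete("a")
    d.create("another_long_name")  # too long for a's old slot, so appended
    print(d.listing())       # ['c', 'b', 'another_long_name']
```

In this model the listing order depends on both name lengths and deletion history, not on creation time alone.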

For some other filesystems, Reiserfs among them, a directory is actually just some entries in a B+ tree that's not visible in the filesystem, so a plain ls of a directory in a Reiserfs filesystem is in lexical order.

Solution 2

Indeed, there is no specific order to expect. It's up to the OS and filesystem implementation to order the entries however it likes. One goal of this option is to get the fastest possible listing, which can be a significant factor with very large directories.

Solution 3

It is the order in which the entries are stored internally in the filesystem. This will vary from filesystem to filesystem. For instance, the entries may be stored in some kind of balanced tree, such as a red-black tree. There may be further optimizations for directories with a small number of entries, or to deal efficiently with additions and removals.
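A small Python sketch of the difference between taking entries as the filesystem returns them and sorting them yourself. The helper names `directory_order` and `lexical_order` are made up for illustration; `os.scandir` is documented as yielding entries in arbitrary order, which should roughly correspond to what `ls -U` prints.

```python
import os
import tempfile


def directory_order(path):
    """Names in whatever order the OS returns them (unsorted)."""
    with os.scandir(path) as entries:
        return [entry.name for entry in entries]


def lexical_order(path):
    """Names sorted explicitly; use this when the order matters."""
    return sorted(directory_order(path))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        # Same creation sequence as in the question: 2, then 1, then 3.
        for name in ("2", "1", "3"):
            open(os.path.join(tmp, name), "w").close()
        print("directory order:", directory_order(tmp))  # filesystem-dependent
        print("lexical order:  ", lexical_order(tmp))
```

The first line of output can differ between filesystems and even between runs on different machines; only the second is predictable.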

Author by

Stefan

Updated on September 18, 2022

Comments

  • Stefan
    Stefan over 1 year

    According to the man page for ls, ls -U means:

    do not sort; list entries in directory order.

    What does "directory order" mean, and how is it determined?

    The following test (executed on an ext3 file system), shows that it is not the order in which the files were created:

    root@sv1010vm0007:/tmp# mkdir test
    root@sv1010vm0007:/tmp# touch test/2
    root@sv1010vm0007:/tmp# touch test/1
    root@sv1010vm0007:/tmp# touch test/3
    root@sv1010vm0007:/tmp# ls -U test
    2  3  1
    
  • LawrenceC
    LawrenceC almost 13 years
    There are scheduling/caching algorithms in the kernel and filesystem drivers that influence when exactly data is written to disk. This is done to increase performance. Because of this optimization you can't really tell exactly when writes will happen. Also, old inodes in filesystems might be reused so new files can appear in directory slots where old files were. So the order of creation isn't necessarily "directory order."
  • Alen Milakovic
    Alen Milakovic almost 13 years
    @Bruce: So, what are the contents of this "directory" file exactly?
  • Admin
    Admin almost 13 years
    Traditionally, something very much like struct dirent, which I find defined in /usr/include/bits/dirent.h on a RHEL box and on a Slackware 11.0 box. Both of those machines refuse to open() a directory directly. I know that I used to do things like "cat . > dot.as.file" to convert a directory to a regular file; the last time I did it for sure was on Solaris 8, I think. Basically, struct dirent contains an inode (a number), a record length, a name length, and a string, which, as I recall, may or may not be ASCII-NUL terminated.
  • Alen Milakovic
    Alen Milakovic almost 13 years
    @Bruce: Ok. And this C struct manages to get all the information about the files and subdirectories it contains into those fields?
  • Admin
    Admin almost 13 years
    A directory in filesystems like ext2, or BSD FFS or the original Unix filesystem, just contained a list of inode numbers and corresponding file names. Invoking "ls" without arguments just earned you a list of file names. If you did "ls -l", "ls" itself would look up every file name by doing a stat(2) system call on the file name, and from the struct stat, get permissions, size, "file type", etc. So, no, a directory file doesn't have all the information, just a list of names.
  • Gilles 'SO- stop being evil'
    Gilles 'SO- stop being evil' almost 13 years
    @Bruce: ext3/ext4 (with the dir_index option) also uses a B-tree variant to store directories. Several unices, including (at least some versions of) Solaris and *BSD, have opendir(3) call open(2), but on Linux, as far as I can remember, open(2) has refused to open directories.
  • Alen Milakovic
    Alen Milakovic almost 13 years
    @Bruce: thanks for the explanation. One often forgets what details go into these low-level implementations.
  • jlliagre
    jlliagre almost 13 years
    @ultrasawblade: Not sure why you wrote that as a comment to my own reply instead of a reply by itself or whatever.
  • Villemoes
    Villemoes over 11 years
    @Gilles: It is not a problem to open(2) a directory, so long as one only uses O_RDONLY. This can be useful to get an fd which can be passed to the *at function family (openat, fstatat etc.). But read(2) on that fd will fail with EISDIR.
  • Martin Dorey
    Martin Dorey about 11 years
    ext2.sourceforge.net/2005-ols/paper-html/node3.html explains that the dir_index feature hashes the file name and a filesystem-specific secret. dumpe2fs includes dir_index in its "Filesystem features" line if the feature is enabled.
  • G-Man Says 'Reinstate Monica'
    G-Man Says 'Reinstate Monica' almost 9 years
    @ultrasawblade: Strictly speaking, "old inodes [being] reused" has nothing to do with reuse of old directory slots.  This can happen whenever old directory entries are unlinked; if they are all hard links, this needn't result in inodes being deallocated.
  • flygge
    flygge over 3 years
    The best way to think about "directory order" is really "the most efficient order in which to list the files". This order will differ depending on a wide variety of factors and should not be relied upon without specific guarantees from the filesystem you are using.
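
The comments above point out that a directory only maps names to inode numbers, and that `ls -l` has to call stat(2) on every name to get the rest. A minimal Python sketch of that two-step process follows; the function names `names_and_inodes` and `long_listing` are hypothetical, and the output format is only loosely modelled on `ls -l`.

```python
import os
import stat
import tempfile


def names_and_inodes(path):
    """What the directory itself provides: (name, inode number) pairs."""
    with os.scandir(path) as entries:
        return [(entry.name, entry.inode()) for entry in entries]


def long_listing(path):
    """Look up each name separately to get mode and size, as `ls -l` must."""
    rows = []
    for name, _inode in names_and_inodes(path):
        info = os.lstat(os.path.join(path, name))  # one stat call per name
        rows.append((stat.filemode(info.st_mode), info.st_size, name))
    return rows


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        with open(os.path.join(tmp, "example.txt"), "w") as fh:
            fh.write("hello")
        for name, inode in names_and_inodes(tmp):
            print(inode, name)
        for mode, size, name in long_listing(tmp):
            print(mode, size, name)
```

Both functions return entries in directory order, so anything that needs a stable order should sort the result explicitly.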