understanding and decoding the file mode value from stat function output

18,425

Solution 1

The mode from your question corresponds to a regular file with 644 permissions (read-write for the owner and read-only for everyone else), but don’t take my word for it.

$ touch foo
$ chmod 644 foo
$ perl -le 'print +(stat "foo")[2]'
33188

The value of $mode can be viewed as a decimal integer, but doing so is not particularly enlightening. Seeing the octal representation gives something a bit more familiar.

$ perl -e 'printf "%o\n", (stat "foo")[2]'
100644

Bitwise AND with 07777 gives the last twelve bits of a number’s binary representation. With a Unix mode, this operation gives the permission or mode bits and discards any type information.

$ perl -e 'printf "%d\n", (stat "foo")[2] & 07777'  # decimal, not useful
420
$ perl -e 'printf "%o\n", (stat "foo")[2] & 07777'  # octal, eureka!
644

A nicer way to do this is below. Read on for all the details.


Mode Bits

The third element returned from stat (which corresponds to st_mode in struct stat) is a bit field where the different bit positions are binary flags.

For example, one bit in st_mode POSIX names S_IWUSR. A file or directory whose mode has this bit set is writable by its owner. A related bit is S_IROTH that when set means other users (i.e., neither the owner nor in the group) may read that particular file or directory.

The perlfunc documentation for stat gives the names of commonly available mode bits. We can examine their values.

#! /usr/bin/env perl

use strict;
use warnings;
use Fcntl ':mode';

my $perldoc_f_stat = q(
  # Permissions: read, write, execute, for user, group, others.
  S_IRWXU S_IRUSR S_IWUSR S_IXUSR
  S_IRWXG S_IRGRP S_IWGRP S_IXGRP
  S_IRWXO S_IROTH S_IWOTH S_IXOTH

  # Setuid/Setgid/Stickiness/SaveText.
  # Note that the exact meaning of these is system dependent.
  S_ISUID S_ISGID S_ISVTX S_ISTXT

  # File types.  Not necessarily all are available on your system.
  S_IFREG S_IFDIR S_IFLNK S_IFBLK S_IFCHR S_IFIFO S_IFSOCK S_IFWHT S_ENFMT
);

my %mask;
foreach my $sym ($perldoc_f_stat =~ /\b(S_I\w+)\b/g) {
  my $val = eval { no strict 'refs'; &$sym() };
  if (defined $val) {
    $mask{$sym} = $val;
  }
  else {
    printf "%-10s - undefined\n", $sym;
  }
}

my @descending = sort { $mask{$b} <=> $mask{$a} } keys %mask;
printf "%-10s - %9o\n", $_, $mask{$_} for @descending;

On Red Hat Enterprise Linux and other operating systems in the System V family, the output of the above program will be

S_ISTXT    - undefined
S_IFWHT    - undefined
S_IFSOCK   -    140000
S_IFLNK    -    120000
S_IFREG    -    100000
S_IFBLK    -     60000
S_IFDIR    -     40000
S_IFCHR    -     20000
S_IFIFO    -     10000
S_ISUID    -      4000
S_ISGID    -      2000
S_ISVTX    -      1000
S_IRWXU    -       700
S_IRUSR    -       400
S_IWUSR    -       200
S_IXUSR    -       100
S_IRWXG    -        70
S_IRGRP    -        40
S_IWGRP    -        20
S_IXGRP    -        10
S_IRWXO    -         7
S_IROTH    -         4
S_IWOTH    -         2
S_IXOTH    -         1

Bit twiddling

The numbers above are octal (base 8), so any given digit must be 0-7 and has place value 8n, where n is the zero-based number of places to the left of the radix point. To see how they map to bits, octal has the convenient property that each digit corresponds to three bits. Four, two, and 1 are all exact powers of two, so in binary, they are 100, 10, and 1 respectively. Seven (= 4 + 2 + 1) in binary is 111, so then 708 is 1110002. The latter example shows how converting back and forth is straightforward.

With a bit field, you don’t care exactly what the value of a bit in that position is but whether it is zero or non-zero, so

if ($mode & $mask) {

tests whether any bit in $mode corresponding to $mask is set. For a simple example, given the 4-bit integer 1011 and a mask 0100, their bitwise AND is

  1011
& 0100
------
  0000

So the bit in that position is clear—as opposed to a mask of, say, 0010 or 1100.

Clearing the most significant bit of 1011 looks like

    1011      1011
& ~(1000) = & 0111
            ------
              0011

Recall that ~ in Perl is bitwise complement.

For completeness, set a bit with bitwise OR as in

$bits |= $mask;

Octal and file permissions

An octal digit’s direct mapping to three bits is convenient for Unix permissions because they come in groups of three. For example, the permissions for the program that produced the output above are

-rwxr-xr-x 1 gbacon users 1096 Feb 24 20:34 modebits

That is, the owner may read, write, and execute; but everyone else may read and execute. In octal, this is 755—a compact shorthand. In terms of the table above, the set bits in the mode are

  • S_IRUSR
  • S_IWUSR
  • S_IXUSR
  • S_IRGRP
  • S_IXGRP
  • S_IROTH
  • S_IXOTH

We can decompose the mode from your question by adding a few lines to the program above.

my $mode = 33188;
print "\nBits set in mode $mode:\n";
foreach my $sym (@descending) {
    if (($mode & $mask{$sym}) == $mask{$sym}) {
        print "  - $sym\n";
        $mode &= ~$mask{$sym};
    }
}

printf "extra bits: %o\n", $mode if $mode;

The mode test has to be more careful because some of the masks are shorthand for multiple bits. Testing that we get the exact mask back avoids false positives when some of the bits are set but not all.

The loop also clears the bits from all detected hits so at the end we can check that we have accounted for each bit. The output is

Bits set in mode 33188:
  - S_IFREG
  - S_IRUSR
  - S_IWUSR
  - S_IRGRP
  - S_IROTH

No extra warning, so we got everything.

That magic 07777

Converting 77778 to binary gives 0b111_111_111_111. Recall that 78 is 1112, and four 7s correspond to 4×3 ones. This mask is useful for selecting the set bits in the last twelve. Looking back at the bit masks we generated earlier

S_ISUID    -      4000
S_ISGID    -      2000
S_ISVTX    -      1000
S_IRWXU    -       700
S_IRWXG    -        70
S_IRWXO    -         7

we see that the last 9 bits are the permissions for user, group, and other. The three bits preceding those are the setuid, setgroupid, and what is sometimes called the sticky bit. For example, the full mode of sendmail on my system is -rwxr-sr-x or 3428510. The bitwise AND works out to be

  (dec)      (oct)                (bin)
  34285     102755     1000010111101101
&  4095 = &   7777 = &     111111111111
-------   --------   ------------------
   1517 =     2755 =        10111101101

The high bit in the mode that gets discarded is S_IFREG, the indicator that it is a regular file. Notice how much clearer the mode expressed in octal is when compared with the same information in decimal or binary.

The stat documentation mentions a helpful function.

… and the S_IF* functions are

S_IMODE($mode)
the part of $mode containing the permission bits and the setuid/setgid/sticky bits

In ext/Fcntl/Fcntl.xs, we find its implementation and a familiar constant on the last line.

void
S_IMODE(...)
    PREINIT:
        dXSTARG;
        SV *mode;
    PPCODE:
        if (items > 0)
            mode = ST(0);
        else {
            mode = &PL_sv_undef;
            EXTEND(SP, 1);
        }
        PUSHu(SvUV(mode) & 07777);

To avoid the bad practice of magic numbers in source code, write

my $permissions = S_IMODE $mode;

Using S_IMODE and other functions available from the Fcntl module also hides the low-level bit twiddling and focuses on the domain-level information the program wants. The documentation continues

S_IFMT($mode)
the part of $mode containing the file type which can be bit-anded with (for example) S_IFREG or with the following functions

# The operators -f, -d, -l, -b, -c, -p, and -S.
S_ISREG($mode) S_ISDIR($mode) S_ISLNK($mode)
S_ISBLK($mode) S_ISCHR($mode) S_ISFIFO($mode) S_ISSOCK($mode)

# No direct -X operator counterpart, but for the first one
# the -g operator is often equivalent.  The ENFMT stands for
# record flocking enforcement, a platform-dependent feature.
S_ISENFMT($mode) S_ISWHT($mode)

Using these constants and functions will make your programs clearer by more directly expressing your intent.

Solution 2

It is explained in perldoc -f stat, which is where I assume you found this example:

Because the mode contains both the file type and its
permissions, you should mask off the file type portion and
(s)printf using a "%o" if you want to see the real permissions.

The output of printf "%04o", 420 is 0644 which is the permissions on your file. 420 is just the decimal representation of the octal number 0644.

If you try and print the numbers in binary form, it is easier to see:

perl -lwe 'printf "%016b\n", 33188'
1000000110100100
perl -lwe 'printf "%016b\n", 33188 & 07777'
0000000110100100

As you'll notice, bitwise and removes the leftmost bit in the number above, which presumably represents file type, leaving you with only the file permissions. This number 07777 is the binary number:

perl -lwe 'printf "%016b\n", 07777'
0000111111111111

Which acts as a "mask" in the bitwise and. Since 1 & 1 = 1, and 0 & 1 = 0, it means that any bit that is not matched by a 1 in 07777 is set to 0.

Share:
18,425

Related videos on Youtube

chidori
Author by

chidori

Crazy about programming :D

Updated on September 18, 2022

Comments

  • chidori
    chidori over 1 year

    I have been trying to understand what exactly is happening in the below mentioned code. But i am not able to understand it.

    $mode = (stat($filename))[2];
    printf "Permissions are %04o\n", $mode & 07777;
    

    Lets say my $mode value is 33188

    $mode & 07777 yields a value = 420

    • is the $mode value a decimal number ?

    • why we are choosing 07777 and why we are doing a bitwise and operation. I am not able to underand the logic in here.

    • hobbs
      hobbs about 11 years
      It's not a "decimal number"; it's a number. If you print it in decimal, you print it in decimal, and if you print it in octal, you print it in octal.
  • chidori
    chidori about 11 years
    Thanks. now i understand it better but i still have some doubts on this. ok, so 0 in 07777 is used for masking the file type but what is with 7777 over there? why we are choosing 07777 and not any other number like 03333 and so. I know it has something to do with permissions but its kind of vague. Please help me understand this
  • TLP
    TLP about 11 years
    @chidori No, the 0 is there to tell perl that that is an octal number. Numbers are the same, they just have different representations. 07777 is the same number as 4095, which is the same number as 0b111111111111. If you have 14 oranges, you also have 0b1110 oranges, or 016, or even 0xE.
  • chidori
    chidori about 11 years
    Thanks a lot for brief explanation. But i just don't get this. What if i use some other number say..06666 to perform my bitwise AND operation? Why cannot i chose some random number and try to perform bitwise AND to our $mode output
  • Greg Bacon
    Greg Bacon about 11 years
    @chidori Using a different mask with the bitwise AND allows you to be selective about which bits you are querying. Get all that are set with S_IMODE($mode). Using $mode & 06666 “masks off” the sticky bit and all execute permission bits for owner, group, and other. To get the group’s permissions only, write $mode & 0070. Even better, use symbolic masks to be clear about what you are selecting, e.g., if ($mode & (S_IWUSR | S_IWGRP | S_IWOTH)) { ... } to test whether any write permission bits are on.
  • chidori
    chidori about 11 years
    After reading your detailed explanation again and again , i am now able to understand why we use 07777 . Thanks a lot :)
  • Greg Bacon
    Greg Bacon about 11 years
    @chidori Great! I'm glad you had a breakthrough.