Which process is `/proc/self/` for?

Solution 1

This has nothing to do with foreground and background processes; it only has to do with the currently running process. When the kernel has to answer the question “What does /proc/self point to?”, it simply picks the currently-scheduled pid, i.e. the currently running process (on the current logical CPU). The effect is that /proc/self always points to the asking program's pid; if you run

ls -l /proc/self

you'll see ls's pid; if you write code which uses /proc/self, that code will see its own pid; and so on.
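To check this from your own code, here is a minimal sketch in Python. It assumes a Linux system with procfs mounted at /proc; the function name proc_self_target is made up for this illustration.

```python
import os


def proc_self_target() -> str:
    """Return the target of the /proc/self symlink as seen by the caller."""
    # On Linux, /proc/self is a symlink whose target is the PID of
    # whichever process resolves it. Here, that is this Python process.
    return os.readlink("/proc/self")


if __name__ == "__main__":
    print("readlink(/proc/self):", proc_self_target())
    print("os.getpid():         ", os.getpid())
```

Both lines should print the same number, because the process asking about /proc/self is the Python interpreter itself.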

Solution 2

The one that accesses the symlink (calls readlink() on it, or open() on a path through it). It would be running on the CPU at the time, but that's not relevant. A multiprocessor system could have several processes on the CPU simultaneously.

Foreground and background processes are mostly a shell construct, and there's no unique foreground process either, since all shell sessions on the system will have one.

Solution 3

The wording could have been better but then again any wording you try to compose to express the idea of self reference is going to be confusing. The name of the directory is more descriptive in my opinion.

Basically, /proc/self/ represents the process that's reading /proc/self/. So if you try to open /proc/self/ from a C program then it represents that program. If you try to do it from the shell then it is the shell etc.

But what if you have a quad core CPU capable of running 4 processes simultaneously, for real, not multitasking?

Then each process will see a different /proc/self/ for real without being able to see each other's /proc/self/.
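As a rough illustration, the sketch below starts several child processes at once and has each one report its own PID together with what it gets when it resolves /proc/self. It assumes Linux with procfs mounted at /proc; CHILD_CODE and ask_children are names invented for this example, and whether the children really run on separate cores at the same instant is up to the scheduler.

```python
import subprocess
import sys

# Each child prints its own PID and the target of /proc/self as it sees it.
CHILD_CODE = "import os; print(os.getpid(), os.readlink('/proc/self'))"


def ask_children(count: int = 4) -> list[tuple[str, str]]:
    """Start `count` children before waiting on any; collect (pid, target)."""
    procs = [
        subprocess.Popen(
            [sys.executable, "-c", CHILD_CODE],
            stdout=subprocess.PIPE,
            text=True,
        )
        for _ in range(count)
    ]
    results = []
    for proc in procs:
        out, _ = proc.communicate()
        pid, target = out.split()
        results.append((pid, target))
    return results


if __name__ == "__main__":
    for pid, target in ask_children():
        print(f"child {pid} resolved /proc/self to {target}")
```

Every child should report a /proc/self target equal to its own PID, and no two children should report the same one.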

How does this work?

Well, /proc/self/ is not really a folder. It is a device driver that happens to expose itself as a folder if you try to access it. This is because it implements the API necessary for folders. The /proc/self/ directory is not the only thing that does this. Consider shared folders mounted from remote servers or mounting USB thumbdrives or dropbox. They all work by implementing the same set of APIs that make them behave like folders.

When a process tries to access /proc/self/ the device driver will generate its contents dynamically by reading data from that process. So the files in /proc/self/ do not really exist. It's kind of like a mirror that reflects back on the process that tries to look at it.

Is it really a device driver? You sound like you're oversimplifying things!

Yes, it really is. If you want to be pedantic it's a kernel module. But if you check out Usenet postings on the various Linux developers channels, most kernel developers use "device driver" and "kernel module" interchangeably. I used to write device drivers, err... kernel modules, for Linux. If you want to write your own interface in /proc/, say for example you want a /proc/unix.stackexchange/ filesystem that returns posts from this website, you can read about how to do it in the venerable "Linux Device Drivers" book published by O'Reilly. It's even available as softcopy online.

Solution 4

It's whichever process happens to be accessing /proc/self or the files/folders therein.

Try cat /proc/self/cmdline. You will get, surprise surprise, cat /proc/self/cmdline (actually, there will be a null character instead of the space between the t and the /, and another one at the end), because it is the cat process that accesses this pseudofile.

When you do an ls -l /proc/self, you will see the pid of the ls process itself. Or how about ls -l /proc/self/exe; it will point to the ls executable.

Or try this, for a change:

$ cp /proc/self/cmdline /tmp/cmd
$ hexdump -C /tmp/cmd
00000000  63 70 00 2f 70 72 6f 63  2f 73 65 6c 66 2f 63 6d  |cp./proc/self/cm|
00000010  64 6c 69 6e 65 00 2f 74  6d 70 2f 63 6d 64 00     |dline./tmp/cmd.|
0000001f

or even

$ hexdump -C /proc/self/cmdline 
00000000  68 65 78 64 75 6d 70 00  2d 43 00 2f 70 72 6f 63  |hexdump.-C./proc|
00000010  2f 73 65 6c 66 2f 63 6d  64 6c 69 6e 65 00        |/self/cmdline.|
0000001e

As I said, it is whichever process happens to be accessing /proc/self or the files/folders therein.
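The same experiment can be scripted. The following is a small sketch in Python, assuming Linux with procfs at /proc and a cat binary on the PATH; the function names own_cmdline and cat_cmdline are invented for this example.

```python
import subprocess


def own_cmdline() -> list[str]:
    """Read the argv of the calling process from /proc/self/cmdline."""
    with open("/proc/self/cmdline", "rb") as f:
        raw = f.read()
    # Arguments are NUL-separated; normally there is a trailing NUL too,
    # which leaves one empty element at the end after splitting.
    parts = raw.split(b"\0")
    if parts and parts[-1] == b"":
        parts.pop()
    return [part.decode(errors="replace") for part in parts]


def cat_cmdline() -> bytes:
    """Let an external cat read /proc/self/cmdline and return the raw bytes."""
    # The reader is now the cat process, so the bytes describe cat,
    # not this Python process.
    done = subprocess.run(
        ["cat", "/proc/self/cmdline"], capture_output=True, check=True
    )
    return done.stdout


if __name__ == "__main__":
    print("this process:", own_cmdline())
    print("via cat:     ", cat_cmdline())
```

The first call describes the Python interpreter, and the second describes cat, even though both read a file with the same name.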

Solution 5

/proc/self is syntactic sugar. It's a shortcut for concatenating /proc/ and the result of the getpid() syscall (in bash, the shell's own PID is available as the special parameter $$). It can get confusing, though, in the case of shell scripting, as many of the statements invoke other processes, complete with their own PIDs... PIDs that refer to, more often than not, dead processes. Consider:

root@vps01:~# ls -l /proc/self/fd
total 0
lrwx------ 1 root root 64 Jan  1 01:51 0 -> /dev/pts/0
lrwx------ 1 root root 64 Jan  1 01:51 1 -> /dev/pts/0
lrwx------ 1 root root 64 Jan  1 01:51 2 -> /dev/pts/0
lr-x------ 1 root root 64 Jan  1 01:51 3 -> /proc/26562/fd
root@vps01:~# echo $$
593

'/bin/ls' will evaluate the path to the directory, resolving it as /proc/26562/fd, since 26562 is the PID of the process (the newly created /bin/ls process) that reads the contents of the directory. But by the time the next process in the pipeline runs, in the case of shell scripting, or by the time the prompt comes back, in the case of an interactive shell, that path no longer exists and the output refers to a nonexistent process.
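Here is a hedged sketch of the same effect from a script, assuming Linux with procfs at /proc and a readlink binary on the PATH. The helper name child_self_pid is made up for this example.

```python
import os
import subprocess


def child_self_pid() -> int:
    """Run `readlink /proc/self` as an external command; return its answer."""
    done = subprocess.run(
        ["readlink", "/proc/self"],
        capture_output=True,
        text=True,
        check=True,
    )
    # The PID printed belongs to the short-lived readlink process,
    # not to the process that launched it.
    return int(done.stdout.strip())


if __name__ == "__main__":
    pid = child_self_pid()
    print("this process:         ", os.getpid())
    print("PID seen by readlink: ", pid)
    # By now the child has exited and been reaped, so this is normally
    # False (barring the unlikely case that the PID was already reused).
    print("still exists?         ", os.path.exists(f"/proc/{pid}"))
```

The PID that comes back is a valid description of a process that is already gone by the time you can use it, which is exactly the trap described above.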

This only applies to external commands, however (ones that are actual executable program files, as opposed to being built into the shell itself). So, you'll get different results if you, say, use filename globbing to obtain a list of the contents of the directory, rather than passing the path name to the external process /bin/ls:

root@vps01:~# ls /proc/self/fd
0  1  2  3
root@vps01:~/specs# echo /proc/self/fd/*
/proc/self/fd/0 /proc/self/fd/1 /proc/self/fd/2 /proc/self/fd/255 /proc/self/fd/3

In the first line, the shell spawned a new process, '/bin/ls', via the fork() and exec() syscalls, passing "/proc/self/fd" as argv[1]. '/bin/ls', in turn, opened the directory /proc/self/fd and read, then printed, its contents as it iterated over them.

The second line, however, uses glob() behind the scenes to expand the list of filenames; these are passed as an array of strings to echo. (Usually implemented as an internal command, but there's often also a /bin/echo binary... but that part's actually irrelevant, since echo is only dealing with strings it never feeds to any syscall related to path names.)

Now, consider the following case:

root@vps01:~# cd /proc/self/fd
root@vps01:~# ls
0  1  2  255

Here, the shell, the parent process of /bin/ls, has made a subdirectory of /proc/self its current directory. The self symlink was resolved once, when the shell called chdir(), so the shell's working directory is really /proc/593/fd (593 being the shell's PID, as shown by echo $$ earlier). /bin/ls inherits that working directory from its parent, and relative pathnames are evaluated from there without going through /proc/self again. So this time, /bin/ls behaves similarly to echo /proc/$$/fd/*.
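The inheritance of the working directory can be observed with a short sketch like the one below. It assumes Linux with procfs at /proc; the function name cwd_seen_by_child and the CHILD_CODE snippet are hypothetical names for this illustration.

```python
import os
import subprocess
import sys

# The child reports its working directory and its own PID.
CHILD_CODE = "import os; print(os.getcwd(), os.getpid())"


def cwd_seen_by_child() -> tuple[str, str, int]:
    """chdir into /proc/self/fd, then ask a child where it is running."""
    saved = os.getcwd()
    # "self" is resolved here, once, using the PID of this process.
    os.chdir("/proc/self/fd")
    try:
        resolved = os.getcwd()
        done = subprocess.run(
            [sys.executable, "-c", CHILD_CODE],
            capture_output=True,
            text=True,
            check=True,
        )
        child_cwd, child_pid = done.stdout.split()
        return resolved, child_cwd, int(child_pid)
    finally:
        # Restore the original working directory.
        os.chdir(saved)


if __name__ == "__main__":
    resolved, child_cwd, child_pid = cwd_seen_by_child()
    print("parent PID:        ", os.getpid())
    print("parent cwd:        ", resolved)
    print("child PID:         ", child_pid)
    print("child's cwd:       ", child_cwd)
```

The child has a different PID, yet its working directory names the parent's PID, because it inherited an already-resolved directory rather than the /proc/self path.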

Author by

Tim

Updated on September 18, 2022

Comments

  • Tim
    Tim almost 2 years

    https://www.centos.org/docs/5/html/5.2/Deployment_Guide/s3-proc-self.html says

    The /proc/self/ directory is a link to the currently running process.

    There are always multiple processes running concurrently, so which process is "the currently running process"?

    Does "the currently running process" have anything to do with which process is currently running on the CPU, considering context switching?

    Does "the currently running process" have nothing to do with foreground and background processes?

    • Charles Duffy
      Charles Duffy over 7 years
      The process that evaluates /proc/self, of course.
    • Jeffrey Bosboom
      Jeffrey Bosboom over 7 years
      Which person do I and me refer to?
  • clerksx
    clerksx over 7 years
    /proc/self is not a device driver, but is instead part of a kernel-exposed filesystem called procfs.
  • slebetman
    slebetman over 7 years
    @ChrisDown: Yes but it's implemented as a kernel module - which is linux's version of device driver - there's even an example implementation of a /proc based driver in the venerable book "Linux Device Drivers". I should know, I implemented one in college. I probably could have used the term "kernel module" instead but "device driver" is what most people are familiar with and I don't want to give the misleading impression that there's a significant difference between "kernel module" and "device driver" apart from terminology.
  • hobbs
    hobbs over 7 years
    @slebetman well, procfs isn't a module per se, it can only be built in, never built as a module. If you want to split hairs, the hair to split is that it's a filesystem driver, not a device driver
  • R.. GitHub STOP HELPING ICE
    R.. GitHub STOP HELPING ICE over 7 years
    This is "accurate" in a sense, but not meaningful to someone who doesn't understand the kernel's concept of "current". A better answer would be that it's the process making the system call with /proc/self as part of the pathname in one of its arguments.
  • Stephen Kitt
    Stephen Kitt over 7 years
    @R.. that's what ilkkachu's answer highlights, feel free to upvote that one — I did.
  • Darkov
    Darkov over 4 years
    If self means the current process that is scheduled on the logical CPU, why aren't there multiple self entries on a multi-core system?
  • Stephen Kitt
    Stephen Kitt over 4 years
    @Darkov in the same way that there’s only one word for “self” in English even though you and I have distinct selves which exist simultaneously. The kernel always knows which process is asking about /proc/self.
  • Admin
    Admin almost 2 years
    Yeah, I was surprised that ls runs in a separate process. Sounds expensive. Is it bash's philosophy to run every command in a separate process?
  • Admin
    Admin almost 2 years
    @Ivan_Bereziuk it’s not bash’s philosophy, it’s the philosophy of pretty much all operating systems with multiple processes (including single-tasking OSs like DOS): a process which wants to run another program, and regain control once the second program has finished, needs to run it in a separate process. Process creation is cheap on Linux.