Why not use pathless shebangs?
Solution 1
PATH lookup is a feature of the standard C library in userspace, as are environment variables in general. The kernel doesn't interpret environment variables; it only passes the environment from the caller of execve on to the new process.
The kernel does not perform any interpretation on the path in execve (it's up to wrapper functions such as execvp to perform PATH lookup) or in a shebang (which more or less re-routes the execve call internally). So in practice you need to put the absolute path in the shebang¹. The original shebang implementation was just a few lines of code, and it hasn't been significantly expanded since.
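As a rough illustration of the userspace side, here is a minimal sketch of the kind of search a wrapper such as execvp performs before calling execve. It is not the C library's actual code; the function name search_path, the helper demo and the tool name mytool are made up for this example.

```python
import os
import tempfile


def search_path(name, path):
    """Return the first executable match for name in path, or None.

    Mimics the lookup done in userspace: the kernel only ever sees
    the resulting file name.
    """
    if "/" in name:
        # A name containing a slash is used as-is; PATH is not searched.
        return name
    for directory in path.split(os.pathsep):
        # An empty PATH entry traditionally means the current directory.
        candidate = os.path.join(directory or ".", name)
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return candidate
    return None


def demo():
    """Create a throwaway executable and look it up through a custom PATH."""
    with tempfile.TemporaryDirectory() as bindir:
        tool = os.path.join(bindir, "mytool")  # hypothetical tool name
        with open(tool, "w") as f:
            f.write("#!/bin/sh\nexit 0\n")
        os.chmod(tool, 0o755)
        found = search_path("mytool", os.pathsep.join(["/nonexistent", bindir]))
        missing = search_path("mytool", "/nonexistent")
        return found == tool, missing
```

Only after a search like this succeeds does the caller hand a concrete file name to execve; the shebang line gets no such treatment.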
In the first versions of Unix, the shell did the work of invoking itself when it noticed you were invoking a script. Shebang was added in the kernel for several reasons (summarizing the rationale by Dennis Ritchie):
- The caller doesn't have to worry whether a program to execute is a shell script or a native binary.
- The script itself specifies what interpreter to use, instead of the caller.
- The kernel uses the script name in logs.
Pathless shebangs would require either augmenting the kernel to access environment variables and process PATH, or having the kernel execute a userspace program that performs the PATH lookup. The first method requires adding a disproportionate amount of complexity to the kernel. The second method is already possible with a #!/usr/bin/env shebang.
¹ If you put a relative path, it's interpreted relative to the current directory of the process (not the directory containing the script), which is hardly useful in a shebang.
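The footnote can be checked with a small experiment. The sketch below assumes a Linux system where /bin/sh exists and temporary directories allow execution; run_pathless_shebang is a name invented for this example. It runs a script whose shebang is just #!sh from a scratch working directory, once with nothing in that directory and once with a symlink named sh placed there.

```python
import os
import subprocess
import tempfile


def run_pathless_shebang(provide_interpreter_in_cwd):
    """Run a script with the shebang '#!sh' from a scratch working directory."""
    with tempfile.TemporaryDirectory() as scripts, \
            tempfile.TemporaryDirectory() as cwd:
        script = os.path.join(scripts, "hello")
        with open(script, "w") as f:
            f.write("#!sh\necho hello\n")
        os.chmod(script, 0o755)
        if provide_interpreter_in_cwd:
            # Make "sh" resolvable relative to the working directory.
            os.symlink("/bin/sh", os.path.join(cwd, "sh"))
        try:
            # The script is given by absolute path, so no PATH search
            # happens on the Python side either.
            result = subprocess.run(
                [script], cwd=cwd, capture_output=True, text=True
            )
        except FileNotFoundError:
            # execve failed with ENOENT: the kernel did not search PATH.
            return "ENOENT"
        return result.stdout.strip()
```

Under those assumptions, the first call fails with ENOENT even though sh is on PATH, and the second succeeds only because the working directory happens to contain something named sh.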
Solution 2
There's more going on than meets the eye. #! lines are interpreted by the Unix or Linux kernel; #! isn't a feature of shells. To the kernel, PATH is just another environment string, so it plays no part when the kernel decides what to execute.
The most common way to deal with not knowing which executable to run, or to call perl in a portable fashion, is to use #!/usr/bin/env perl. The kernel executes /usr/bin/env, which inherits a PATH environment variable. env finds (in this example) perl in PATH and uses the execve(2) system call to get the kernel to run the perl executable.
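To see that the lookup really depends on the PATH that env inherits, here is a hedged sketch, assuming /usr/bin/env and /bin/sh exist and temporary directories allow execution. The interpreter name myinterp and the function run_env_shebang are hypothetical. The script uses #!/usr/bin/env myinterp and is run twice: once with the directory holding myinterp on PATH, once without it.

```python
import os
import subprocess
import tempfile


def run_env_shebang(include_bindir_in_path):
    """Run a '#!/usr/bin/env myinterp' script with a controlled PATH."""
    with tempfile.TemporaryDirectory() as bindir, \
            tempfile.TemporaryDirectory() as scripts:
        # A stand-in interpreter that just reports which script it was given.
        interp = os.path.join(bindir, "myinterp")
        with open(interp, "w") as f:
            f.write('#!/bin/sh\necho "myinterp ran $1"\n')
        os.chmod(interp, 0o755)

        script = os.path.join(scripts, "job")
        with open(script, "w") as f:
            f.write("#!/usr/bin/env myinterp\nbody read only by myinterp\n")
        os.chmod(script, 0o755)

        env = dict(os.environ)
        base_path = "/usr/bin:/bin"
        if include_bindir_in_path:
            env["PATH"] = bindir + os.pathsep + base_path
        else:
            env["PATH"] = base_path

        # The kernel starts /usr/bin/env; env then searches its PATH.
        result = subprocess.run(
            [script], env=env, capture_output=True, text=True
        )
        return result.returncode, result.stdout.strip()
```

With the directory on PATH, env locates myinterp and the script path arrives as its first argument. Without it, env itself reports the failure (exit status 127 for a command that is not found), which shows that the search happens in env rather than in the kernel.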
Solution 3
$ strace sleep 1
execve("/usr/bin/sleep", ["sleep", "1"], [/* 99 vars */]) = 0
The conversion to the full path is done by the shell (more generally: in userspace; in this trace it is strace itself that resolves sleep). The kernel expects a file name / path it can access directly.
If you want the system to find your executable by looking through the PATH variable, you can rewrite your shebang as #!/usr/bin/env EXEC.
But in this case, too, it's not the kernel that does the search.
Amelio Vazquez-Reina
Updated on September 18, 2022
Comments
-
Amelio Vazquez-Reina over 1 year
Is it possible to have a shebang that, instead of specifying a path to an interpreter, gives only the name of the interpreter and lets the shell find it through $PATH?
If not, is there a reason why?
-
Stéphane Chazelas almost 11 years
No, the kernel does not require an absolute path in execve nor in the shebang, though it makes little sense to have a relative path in a shebang.
-
yrajabi almost 8 years
"The conversion to the full path is done by the shell": Thanks, although... is the example supposed to illustrate that? As I see it, the shell is just running strace (converted to /usr/bin/strace at some point) with 2 arguments.
-
TamusJRoyce over 5 years
#!/usr/bin/env -S [shebang] was required for me to run node without knowing its path (I use nvm, which places it in a different location than I originally expected).
-
Tim Ruehsen rockdaboot almost 5 years
-S is only available since coreutils 8.30. See gitlab.com/gnuwget/wget/commit/….