Why not use pathless shebangs?

shell shebang


Solution 1

PATH lookup is a feature of the standard C library in userspace, as are environment variables in general. The kernel doesn't see environment variables except when it passes over an environment from the caller of execve to the new process.

The kernel does not perform any interpretation on the path in execve (it's up to wrapper functions such as execvp to perform PATH lookup) or in a shebang (which more or less re-routes the execve call internally). So you need to put the absolute path in the shebang¹. The original shebang implementation was just a few lines of code, and it hasn't been significantly expanded since.
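To make the division of labour concrete, here is a minimal sketch of the kind of PATH search that userspace wrappers such as execvp perform before handing a full path to execve. The function name `find_in_path` is made up for illustration, and real implementations differ in details such as error handling and the default search path.

```python
import os


def find_in_path(name, path=None):
    """Return the first executable called `name` found in the PATH
    directories, or None. Illustrative only; not the libc algorithm."""
    if "/" in name:
        # A name containing a slash is used as-is: no PATH search happens.
        return name if os.access(name, os.X_OK) else None
    if path is None:
        path = os.environ.get("PATH", os.defpath)
    for directory in path.split(os.pathsep):
        # An empty PATH entry conventionally means the current directory.
        candidate = os.path.join(directory or ".", name)
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return candidate
    return None


if __name__ == "__main__":
    # Prints the resolved location of sh on this system, or None.
    print(find_in_path("sh"))
```

Only after a search like this does the caller invoke execve with the resulting path; the kernel never sees the bare name.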

In the first versions of Unix, the shell did the work of invoking itself when it noticed you were invoking a script. Shebang handling was added to the kernel for several reasons (summarizing the rationale given by Dennis Ritchie):

The caller doesn't have to worry whether a program to execute is a shell script or a native binary.
The script itself specifies what interpreter to use, instead of the caller.
The kernel uses the script name in logs.

Pathless shebangs would require either augmenting the kernel so that it reads environment variables and processes PATH, or having the kernel execute a userspace program that performs the PATH lookup. The first method would add a disproportionate amount of complexity to the kernel. The second method is already possible with a #!/usr/bin/env shebang.

¹ If you put a relative path, it's interpreted relative to the current directory of the process (not the directory containing the script), which is hardly useful in a shebang.
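The footnote's behaviour can be observed with a small experiment. The sketch below assumes a Linux system with /bin/sh and a temporary directory that allows execution. The script name `hello` and the function name are invented for the example. It writes a script whose shebang is the pathless `#!sh` and runs it by its full path, once from a directory that contains no `sh` and once from a directory where `sh` is a symlink to /bin/sh.

```python
import errno
import os
import subprocess
import tempfile


def run_with_pathless_shebang(provide_interpreter_in_cwd):
    """Run a script starting with '#!sh' and return its output,
    or the errno name if the exec itself failed."""
    with tempfile.TemporaryDirectory() as workdir:
        script = os.path.join(workdir, "hello")  # hypothetical script name
        with open(script, "w") as f:
            f.write("#!sh\necho hello\n")
        os.chmod(script, 0o755)
        if provide_interpreter_in_cwd:
            # Make 'sh' resolvable relative to the current directory.
            os.symlink("/bin/sh", os.path.join(workdir, "sh"))
        try:
            # The script path contains a slash, so no PATH search is
            # involved; only the kernel's shebang handling is exercised.
            result = subprocess.run(
                [script], cwd=workdir, capture_output=True, text=True
            )
        except OSError as exc:
            return errno.errorcode.get(exc.errno, str(exc.errno))
        return result.stdout.strip()


if __name__ == "__main__":
    print(run_with_pathless_shebang(False))
    print(run_with_pathless_shebang(True))
```

On a typical Linux system the first call should report ENOENT even though sh is on PATH, and the second should succeed only because an `sh` happens to exist in the current directory.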

Solution 2

There's more going on than meets the eye. #! lines are interpreted by the Unix or Linux kernel; #! isn't a feature of shells. This means PATH plays no role when the kernel decides what to execute: to the kernel it is just one more string in the environment it passes along.

The most common way to deal with not knowing where an interpreter is installed, for example to call perl in a portable fashion, is to use #!/usr/bin/env perl. The kernel executes /usr/bin/env, which inherits a PATH environment variable. env finds (in this example) perl in PATH and uses the execve(2) system call to get the kernel to run the perl executable.
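The sequence just described can be tried out with a throwaway interpreter. This is a sketch under stated assumptions: /usr/bin/env and /bin/sh exist, and the names `myinterp` and `job` are hypothetical. The script carries the shebang `#!/usr/bin/env myinterp`, and `myinterp` lives in a directory that is reachable only through PATH.

```python
import os
import subprocess
import tempfile


def run_via_env_shebang():
    """Run a script whose shebang is '#!/usr/bin/env myinterp' with a PATH
    that contains a hypothetical 'myinterp', and return what it prints."""
    with tempfile.TemporaryDirectory() as tmp:
        bindir = os.path.join(tmp, "bin")
        os.mkdir(bindir)

        # The stand-in interpreter: prints the base name of the script
        # it was asked to run.
        interp = os.path.join(bindir, "myinterp")
        with open(interp, "w") as f:
            f.write('#!/bin/sh\necho "myinterp ran: ${1##*/}"\n')
        os.chmod(interp, 0o755)

        # The script only names its interpreter; env resolves it via PATH.
        script = os.path.join(tmp, "job")
        with open(script, "w") as f:
            f.write("#!/usr/bin/env myinterp\n")
        os.chmod(script, 0o755)

        search_path = bindir + os.pathsep + os.environ.get("PATH", os.defpath)
        env = dict(os.environ, PATH=search_path)
        result = subprocess.run([script], env=env, capture_output=True, text=True)
        return result.stdout.strip()


if __name__ == "__main__":
    print(run_via_env_shebang())
```

The kernel only ever sees absolute paths here: first /usr/bin/env from the shebang, then whatever full path env resolved for `myinterp`.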

Solution 3

$ strace sleep 1
execve("/usr/bin/sleep", ["sleep", "1"], [/* 99 vars */]) = 0

The conversion to the full path is done by the shell (more generally: in userspace). The kernel expects a file name or path that it can access directly.

If you want the system to find your executable by looking through the PATH variable, you can rewrite your shebang as #!/usr/bin/env EXEC.

But in this case, too, it's not the kernel that does the search.


Amelio Vazquez-Reina


Updated on September 18, 2022

Comments

Amelio Vazquez-Reina over 1 year

Is it possible to have a shebang that, instead of specifying a path to an interpreter, gives only the name of the interpreter and lets the shell find it through $PATH?

If not, is there a reason why?
Stéphane Chazelas almost 11 years

No, the kernel does not require an absolute path, either in execve or in the shebang, though it makes little sense to have a relative path in a shebang.
yrajabi almost 8 years

"The conversion to the full path is done by the shell" Thanks, although... is the example supposed to illustrate that? As I see it, the shell is just running strace (converted to /usr/bin/strace at some point) with 2 arguments.
TamusJRoyce over 5 years

#!/usr/bin/env -S [shebang] was required for me to run node without knowing its path (using nvm, which places it in a different location than I originally expected).
Tim Ruehsen rockdaboot almost 5 years

-S is only available since coreutils 8.30. See gitlab.com/gnuwget/wget/commit/….