How does a program inherit environment variables?
Solution 1
The environment variables are passed down from the parent process as a third argument to main. The easiest way to discover this is to read the documentation for the system call execve, particularly this bit:
int execve(const char *filename, char *const argv[], char *const envp[]);
Description
execve() executes the program pointed to by filename. [...] argv is an array of argument strings passed to the new program. By convention, the first of these strings should contain the filename associated with the file being executed. envp is an array of strings, conventionally of the form key=value, which are passed as environment to the new program. Both argv and envp must be terminated by a NULL pointer. The argument vector and environment can be accessed by the called program's main function, when it is defined as:
int main(int argc, char *argv[], char *envp[])
The C library copies the envp argument into the environ global variable somewhere in its startup code, before it calls main: for instance, GNU libc does this in _init and musl libc does it in __init_libc. (You may find musl libc's code easier to trace through than GNU libc's.) Conversely, if you start a program using one of the exec wrapper functions that don't take an explicit environment vector, the C library supplies environ as the third argument to execve. Inheritance of environment variables is thus strictly a user-space convention. As far as the kernel is concerned, each program receives two argument vectors, and it doesn't care what's in them.
(Note that three-argument main is an extension to the C language. The C standard only specifies int main(void) and int main(int argc, char **argv), but it permits implementations to define additional forms (C11 Annex J.5.1, Environment Arguments). The three-argument main has been how environment variables work since Unix V7 if not longer, and is documented by Microsoft too; see "What should main() return in C and C++?".)
Solution 2
Under Linux, when a program starts, its arguments and environment variables are stored on the stack. For C programs, the code that executes before main looks at this, locates the argv and envp arrays of pointers, and then calls main with these values (and argc).
When a program calls execve (directly, or through a wrapper such as execvpe) to turn into a new program (often after calling fork), an envp is passed in, along with an argv. The kernel copies the data these point to onto the new program's stack.
When one of the exec functions that takes no envp argument is called, glibc passes the current program's environ as the new program's envp to the execve system call.
Solution 3
The question is really: how does the shell run commands?
The answer is by creating a new process, probably using fork() and execl(), which creates a process with the same environment as the current process.
You can, however, create a new process with a custom environment using execvpe()/execle().
But in any normal situation that isn't necessary, especially since many programs expect some environment variables to be defined (PATH, for example). Normally a child process inherits the environment variables from the environment where it is invoked.
Solution 4
The parent process that calls your program (your shell) defines FOO. The newly created process receives a copy from the parent.
nowox
Updated on June 29, 2022
When I use the function getenv() from the Standard C Library, my program inherits the environment variables from its parent. Example:
    $ export FOO=42
    $ <<< 'int main() {printf("%s\n", getenv("FOO"));}' gcc -w -xc - && ./a.exe
    42
In libc, the environ variable is declared in environ.c. I am expecting it to be empty at execution, but I get 42.
Going a bit further, getenv can be simplified as follows:
    char * getenv (const char *name)
    {
      size_t len = strlen (name);
      char **ep;
      uint16_t name_start;

      name_start = *(const uint16_t *) name;
      len -= 2;
      name += 2;

      for (ep = __environ; *ep != NULL; ++ep)
        {
          uint16_t ep_start = *(uint16_t *) *ep;

          if (name_start == ep_start && !strncmp (*ep + 2, name, len)
              && (*ep)[len + 2] == '=')
            return &(*ep)[len + 3];
        }
      return NULL;
    }
    libc_hidden_def (getenv)
Here I will just get the content of the __environ variable. However, I never initialized it.
So I am confused, because environ is supposed to be NULL unless my main function is not the real entry point of my program. Perhaps gcc is tricking me by adding an _init function that is part of the standard C library.
Where is environ initialized?