Unix/Linux Loader Process


Solution 1

A user generally encounters three types of ELF files: .o files, regular executables, and shared libraries. While these files serve different purposes, their internal structures are quite similar.
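To make the three file types concrete, here is a minimal sketch that classifies a file from the `e_type` field of its ELF header. The function name `elf_kind` and the label strings are my own choices for illustration; the numeric values (1, 2, 3) and the field offset come from the ELF specification.

```python
import struct

# e_type values defined by the ELF specification.
ELF_TYPES = {1: "relocatable (.o)", 2: "executable", 3: "shared object"}

def elf_kind(header: bytes) -> str:
    """Classify an ELF file from the first 18 bytes of its header."""
    if len(header) < 18 or header[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    # e_ident[5] is EI_DATA: 1 means little-endian, 2 means big-endian.
    order = "<" if header[5] == 1 else ">"
    # e_type is a 16-bit field that follows the 16-byte e_ident array.
    (e_type,) = struct.unpack_from(order + "H", header, 16)
    return ELF_TYPES.get(e_type, f"other ({e_type})")
```

To try it on a real file, pass the first bytes of the file, for example `elf_kind(open(path, "rb").read(18))`. Note that position-independent executables also carry the shared-object type, so they would be reported as "shared object" by this sketch.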

One universal concept among all different ELF file types (and also a.out and many other executable file formats) is the notion of a section. A section is a collection of information of a similar type. Each section represents a portion of the file. For example, executable code is always placed in a section known as .text; all data variables initialized by the user are placed in a section known as .data; and uninitialized data is placed in a section known as .bss.

Actually, one can devise an executable file format where everything is jumbled together (like MS-DOS). But dividing executables into sections has important advantages. For example, once you have loaded the executable portions of an executable into memory, these memory locations need not change. On modern machine architectures, the memory manager can mark portions of memory read-only, such that any attempt to modify a read-only memory location results in the program dying and dumping core. Thus, instead of merely saying that we do not expect a particular memory location to change, we can specify that any attempt to modify a read-only memory location is a fatal error indicating a bug in the application. That being said, you typically cannot set the read-only status of each byte individually; instead you set the protections of regions of memory known as pages. On the i386 architecture the page size is 4096 bytes, so you could, for example, mark addresses 0-4095 as read-only and addresses 4096 and up as writable.
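The page arithmetic can be sketched in a few lines. The helper names `page_index` and `page_span` are hypothetical, and `PAGE_SIZE` is fixed at the i386 value quoted above; other systems may use a different size.

```python
PAGE_SIZE = 4096  # i386 page size quoted above; an assumption on other systems

def page_index(addr: int) -> int:
    """Return the number of the page that contains addr."""
    return addr // PAGE_SIZE

def page_span(start: int, length: int):
    """Return the page-aligned [begin, end) range covering start..start+length."""
    begin = start & ~(PAGE_SIZE - 1)                        # round down to a page boundary
    end = (start + length + PAGE_SIZE - 1) & ~(PAGE_SIZE - 1)  # round up to a page boundary
    return begin, end
```

For instance, addresses 0 and 4095 both fall in page 0 while 4096 falls in page 1, so a protection change that should cover a 200-byte object starting at address 4000 has to cover both pages.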

Given that we want all executable portions of an executable in read-only memory and all modifiable locations of memory (such as variables) in writable memory, it turns out to be most efficient to group all of the executable portions of an executable into one section of memory (the .text section), and all modifiable data areas together into another area of memory (henceforth known as the .data section).

A further distinction is made between data variables the user has initialized and data variables the user has not initialized. If the user has not specified the initial value of a variable, there is no sense in wasting space in the executable file to store the value. Thus, initialized variables are grouped into the .data section, and uninitialized variables are grouped into the .bss section, which is special because it doesn't take up space in the file; it only records how much space is needed for uninitialized variables.
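A small sketch of the bookkeeping may help. ELF marks sections like .bss with the section type `SHT_NOBITS`, meaning they have a size in memory but no bytes in the file. The `Section` tuple, the two helper functions, and the sizes below are made up for illustration; only the two type constants come from the ELF specification.

```python
from collections import namedtuple

SHT_PROGBITS = 1  # section contents are stored in the file
SHT_NOBITS = 8    # section occupies memory but no file space (.bss)

Section = namedtuple("Section", "name sh_type size")

def file_bytes(sections):
    """Bytes the sections occupy in the executable file."""
    return sum(s.size for s in sections if s.sh_type != SHT_NOBITS)

def memory_bytes(sections):
    """Bytes the sections occupy once loaded."""
    return sum(s.size for s in sections)

# Hypothetical sizes, for illustration only.
example = [
    Section(".text", SHT_PROGBITS, 1200),
    Section(".data", SHT_PROGBITS, 64),
    Section(".bss", SHT_NOBITS, 4096),
]
```

With these made-up numbers, the 4096 bytes of .bss count toward the memory total but not toward the file total.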

When you ask the kernel to load and run an executable, it starts by looking at the image header for clues about how to load the image. It locates the .text section within the executable, loads it into the appropriate portions of memory, and marks these pages as read-only. It then locates the .data section in the executable and loads it into the user's address space, this time in read-write memory. Finally, it finds the location and size of the .bss section from the image header, and adds the appropriate pages of memory to the user's address space. Even though the user has not specified the initial values of variables placed in .bss, by convention the kernel will initialize all of this memory to zero.
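The zero-filling step can be illustrated with a toy model. This is only a sketch: a real kernel maps pages with mmap-style operations rather than copying into a buffer, and the function name `load_segment` is hypothetical. The parameter names are borrowed from the ELF program header fields `p_offset`, `p_filesz`, and `p_memsz`.

```python
def load_segment(image: bytes, offset: int, filesz: int, memsz: int) -> bytearray:
    """Copy filesz bytes from the file image and zero-fill up to memsz bytes."""
    if memsz < filesz:
        raise ValueError("memsz must be at least filesz")
    if offset + filesz > len(image):
        raise ValueError("segment extends past the end of the image")
    memory = bytearray(memsz)                        # bytearray(n) starts out all zero
    memory[:filesz] = image[offset:offset + filesz]  # initialized data comes from the file
    return memory                                    # the tail beyond filesz stays zero
```

The gap between the two sizes is where uninitialized variables live: nothing is read from the file for it, yet the program sees zeros there.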

So you see, it is actually the kernel that loads the executable into memory. As a result, the text section ends up in read-only memory and the data section in read-write memory.
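One way to inspect the protections an executable asks for is to read its program header table, where each `PT_LOAD` entry carries read/write/execute flags. The sketch below assumes a little-endian ELF64 image already read into memory; the function name `loadable_segments` and the tuple layout it returns are my own choices.

```python
import struct

PT_LOAD = 1                  # program header type for a loadable segment
PF_X, PF_W, PF_R = 1, 2, 4   # segment permission bits

def loadable_segments(data: bytes) -> list:
    """Return (vaddr, filesz, memsz, perms) for each PT_LOAD entry."""
    if data[:4] != b"\x7fELF" or data[4] != 2 or data[5] != 1:
        raise ValueError("expected a little-endian ELF64 image")
    (e_phoff,) = struct.unpack_from("<Q", data, 32)          # program header table offset
    e_phentsize, e_phnum = struct.unpack_from("<HH", data, 54)
    segments = []
    for i in range(e_phnum):
        fields = struct.unpack_from("<IIQQQQQQ", data, e_phoff + i * e_phentsize)
        p_type, p_flags, _off, p_vaddr, _paddr, p_filesz, p_memsz, _align = fields
        if p_type == PT_LOAD:
            perms = "".join(
                c if p_flags & bit else "-"
                for c, bit in (("r", PF_R), ("w", PF_W), ("x", PF_X))
            )
            segments.append((p_vaddr, p_filesz, p_memsz, perms))
    return segments
```

On a typical binary you would expect to see at least one segment without the write bit (code) and one with it (data), though the exact layout depends on the toolchain.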

Solution 2

It depends on the OS.

For example, on GNU Hurd the executable file is loaded by the exec server.

On a more typical monolithic OS, this happens in two steps:

  1. the kernel maps the executable and the dynamic linker in memory;

  2. the dynamic linker mmaps the shared objects in memory.
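The executable tells the kernel which dynamic linker to map through a `PT_INTERP` program header that holds the linker's path. Here is a rough sketch of how that path could be read, assuming a little-endian ELF64 image already in memory; the function name `find_interpreter` is hypothetical.

```python
import struct

PT_INTERP = 3  # program header type that names the dynamic linker

def find_interpreter(data: bytes):
    """Return the PT_INTERP path of a little-endian ELF64 image, or None."""
    if data[:4] != b"\x7fELF" or data[4] != 2 or data[5] != 1:
        raise ValueError("expected a little-endian ELF64 image")
    (e_phoff,) = struct.unpack_from("<Q", data, 32)          # program header table offset
    e_phentsize, e_phnum = struct.unpack_from("<HH", data, 54)
    for i in range(e_phnum):
        base = e_phoff + i * e_phentsize
        (p_type,) = struct.unpack_from("<I", data, base)
        if p_type == PT_INTERP:
            (p_offset,) = struct.unpack_from("<Q", data, base + 8)
            (p_filesz,) = struct.unpack_from("<Q", data, base + 32)
            # The path is stored as a NUL-terminated string inside the file.
            return data[p_offset:p_offset + p_filesz].rstrip(b"\0").decode()
    return None  # no PT_INTERP entry, e.g. a statically linked executable
```

On a Linux system, calling it on the bytes of a dynamically linked binary should return the path of that system's dynamic linker, and a statically linked binary should give `None`.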

The Linux kernel itself is stored as an ELF file: this one is loaded by the bootloader (such as GRUB).


Author: Mat

Updated on September 18, 2022

Comments

  • Mat
    Mat almost 2 years

    Can anyone tell me which process of the operating system loads the ELF (Executable and Linking Format) file into RAM?

  • YoloTats.com
    YoloTats.com over 11 years
    It could be added that dynamic linking is done by ld-linux.so, which runs in userland.
  • The Dark Knight
    The Dark Knight over 11 years
    @jofel: yes, you are quite right. For the benefit of the person who asked this question: ld-linux.so is an interpreter named in the ELF program headers (PT_INTERP) and launched by the kernel. The target ELF entry point is passed in the auxiliary vector entry AT_ENTRY. The kernel opens the requested interpreter, maps its memory regions, and starts execution at the interpreter's ELF entry point. The loader then analyzes the target ELF file, performs its loading work, and sets EIP to the target ELF entry point. This is how dynamic linking is done by ld-linux.so.
  • gumenimeda
    gumenimeda almost 9 years
    Well no. The kernel (or the dynamic linker) does not care about the sections (.text, .data, etc.) at all. They use the program header table which describes the segments to load into memory.
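The auxiliary vector mentioned in the comments is a list of (type, value) pairs that the kernel hands to the new process. A minimal sketch of decoding it follows; the function name `parse_auxv` and its `word` parameter are hypothetical, and the default assumes a 64-bit little-endian process. The constants are the standard `AT_*` values.

```python
import struct

AT_NULL = 0   # marks the end of the vector
AT_BASE = 7   # base address at which the interpreter was mapped
AT_ENTRY = 9  # entry point of the target program

def parse_auxv(raw: bytes, word: str = "Q") -> dict:
    """Decode (type, value) pairs; word="Q" assumes 64-bit little-endian words."""
    pair = struct.Struct("<" + word + word)
    usable = len(raw) - len(raw) % pair.size  # ignore any trailing partial pair
    entries = {}
    for a_type, a_val in pair.iter_unpack(raw[:usable]):
        if a_type == AT_NULL:
            break
        entries[a_type] = a_val
    return entries
```

On Linux, the raw bytes for the current process can be read from `/proc/self/auxv`, so `parse_auxv(open("/proc/self/auxv", "rb").read()).get(AT_ENTRY)` should give the entry point of the running Python interpreter on a 64-bit little-endian system.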