File format differences between a static library (.a) and a shared library (.so)?

Solution 1

A static library, e.g. libfoo.a, is not an executable of any kind. It is simply an indexed archive, in Unix ar format, of other files which happen to be ELF object files.

A static library is created like any archive:

ar crs libfoo.a objfile0.o objfile1.o...objfileN.o

outputs the new archive (c) libfoo.a, with those object files inserted (r) and index added (s).

You'll hear of linking libfoo.a in a program. This doesn't mean that libfoo.a itself is linked into or with the program. It means that libfoo.a is passed to the linker as an archive from which it can extract and link into the program just those object files within the archive that the program needs. So the format of a static library (ar format) is just an object-file bundling format for linker input: it could equally well have been some other bundling format without any effect on the linker's mission, which is to digest a set of object files and shared libraries and generate a program, or shared library, from them. ar format was history's choice.
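
To make "extract just those object files that the program needs" concrete, here is a toy model in C++. It is only a sketch under simplifying assumptions: the Member struct and select_members function are hypothetical names invented for illustration, symbols are plain strings, and a real linker consults the archive's index rather than scanning every member. The idea it illustrates is that a member is pulled in only if it defines a symbol that is currently undefined, and pulling it in may create new undefined symbols that require another pass.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Hypothetical, simplified view of one archive member (not a real linker type).
struct Member {
    std::string name;
    std::set<std::string> defines;  // symbols this object file provides
    std::set<std::string> needs;    // symbols it references but does not define
};

// Return the members that would be extracted, given the symbols that are
// still undefined when the archive is reached.
std::vector<std::string> select_members(const std::vector<Member>& archive,
                                        std::set<std::string> undefined) {
    std::vector<std::string> extracted;
    std::vector<bool> taken(archive.size(), false);
    std::set<std::string> defined;
    bool progress = true;
    while (progress) {  // rescan until a full pass extracts nothing new
        progress = false;
        for (std::size_t i = 0; i < archive.size(); ++i) {
            if (taken[i]) continue;
            bool wanted = false;
            for (const auto& sym : archive[i].defines) {
                if (undefined.count(sym)) { wanted = true; break; }
            }
            if (!wanted) continue;  // nothing here is needed: leave it in the archive
            taken[i] = true;
            progress = true;
            extracted.push_back(archive[i].name);
            for (const auto& sym : archive[i].defines) {
                defined.insert(sym);
                undefined.erase(sym);
            }
            for (const auto& sym : archive[i].needs) {
                if (!defined.count(sym)) undefined.insert(sym);
            }
        }
    }
    assert(extracted.size() <= archive.size());  // each member is taken at most once
    return extracted;
}
```

With members a.o (defines foo, needs bar), b.o (defines bar) and c.o (defines unused), a program that references only foo causes a.o and b.o to be extracted, while c.o never becomes part of the output.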

On the other hand a shared library, e.g. libfoo.so, is an ELF file and not any sort of archive.
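
The distinction is visible in the first few bytes of each file. As a minimal sketch (the Container enum and classify function are hypothetical names, not part of any toolchain), the following C++ tells the two containers apart by their magic numbers: an ar archive begins with the 8 bytes "!<arch>\n", while an ELF file begins with 0x7f followed by "ELF".

```cpp
#include <cassert>
#include <string>

enum class Container { ArArchive, Elf, Unknown };

// Classify a file by the magic bytes at its very start.
// `head` is assumed to hold the first bytes read from the file.
Container classify(const std::string& head) {
    // "\x7f" and "ELF" are separate literals so the hex escape
    // does not swallow the 'E' (which is also a hex digit).
    static const std::string ar_magic = "!<arch>\n";
    static const std::string elf_magic = "\x7f" "ELF";
    assert(ar_magic.size() == 8 && elf_magic.size() == 4);

    if (head.compare(0, ar_magic.size(), ar_magic) == 0) return Container::ArArchive;
    if (head.compare(0, elf_magic.size(), elf_magic) == 0) return Container::Elf;
    return Container::Unknown;
}
```

Fed the first bytes of libfoo.a this should report ArArchive, and fed the first bytes of libfoo.so (or of any single .o file) it should report Elf.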

Don't be tempted to conclude that a static library is a sort of ELF file just because all the well-known ELF parsers (objdump, readelf, nm) will parse a static library. These tools all know that a static library is an archive of ELF object files, so they just parse all the object files in the library as if you had listed them on the command line.

The use of the -D option with nm just instructs the tool to select only the symbols that are in the dynamic symbol table(s), if any, of the ELF file(s) that it parses (the symbols visible to the runtime linker), regardless of whether or not they are parsed from within an archive. It's the same as objdump -T and readelf --dyn-syms. It is not necessary to use these options to parse the symbols from a shared library. If you don't do so, then by default you'll just see the full symbol table. If you run nm -D on a static library you'll be told no symbols for each object file in the archive, just as you would if you ran nm -D on each of those object files individually. The reason is that an object file hasn't got a dynamic symbol table: only a shared library or program has one.

Object file, shared library and program are all variants of the ELF format. If you're interested in ELF variants, those are the variants of interest.

The ELF format itself is a long and thorny technical read and is required background for precisely distinguishing the variants. Intro: An ELF file contains an ELF header structure, one of whose fields contains a type-identifier of the file as an object file, shared library, or program. When the file is a program or shared library, it also contains a Program header table structure (optional in ELF files generally) whose fields provide the runtime linker/loader with the parameters it needs to load the file in a process. In terms of ELF structure, the differences between a program and a shared library are slight: it's the detailed content that makes the difference to the behaviour that they elicit from the loader.
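
As a rough illustration of that type-identifier, the sketch below decodes the e_type field by hand. It assumes the standard layout in which the 16-byte e_ident array is followed immediately by the 2-byte e_type, with e_ident[5] giving the byte order. The function names elf_type and read_u16 are made up for this example; on Linux you would more likely use the definitions in <elf.h>.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Read a 16-bit field at `offset`, honouring the file's byte order.
static unsigned read_u16(const std::vector<std::uint8_t>& b, std::size_t offset, bool little) {
    assert(offset + 1 < b.size());  // caller must have checked the length
    return little ? (b[offset] | (b[offset + 1] << 8))
                  : ((b[offset] << 8) | b[offset + 1]);
}

// Name the ELF variant stored in the header bytes `b` (at least 18 bytes).
std::string elf_type(const std::vector<std::uint8_t>& b) {
    if (b.size() < 18 || b[0] != 0x7f || b[1] != 'E' || b[2] != 'L' || b[3] != 'F')
        return "not ELF";
    if (b[5] != 1 && b[5] != 2) return "not ELF";  // e_ident[EI_DATA]: 1 = LSB, 2 = MSB
    switch (read_u16(b, 16, b[5] == 1)) {          // e_type sits right after e_ident
        case 1:  return "REL";   // relocatable object file
        case 2:  return "EXEC";  // executable program
        case 3:  return "DYN";   // shared object
        case 4:  return "CORE";  // core dump
        default: return "other";
    }
}
```

Note that position-independent executables are also recorded as DYN, which is one more sign of how slight the structural difference between a program and a shared library is.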

For the long and thorny technical read, try Executable and Linkable Format (ELF)

Solution 2

Source

The source code I'm using in my example is as follows:

class T {
public:
    T(int _x) : x(_x) { }
    T& operator=(const T& rhs) { x = rhs.x; return *this; }
    int getX() const { return x; }

private:
    int x = 0;
};

Creating the shared library

$ g++ -shared -fPIC -c test.cpp -o test.out && ld -o libtest.so test.out 
ld: warning: cannot find entry symbol _start; defaulting to 0000000000400078

Creating the static library

$ g++ -fPIC -c test.cpp -o test.out && ar rcs libtest.a test.out

Are they both ELF files?

Kind of ... here's the output of readelf -h for the shared library:

$ readelf -h libtest.so 
ELF Header:
  Magic:   7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00 
  Class:                             ELF64
  Data:                              2's complement, little endian
  Version:                           1 (current)
  OS/ABI:                            UNIX - System V
  ABI Version:                       0
  Type:                              EXEC (Executable file)
  Machine:                           Advanced Micro Devices X86-64
  Version:                           0x1
  Entry point address:               0x400078
  Start of program headers:          64 (bytes into file)
  Start of section headers:          408 (bytes into file)
  Flags:                             0x0
  Size of this header:               64 (bytes)
  Size of program headers:           56 (bytes)
  Number of program headers:         1
  Size of section headers:           64 (bytes)
  Number of section headers:         5
  Section header string table index: 2

The static library output is very similar, but not quite the same:

$ readelf -h libtest.a

File: libtest.a(test.out)
ELF Header:
  Magic:   7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00 
  Class:                             ELF64
  Data:                              2's complement, little endian
  Version:                           1 (current)
  OS/ABI:                            UNIX - System V
  ABI Version:                       0
  Type:                              REL (Relocatable file)
  Machine:                           Advanced Micro Devices X86-64
  Version:                           0x1
  Entry point address:               0x0
  Start of program headers:          0 (bytes into file)
  Start of section headers:          360 (bytes into file)
  Flags:                             0x0
  Size of this header:               64 (bytes)
  Size of program headers:           0 (bytes)
  Number of program headers:         0
  Size of section headers:           64 (bytes)
  Number of section headers:         9
  Section header string table index: 6

The first thing that jumps out is the File entry in the static library. Rather than being an ELF object, it contains an ELF object. Another way of confirming this is by looking at the files with hexdump -C (truncated). First, the shared library:

$ hexdump -C libtest.so
00000000  7f 45 4c 46 02 01 01 00  00 00 00 00 00 00 00 00  |.ELF............|
00000010  02 00 3e 00 01 00 00 00  78 00 40 00 00 00 00 00  |..>.....x.@.....|
00000020  40 00 00 00 00 00 00 00  98 01 00 00 00 00 00 00  |@...............|
00000030  00 00 00 00 40 00 38 00  01 00 40 00 05 00 02 00  |....@.8...@.....|
00000040  51 e5 74 64 06 00 00 00  00 00 00 00 00 00 00 00  |Q.td............|
00000050  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
00000070  10 00 00 00 00 00 00 00  47 43 43 3a 20 28 47 4e  |........GCC: (GN|

We can see the character sequence ELF quite clearly here, right at the start of the file. Here's the static library output:

$ hexdump -C libtest.a
00000000  21 3c 61 72 63 68 3e 0a  2f 20 20 20 20 20 20 20  |!<arch>./       |
00000010  20 20 20 20 20 20 20 20  31 34 38 35 34 36 31 31  |        14854611|
00000020  36 36 20 20 30 20 20 20  20 20 30 20 20 20 20 20  |66  0     0     |
00000030  30 20 20 20 20 20 20 20  34 20 20 20 20 20 20 20  |0       4       |
00000040  20 20 60 0a 00 00 00 00  74 65 73 74 2e 6f 75 74  |  `.....test.out|
00000050  2f 20 20 20 20 20 20 20  31 34 38 35 34 36 31 31  |/       14854611|
00000060  36 36 20 20 31 30 30 30  20 20 31 30 30 30 20 20  |66  1000  1000  |
00000070  31 30 30 36 36 34 20 20  39 33 36 20 20 20 20 20  |100664  936     |
00000080  20 20 60 0a 7f 45 4c 46  02 01 01 00 00 00 00 00  |  `..ELF........|
00000090  00 00 00 00 01 00 3e 00  01 00 00 00 00 00 00 00  |......>.........|
000000a0  00 00 00 00 00 00 00 00  00 00 00 00 68 01 00 00  |............h...|
000000b0  00 00 00 00 00 00 00 00  40 00 00 00 00 00 40 00  |........@.....@.|
000000c0  09 00 06 00 00 47 43 43  3a 20 28 47 4e 55 29 20  |.....GCC: (GNU) |

We can see a bunch of extra stuff before the ELF header starts here, confirming our hypothesis that a static library is stored differently from a shared library.
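
That "extra stuff" is the ar container: an 8-byte global magic ("!<arch>\n") followed by a 60-byte ASCII header in front of each member. As a hedged sketch, assuming the common System V/GNU field layout (name 16 bytes, mtime 12, uid 6, gid 6, mode 8, size 10, then the two terminator bytes "`\n"), a member header can be decoded like this. ArMember and parse_ar_header are hypothetical names used only for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

struct ArMember {
    std::string name;         // "/" for the symbol index, "test.out/" for our object
    unsigned long long size;  // payload size in bytes
};

// Parse one 60-byte ar member header. Returns false if it looks malformed.
bool parse_ar_header(const std::string& raw, ArMember& out) {
    const std::size_t kHeaderSize = 60;
    if (raw.size() != kHeaderSize) return false;
    if (raw.compare(58, 2, "`\n") != 0) return false;  // terminator bytes

    // Extract a fixed-width field and strip its trailing space padding.
    auto field = [&raw](std::size_t pos, std::size_t len) {
        assert(pos + len <= raw.size());
        std::string s = raw.substr(pos, len);
        s.erase(s.find_last_not_of(' ') + 1);
        return s;
    };

    const std::string size_text = field(48, 10);  // decimal ASCII
    if (size_text.empty() ||
        size_text.find_first_not_of("0123456789") != std::string::npos)
        return false;

    out.name = field(0, 16);
    out.size = std::stoull(size_text);
    return true;
}
```

Applied to the 60 bytes starting at offset 0x48 in the dump above, this gives the name test.out/ and a size of 936, and the ELF magic of that member then begins at offset 0x84, immediately after the header. The header at offset 0x08 has the name / and size 4: that is the archive's symbol index.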

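Another difference is the Type entry: the file built above is marked EXEC (executable), whilst the object inside the static library is REL (relocatable). Note that EXEC appears here only because ld was run without -shared; a shared library linked with g++ -shared or ld -shared normally reports Type: DYN (Shared object file). In fact, there's not much difference between a shared library and an executable at all: https://askubuntu.com/questions/690631/executables-vs-shared-objects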

Author: silverscania

Updated on June 05, 2022

Comments

  • silverscania
    silverscania about 2 years

    I know that there are lots of questions about the use cases of shared vs static libraries, this question is not about that. I am asking about differences in file format stored on disk.

    My question is, what are the differences between the two? Or are they exactly the same, different only in terms of usage?

    I am led to believe that they are not the same, since running 'nm' on a shared library requires the -D flag. Clearly it needs to do something differently. Why?

    Are they both ELF files?

    Is the only difference that the shared library can contain some paths of dependencies?

    • silverscania
      silverscania over 7 years
      @yugr not a duplicate. None of those answers mention anything about the internal format of the files.
    • OMGtechy
      OMGtechy over 7 years
      @yugr I do not think this is a duplicate, as the accepted answer there lists use cases of each library file rather than describing the serialised format (which is what is being asked here).
    • Dietrich Epp
      Dietrich Epp over 7 years
      A shared library is a linked image (ELF) which can be loaded into memory. A static library is an archive containing unlinked relocatable files (also ELF) which can be linked into something which can then be loaded into memory.
  • yugr
    yugr over 7 years
    "We can see a bunch of extra stuff before the ELF header starts here" - static library is an ar archive of relocatable ELF files, not ELF.
  • OMGtechy
    OMGtechy over 7 years
    @yugr yes, that's what I meant by "Rather than being an ELF object, it contains an ELF object" :)
  • Timur Fayzrakhmanov
    Timur Fayzrakhmanov over 6 years
    Damn great! Simple and straightforward. This is exactly what I needed. Thank you a lot!
  • Victor Choy
    Victor Choy over 4 years
    The demo is good, but there is an error. Normally the type in the header of a .so dynamic library is Type: DYN, not EXEC.
  • Peter Parker
    Peter Parker almost 4 years
    Great answer.. helped me a lot to understand the layers behind the linker..