Why use cpio for initramfs?

Solution 1

I'm not 100% sure, but since the kernel itself has to unpack the initial ramdisk during boot, cpio is used because an unpacker for it is already implemented in the kernel code.
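
To give a rough idea of what the kernel has to parse, here is a small Python sketch that walks a cpio "newc" archive (the format init/initramfs.c unpacks). This is not the kernel's code: the name `list_newc` is made up for this example, the input is assumed to be already decompressed, and only the plain `070701` magic is accepted. Each member starts with a 110-byte ASCII header (a 6-byte magic plus 13 fields of 8 hex digits), followed by the NUL-terminated name and the file data, each padded to a 4-byte boundary.

```python
HEADER_LEN = 110  # 6-byte magic + 13 fields of 8 hex digits each


def list_newc(archive: bytes):
    """Yield (name, mode, data) for each member of an uncompressed newc archive."""
    pos = 0
    while pos + HEADER_LEN <= len(archive):
        header = archive[pos:pos + HEADER_LEN]
        if header[:6] != b"070701":
            raise ValueError("no newc magic at offset %d" % pos)
        # Every field after the magic is an 8-digit hexadecimal number.
        fields = [int(header[6 + 8 * i:14 + 8 * i], 16) for i in range(13)]
        mode, filesize, namesize = fields[1], fields[6], fields[11]
        name_start = pos + HEADER_LEN
        # namesize counts the trailing NUL, so drop one byte when decoding.
        name = archive[name_start:name_start + namesize - 1].decode()
        if name == "TRAILER!!!":  # special entry that ends the archive
            return
        # Header + name is padded to a multiple of 4, and so is the data.
        data_start = (name_start + namesize + 3) & ~3
        yield name, mode, archive[data_start:data_start + filesize]
        pos = (data_start + filesize + 3) & ~3
```

Calling `list(list_newc(raw_bytes))` on a decompressed archive would give the member names, modes and contents; a tar archive fails at the first magic check.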

Solution 2

Quoting Documentation/filesystems/ramfs-rootfs-initramfs.txt:

Why cpio rather than tar?

This decision was made back in December, 2001. The discussion started here:

http://www.uwsg.iu.edu/hypermail/linux/kernel/0112.2/1538.html

And spawned a second thread (specifically on tar vs cpio), starting here:

http://www.uwsg.iu.edu/hypermail/linux/kernel/0112.2/1587.html

The quick and dirty summary version (which is no substitute for reading the above threads) is:

1) cpio is a standard. It's decades old (from the AT&T days), and already widely used on Linux (inside RPM, Red Hat's device driver disks). Here's a Linux Journal article about it from 1996:

http://www.linuxjournal.com/article/1213

It's not as popular as tar because the traditional cpio command line tools require _truly_hideous_ command line arguments. But that says nothing either way about the archive format, and there are alternative tools, such as:

http://freecode.com/projects/afio

2) The cpio archive format chosen by the kernel is simpler and cleaner (and thus easier to create and parse) than any of the (literally dozens of) various tar archive formats. The complete initramfs archive format is explained in buffer-format.txt, created in usr/gen_init_cpio.c, and extracted in init/initramfs.c. All three together come to less than 26k total of human-readable text.

3) The GNU project standardizing on tar is approximately as relevant as Windows standardizing on zip. Linux is not part of either, and is free to make its own technical decisions.

4) Since this is a kernel internal format, it could easily have been something brand new. The kernel provides its own tools to create and extract this format anyway. Using an existing standard was preferable, but not essential.

5) Al Viro made the decision (quote: "tar is ugly as hell and not going to be supported on the kernel side"):

http://www.uwsg.iu.edu/hypermail/linux/kernel/0112.2/1540.html

explained his reasoning:

http://www.uwsg.iu.edu/hypermail/linux/kernel/0112.2/1550.html
http://www.uwsg.iu.edu/hypermail/linux/kernel/0112.2/1638.html

and, most importantly, designed and implemented the initramfs code.
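
To illustrate point 2 (how little there is to the format), here is a minimal Python sketch of a writer for the cpio "newc" layout. It is not the kernel's usr/gen_init_cpio.c. The names `newc_entry` and `build_archive` are made up for this example, it handles regular files only (no directories, device nodes or hard links), and uid, gid and mtime are simply set to zero. Each member is a 110-byte ASCII header (the magic `070701` plus 13 fields of 8 hex digits), the NUL-terminated name, and the data, with padding to 4-byte boundaries; a member named `TRAILER!!!` ends the archive.

```python
def newc_entry(name: str, data: bytes = b"", mode: int = 0o100644, ino: int = 0) -> bytes:
    """Encode one regular-file member in cpio "newc" layout."""
    name_bytes = name.encode() + b"\0"   # names are NUL-terminated
    fields = [
        ino,              # c_ino
        mode,             # c_mode (file type and permission bits)
        0,                # c_uid
        0,                # c_gid
        1,                # c_nlink
        0,                # c_mtime
        len(data),        # c_filesize
        0, 0,             # c_devmajor, c_devminor
        0, 0,             # c_rdevmajor, c_rdevminor
        len(name_bytes),  # c_namesize (includes the trailing NUL)
        0,                # c_check (unused with magic 070701)
    ]
    out = b"070701" + b"".join(b"%08X" % f for f in fields) + name_bytes
    out += b"\0" * (-len(out) % 4)       # pad header + name to 4 bytes
    out += data
    out += b"\0" * (-len(out) % 4)       # pad file data to 4 bytes
    return out


def build_archive(files) -> bytes:
    """files: iterable of (name, mode, data) tuples for regular files."""
    members = [
        newc_entry(name, data, mode=mode, ino=i + 1)
        for i, (name, mode, data) in enumerate(files)
    ]
    members.append(newc_entry("TRAILER!!!", mode=0))  # end-of-archive marker
    return b"".join(members)


archive = build_archive([("init", 0o100755, b"#!/bin/sh\n")])
```

The resulting bytes would normally be compressed (for example with gzip) before being handed to the kernel. For a real initramfs you would still use cpio -H newc or the kernel's own tooling, since this sketch skips everything except plain files.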

Solution 3

From what I remember of my old SysV days, cpio could handle device files but tar could not; this made cpio the 'raw' backup utility of choice before dump came around. It also handled partial filesets and hard links more easily, which made incremental backups easier. I think GNU tar has since caught up with cpio's features, so now it is just a matter of user preference. Both cpio and tar should be installed by default.

Author: phunehehe

Updated on September 17, 2022

Comments

  • phunehehe
    phunehehe almost 2 years

    I am making my own initramfs following the Gentoo wiki. Instead of the familiar tar and gzip, the page is telling me to use cpio and gzip. Wikipedia says that cpio is used by the 2.6 kernel's initramfs, but does not explain why.

    Is this just a convention or is cpio better for initramfs? Can I still use tar and gzip?

    • Louis Gerbarg
      Louis Gerbarg over 13 years
      IIRC you cannot use tar as initramfs (I'm not posting this as an answer because I'm not 100% sure). BTW, using Gentoo, I find it much easier to configure a built-in initramfs rather than a hand-made one.
    • phunehehe
      phunehehe over 13 years
      @Maciej I just want to know how to do it :) Furthermore I'm seeing a big boot time improvement by using my own initramfs
    • Louis Gerbarg
      Louis Gerbarg over 13 years
      You misunderstood me. The method I was talking about is giving the kernel, during configuration, a spec file listing which files should be included in the initrd (including a custom /init etc.); the kernel then simply uses that one. I'm not talking about generating an initramfs with genkernel or similar methods.
    • phunehehe
      phunehehe over 13 years
      @Maciej That looks fun! I'll try it sometime.
    • Louis Gerbarg
      Louis Gerbarg over 13 years
      Well, it's IMHO easier to set up, and it auto-updates with the kernel (so I don't need to remember to copy new files into the initrd).
  • ephemient
    ephemient over 13 years
    Be 100% sure. linux/init/initramfs.c unpacks a cpio -H newc archive.
  • phunehehe
    phunehehe over 13 years
    @ephemient This is really something. If no more answers come in within a few days, I'll accept that cpio is used as a convention and that we have to use cpio.
  • ephemient
    ephemient over 13 years
    cpio may be able to handle tar-format archives, and vice-versa in some cases, but that doesn't matter. The kernel can only unpack newc-style cpio-format archives, which no tar I know of produces.
  • CMCDragonkai
    CMCDragonkai over 8 years
    Any ideas why newc is the format chosen?
  • lvella
    lvella almost 7 years
    According to kernel documentation, cpio was implemented just for the sake of initramdisk, so they could have implemented any other format.
  • schily
    schily about 6 years
    The format that GNU cpio incorrectly calls newc is officially named asc and of course supported by star.
  • Osama khodroj
    Osama khodroj about 5 years
    @schily: That shows one of the implicit reasons quite nicely. "Well, it's some sort of tar archive. But which of the possible tar formats, and is it compatible with this tar extractor?" OTOH, cpio's version history is far less complicated.
  • Preston L. Bannister
    Preston L. Bannister over 2 years
    My only complaint with this answer is that the site hosting the linked Linux kernel mailing list threads no longer exists. (I only just now realized that archive-stable links to Linux kernel mailing list items are probably a good idea.) A better(?) link to the first of the last three: lkml.org/lkml/2001/12/22/132