The difference between initrd and initramfs?

Solution 1

Dentry (and inode) cache

The filesystem subsystem in Linux has three layers. At the top is the VFS (virtual filesystem), which implements the system call interface and handles mountpoint crossing, default permission checks and limit checks. Below it are the drivers for individual filesystems, and those in turn interface with the drivers for block devices (disks, memory cards, etc.; network filesystems are the exception, since they talk to network interfaces instead).

The interface between the VFS and a filesystem consists of several classes (it's plain C, so they are structures containing pointers to functions and such, but conceptually it's an object-oriented interface). The three main classes are inode, which describes any object (file or directory) in a filesystem; dentry, which describes an entry in a directory; and file, which describes a file opened by a process. When a filesystem is mounted, its driver creates the inode and dentry for its root; the others are created on demand when a process wants to access a file, and eventually expire. That's the dentry and inode cache.

Yes, it does mean that for every open file, and for every directory on the path from it up to the root, inode and dentry structures have to be allocated in kernel memory to represent it.
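The split between a name (directory entry) and the object it names (inode) can be observed from userspace. The following is a minimal sketch using only standard system calls; it looks at the directory entries and inode numbers the filesystem reports, not at the kernel's in-memory dentry and inode structures themselves. The file names are made up for illustration.

```python
import os
import tempfile

def names_and_inodes(directory):
    """Return {entry name: inode number} for every entry in `directory`."""
    return {entry.name: entry.inode() for entry in os.scandir(directory)}

with tempfile.TemporaryDirectory() as tmp:
    original = os.path.join(tmp, "a.txt")   # hypothetical file name
    alias = os.path.join(tmp, "b.txt")      # hypothetical second name
    with open(original, "w") as f:
        f.write("hello")
    # A hard link adds a second directory entry pointing at the same inode.
    os.link(original, alias)
    listing = names_and_inodes(tmp)
    link_count = os.stat(original).st_nlink

print(listing, link_count)
```

Both names report the same inode number, and the inode's link count is 2: two directory entries, one object.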

Page cache

In Linux, each memory page that contains userland data is represented by a unified page structure. It marks the page either as anonymous (it may be swapped out to swap space, if available) or as associated with an inode on some filesystem (it may be written back to and re-read from that filesystem). A page can be part of any number of memory maps, i.e. be visible in the address space of some process. The sum of all pages currently loaded in memory is the page cache.

Pages are used to implement the mmap interface. Regular read and write system calls could be implemented by a filesystem by other means, but the majority of implementations use generic functions that also work on pages. When a file read is requested, these generic functions allocate pages and call the filesystem to fill them in, one by one. A block-device-based filesystem just calculates the appropriate addresses and delegates the filling to the block device driver.
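One observable consequence of mmap and read sharing the same pages is that data written through a shared mapping is immediately visible to a regular read on the same file. The sketch below assumes Linux (or another system with a unified page cache) and uses a throwaway temporary file.

```python
import mmap
import os
import tempfile

PAGE = mmap.PAGESIZE

fd, path = tempfile.mkstemp()
try:
    os.ftruncate(fd, PAGE)                 # make the file exactly one page long
    # On Unix, mmap.mmap() creates a shared (MAP_SHARED) mapping by default.
    with mmap.mmap(fd, PAGE) as view:
        view[0:5] = b"hello"               # write through the mapping
        seen_by_read = os.pread(fd, 5, 0)  # read through the syscall path
finally:
    os.close(fd)
    os.unlink(path)

print(seen_by_read)
```

No explicit flush is issued between the write and the read; both paths end up looking at the same page.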

ramdev (ramdisk)

A ramdev is a regular block device. This allows layering any filesystem on top of it, but it is restricted by the block device interface, which only has methods to fill in a page allocated by the caller and to write it back. That is exactly what is needed for real block devices like disks, memory cards, USB mass storage and such, but for a ramdisk it means that the data exists in memory twice: once in the memory of the ramdev and once in the memory allocated by the caller.

This is the old way of implementing initrd, from the times when initrd was a rare and exotic occurrence.

tmpfs

Tmpfs is different. It's a dummy filesystem. The methods it provides to the VFS are the absolute bare minimum to make it work (as such, it's excellent documentation of what the inode, dentry and file methods should do). Files only exist if there is a corresponding inode and dentry in the inode cache, created when the file is created and never expired unless the file is deleted. The pages are associated with files when data is written and otherwise behave as anonymous ones (data may be stored to swap; page structures remain in use as long as the file exists).

This means there are no extra copies of the data in memory, and the whole thing is a lot simpler and, due to that, slightly faster too. It simply uses the data structures that serve as a cache for any other filesystem as its primary storage.
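The difference in memory footprint can be illustrated with a toy model. This is not how the kernel is implemented; the class names and the 4096-byte page size are assumptions made purely for illustration. The ramdisk model keeps its own copy of the image and can only fill pages handed to it, while the tmpfs model uses the cached pages themselves as storage.

```python
PAGE_SIZE = 4096  # assumed page size for the toy model

class ToyRamDisk:
    """RAM-backed block device: can only fill a page supplied by the caller."""
    def __init__(self, image: bytes):
        self.blocks = bytearray(image)

    def read_page(self, index: int, page: bytearray) -> None:
        start = index * PAGE_SIZE
        page[:] = self.blocks[start:start + PAGE_SIZE]

class ToyBlockFs:
    """Filesystem on a block device: every read goes through a page cache."""
    def __init__(self, device: ToyRamDisk):
        self.device = device
        self.page_cache = {}

    def read(self, index: int) -> bytes:
        if index not in self.page_cache:
            page = bytearray(PAGE_SIZE)          # page allocated by the caller
            self.device.read_page(index, page)   # device copies its data into it
            self.page_cache[index] = page
        return bytes(self.page_cache[index])

    def resident_bytes(self) -> int:
        cached = sum(len(p) for p in self.page_cache.values())
        return len(self.device.blocks) + cached

class ToyTmpfs:
    """The 'cache' pages are the only copy of the data."""
    def __init__(self):
        self.pages = {}

    def write(self, index: int, data: bytes) -> None:
        page = bytearray(PAGE_SIZE)
        page[:len(data)] = data
        self.pages[index] = page

    def read(self, index: int) -> bytes:
        return bytes(self.pages[index])

    def resident_bytes(self) -> int:
        return sum(len(p) for p in self.pages.values())

image = bytes(range(256)) * 64                   # 4 pages of sample data
ramdisk_fs = ToyBlockFs(ToyRamDisk(image))
tmpfs = ToyTmpfs()
for i in range(4):
    ramdisk_fs.read(i)
    tmpfs.write(i, image[i * PAGE_SIZE:(i + 1) * PAGE_SIZE])

print(ramdisk_fs.resident_bytes(), tmpfs.resident_bytes())
```

Once every page has been read, the ramdisk model holds the data twice, while the tmpfs model holds it once.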

This is the new way of implementing initrd (initramfs, but the image is still called just initrd).

It is also the way "POSIX shared memory" is implemented (which simply means that tmpfs is mounted on /dev/shm and applications are free to create files there and mmap them; simple and efficient). More recently, even /tmp and /run (or /var/run) often have tmpfs mounted on them, especially on notebooks, to keep disks from having to spin up or, in the case of SSDs, to avoid some wear.
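As a small illustration of the shared memory case, the sketch below uses Python's `multiprocessing.shared_memory` module (Python 3.8+), which on Linux is built on POSIX shared memory. The check for a backing file under /dev/shm is an assumption about a typical Linux setup, so the code only reports it rather than relying on it.

```python
import os
from multiprocessing import shared_memory

segment = shared_memory.SharedMemory(create=True, size=4096)
try:
    segment.buf[:5] = b"hello"

    # On a typical Linux system the segment shows up as a file on the
    # tmpfs mounted at /dev/shm (assumption; not guaranteed everywhere).
    backing_file = os.path.join("/dev/shm", segment.name.lstrip("/"))
    visible_as_file = os.path.exists(backing_file)

    # A second handle attached by name sees the same memory.
    reader = shared_memory.SharedMemory(name=segment.name)
    payload = bytes(reader.buf[:5])
    reader.close()
finally:
    segment.close()
    segment.unlink()   # remove the segment (and its file, if any)

print(payload, visible_as_file)
```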

Solution 2

I think you are right on all counts.

The difference is easy to see if you follow the steps needed when booting:

initrd

  • A ramdev block device is created. It is a ram-based block device, that is a simulated hard disk that uses memory instead of physical disks.
  • The initrd file is read and unzipped into the device, as if you did zcat initrd | dd of=/dev/ram0 or something similar.
  • The initrd contains an image of a filesystem, so now you can mount the filesystem as usual: mount /dev/ram0 /root. Naturally, filesystems need a driver, so if you use ext2, the ext2 driver has to be compiled into the kernel.
  • Done!

initramfs

  • A tmpfs is mounted: mount -t tmpfs nodev /root. The tmpfs doesn't need a separate driver; it is always built into the kernel. No device needed, no additional drivers.
  • The initramfs is uncompressed directly into this new filesystem: zcat initramfs | cpio -i, or similar.
  • Done!
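To make the "it is just a compressed cpio archive" point concrete, here is a minimal sketch that packs and unpacks a gzip-compressed cpio archive in the "newc" format entirely in memory. It is a simplification, not a replacement for the cpio tool: it handles regular files only, writes zero for inode numbers, owners and timestamps, and the file names are invented for the example.

```python
import gzip

MAGIC = b"070701"        # magic of the "newc" cpio format
TRAILER = "TRAILER!!!"   # name of the entry that terminates an archive
HEADER_SIZE = 110        # 6-byte magic + 13 fields of 8 hex digits

def _pad4(n):
    """Number of NUL bytes needed to round n up to a multiple of 4."""
    return (4 - n % 4) % 4

def _entry(name, data, mode):
    name_bytes = name.encode() + b"\0"
    # ino, mode, uid, gid, nlink, mtime, filesize,
    # devmajor, devminor, rdevmajor, rdevminor, namesize, check
    fields = [0, mode, 0, 0, 1, 0, len(data), 0, 0, 0, 0, len(name_bytes), 0]
    out = MAGIC + b"".join(b"%08X" % f for f in fields) + name_bytes
    out += b"\0" * _pad4(len(out))
    return out + data + b"\0" * _pad4(len(data))

def pack(files):
    """Build a gzip-compressed newc archive from {name: bytes}."""
    body = b"".join(_entry(n, d, 0o100755) for n, d in files.items())
    return gzip.compress(body + _entry(TRAILER, b"", 0))

def unpack(blob):
    """Return {name: bytes} from a gzip-compressed newc archive."""
    raw = gzip.decompress(blob)
    files, pos = {}, 0
    while True:
        if raw[pos:pos + 6] != MAGIC:
            raise ValueError("not a newc cpio entry at offset %d" % pos)
        fields = [int(raw[pos + 6 + 8 * i:pos + 14 + 8 * i], 16)
                  for i in range(13)]
        filesize, namesize = fields[6], fields[11]
        pos += HEADER_SIZE
        name = raw[pos:pos + namesize - 1].decode()
        pos += namesize
        pos += _pad4(pos)                  # header + name are padded to 4 bytes
        if name == TRAILER:
            return files
        files[name] = raw[pos:pos + filesize]
        pos += filesize + _pad4(filesize)  # data is padded to 4 bytes too

tree = {"init": b"#!/bin/sh\necho hello\n", "etc/hostname": b"demo\n"}
archive = pack(tree)
restored = unpack(archive)
print(sorted(restored))
```

Unpacking is just a walk over headers and payloads that recreates files one by one; no filesystem driver is involved in interpreting the archive.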

And yes, it is still called initrd in many places although it is an initramfs, particularly in boot loaders, as for them it is just a BLOB. The difference is made by the OS when it boots.
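Since the boot loader treats the file as an opaque blob, the two kinds of image can only be told apart by looking inside. The following sketch classifies a blob by its magic numbers. It is an illustration under stated assumptions, not the kernel's detection code: it only understands gzip compression, the cpio "newc"/"crc" magics, and the ext2/3/4 superblock magic (0xEF53 at byte offset 1080).

```python
import gzip
import struct

GZIP_MAGIC = b"\x1f\x8b"
CPIO_MAGICS = (b"070701", b"070702")   # cpio "newc" and "crc" formats
EXT_MAGIC = 0xEF53                     # ext2/3/4 superblock magic
EXT_MAGIC_OFFSET = 1024 + 56           # superblock at 1024, magic at +56

def classify(blob):
    """Guess whether `blob` is a cpio archive or a filesystem image."""
    if blob[:2] == GZIP_MAGIC:
        blob = gzip.decompress(blob)
    if blob[:6] in CPIO_MAGICS:
        return "initramfs (cpio archive)"
    if len(blob) >= EXT_MAGIC_OFFSET + 2:
        (magic,) = struct.unpack_from("<H", blob, EXT_MAGIC_OFFSET)
        if magic == EXT_MAGIC:
            return "initrd (ext2/3/4 filesystem image)"
    return "unknown"

# Synthetic test blobs, just enough bytes to carry the magic numbers.
fake_cpio = gzip.compress(b"070701" + b"0" * 104)
fake_ext2 = bytearray(2048)
struct.pack_into("<H", fake_ext2, EXT_MAGIC_OFFSET, EXT_MAGIC)

print(classify(fake_cpio))
print(classify(gzip.compress(bytes(fake_ext2))))
```

Real images may use other compressors or have additional archives prepended, which this sketch would report as "unknown".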

Solution 3

Minimal runnable QEMU examples and newbie explanation

In this answer, I will:

  • provide a minimal runnable Buildroot + QEMU example for you to test things out
  • explain the most fundamental difference between both for the very beginners who are likely googling this

Hopefully these will serve as a basis to verify and understand the more specific internal details of the difference.

The minimal setup is fully automated here, and this is the corresponding getting started.

The setup prints out the QEMU commands as they are run, and as explained in that repo, we can easily produce the three following working types of boots:

  1. root filesystem is in an ext2 "hard disk":

    qemu-system-x86_64 -kernel normal/bzImage -drive file=rootfs.ext2
    
  2. root filesystem is in initrd:

    qemu-system-x86_64 -kernel normal/bzImage -initrd rootfs.cpio
    

    -drive is not given.

    rootfs.cpio contains the same files as rootfs.ext2, except that they are in CPIO format, which is similar to .tar: it serializes directories without compressing them.

  3. root filesystem is in initramfs:

    qemu-system-x86_64 -kernel with_initramfs/bzImage
    

    Neither -drive nor -initrd is given.

    with_initramfs/bzImage is a kernel compiled with options identical to normal/bzImage, except for one: CONFIG_INITRAMFS_SOURCE=rootfs.cpio pointing to the exact same CPIO as from the -initrd example.

By comparing the setups, we can conclude the most fundamental properties of each:

  1. in the hard disk setup, QEMU loads bzImage into memory.

    On real hardware, this work is normally done by firmware and bootloaders such as GRUB.

    The Linux kernel boots, then using its drivers reads the root filesystem from disk.

  2. in the initrd setup, QEMU does some further bootloader work besides loading the kernel into memory: it also loads rootfs.cpio into memory and tells the kernel where to find it.

    This time, the kernel just uses the rootfs.cpio from memory directly, since no hard disk is present.

    Writes are not persistent across reboots, since everything is in memory.

  3. in the initramfs setup, we build the kernel a bit differently: we also give the rootfs.cpio to the kernel build system.

    The kernel build system then knows how to stick the kernel image and the CPIO together into a single image.

    Therefore, all we need to do is to pass the bzImage to QEMU. QEMU loads it into memory, just like it did for the other setups, but nothing else is required: the CPIO also gets loaded into memory since it is glued to the kernel image!

Solution 4

Another noteworthy difference between initrd and initramfs, not mentioned in the excellent answer above:

  • With initrd, the kernel by default hands over to userspace by running /sbin/init as PID 1.
  • With the newer initramfs, the kernel instead runs /init as PID 1.

This can become a pitfall (see https://unix.stackexchange.com/a/147688/24394).
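The lookup order can be sketched as a small function. This is a simplified model of the kernel's behaviour, not its actual code: file existence is modelled with plain sets of paths, the `rdinit` and `init` parameters stand in for the kernel command-line options of the same names, and the function name `choose_init` is invented for the example.

```python
# Paths the kernel tries on the real root filesystem when nothing else applies.
FALLBACK_INITS = ("/sbin/init", "/etc/init", "/bin/init", "/bin/sh")

def choose_init(initramfs_paths, root_paths, rdinit="/init", init=None):
    """Return the program that would become PID 1, or None (kernel panic).

    initramfs_paths: paths present in the unpacked initramfs
    root_paths:      paths present on the root filesystem mounted otherwise
    """
    # If the initramfs provides the early init program, it wins.
    if rdinit in initramfs_paths:
        return rdinit
    # Otherwise an explicit init= option is tried on the mounted root.
    if init is not None:
        return init if init in root_paths else None
    # Finally the built-in list of defaults is tried in order.
    for candidate in FALLBACK_INITS:
        if candidate in root_paths:
            return candidate
    return None

print(choose_init({"/init", "/bin/busybox"}, set()))
print(choose_init(set(), {"/sbin/init", "/bin/sh"}))
print(choose_init(set(), {"/bin/sh"}))
```

An initramfs that ships its init program only as /sbin/init is therefore not picked up as an initramfs in this model, which is the pitfall the linked answer describes.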

Author: Amumu

Updated on June 15, 2020

Comments

  • Amumu
    Amumu almost 4 years

    As far as I know, initrd acts as a block device, thus requiring a filesystem driver (such as ext2). The kernel must have at least one built-in module for detecting the filesystem of the initrd. In this article, Introducing initramfs, a new model for initial RAM disks, it is written that:

    But ramdisks actually waste even more memory due to caching. Linux is designed to cache all files and directory entries read from or written to block devices, so Linux copies data to and from the ramdisk into the "page cache" (for file data), and the "dentry cache" (for directory entries). The downside of the ramdisk pretending to be a block device is it gets treated like a block device.

    What's page cache and dentry cache? In the paragraph, does it mean the data got duplicated because ramdisk is treated as a block device, thus all the data is cached?

    In contrast, ramfs:

    A few years ago, Linus Torvalds had a neat idea: what if Linux's cache could be mounted like a filesystem? Just keep the files in cache and never get rid of them until they're deleted or the system reboots? Linus wrote a tiny wrapper around the cache called "ramfs", and other kernel developers created an improved version called "tmpfs" (which can write the data to swap space, and limit the size of a given mount point so it fills up before consuming all available memory). Initramfs is an instance of tmpfs.

    These ram based filesystems automatically grow or shrink to fit the size of the data they contain. Adding files to a ramfs (or extending existing files) automatically allocates more memory, and deleting or truncating files frees that memory. There's no duplication between block device and cache, because there's no block device. The copy in the cache is the only copy of the data. Best of all, this isn't new code but a new application for the existing Linux caching code, which means it adds almost no size, is very simple, and is based on extremely well tested infrastructure.

    In sum, ramfs is just file opened and loaded into memory, isn't it?

    Both initrd and ramfs are zipped at compile time, but the difference is, initrd is a block device unpacked to be mounted by the kernel at booting, while ramfs is unpacked via cpio into memory. Am I correct? Or is ramfs a very minimal file system?

    Finally, to this day, the initrd image is still present in the latest kernels. However, is that initrd actually the ramfs used today, with the name kept just for historical reasons?

    • BHS
      BHS over 9 years
      Question could have been made more concise.
    • Chan Kim
      Chan Kim over 3 years
      You give good information while asking. I like this kind of question. Thanks!
  • Amumu
    Amumu about 12 years
    Thanks. Very comprehensive. But one thing: the old initrd is loaded as a block device, and as a result, it has to cache all the open files/directories (because a portion is used as a filesystem). However, ramfs is a filesystem image whose content is cached in memory immediately when the kernel unpacks it, isn't it?
  • rodrigo
    rodrigo about 12 years
    @Amumu - initrd is a block device, and simply put, block devices are cached. initramfs is not a filesystem image, it is just a compressed cpio file; it is uncompressed into the tmpfs, just as when you decompress a zip file.
  • Jan Hudec
    Jan Hudec about 12 years
    @Amumu: Saying initramfs is immediately cached suggests there are two copies. But it's the exact opposite. It is stored in the cache, (ab)using it as the primary storage for the data, so the original compressed copy loaded by the bootloader may be immediately discarded.
  • Piotr Dobrogost
    Piotr Dobrogost almost 9 years
    @JanHudec It is stored in the cache, (ab)using it as the primary storage for the data (...) Isn't there a risk that cache gets discarded (as it's often the case with caches which are volatile by their nature) and thus data would get lost? Is there some mechanism which allows to lock such cache preventing this and if so what is it?
  • Jan Hudec
    Jan Hudec almost 9 years
    @PiotrDobrogost: As any sane cache, the page cache only discards content that is not modified compared to the backing store. Since there is no backing store for tmpfs, it can't discard the pages (if there is swap, it can use swap as ad-hoc store for pages that don't have their own).
  • Pavel Šimerda
    Pavel Šimerda about 5 years
    Wasn't it PID 1?
  • Pavel Šimerda
    Pavel Šimerda about 5 years
    For future reference it's better to cite the sources rather than just refer to them, i.e. to include all the necessary steps right here.
  • Ciro Santilli OurBigBook.com
    Ciro Santilli OurBigBook.com about 5 years
    @PavelŠimerda hi, I usually do, but in this case I judged it would be too much code for an answer.
  • Piskvor left the building
    Piskvor left the building almost 5 years
    @PavelŠimerda: Definitely. PID 0 is an idle-CPU metaphor, PID 1 is - in most respects - a normal process.
  • Adrian
    Adrian over 2 years
    What is "nodev" in mount -t tmpfs nodev /root? The only mentions of it that I saw in the manpages makes me think you forgot to specify the '-o' option.
  • rodrigo
    rodrigo over 2 years
    @Adrian: The call to mount requires two arguments, the "device" and the "mountpoint", and usually you specify the type of mount with -t <type>. But tmpfs mounts do not have a device, because they make the data out of RAM. So the "device" argument to the mount command for a tmpfs is required but not used, and can be any text you want, but as a convention it is usually nodev.
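The arbitrary "device" text for tmpfs mounts can be seen in /proc/mounts, where it appears as the first field of each line. The sketch below parses text in that format; the sample lines are made up for illustration, and reading the live /proc/mounts is attempted only as an optional extra on systems that have it.

```python
def tmpfs_mounts(mounts_text):
    """Return (device, mountpoint) pairs for tmpfs lines in /proc/mounts-style text."""
    result = []
    for line in mounts_text.splitlines():
        fields = line.split()
        # Fields: device, mountpoint, filesystem type, options, dump, pass
        if len(fields) >= 3 and fields[2] == "tmpfs":
            result.append((fields[0], fields[1]))
    return result

# Hypothetical sample, not taken from a real system.
SAMPLE = """\
/dev/sda1 / ext4 rw,relatime 0 0
tmpfs /run tmpfs rw,nosuid,nodev,mode=755 0 0
nodev /mnt/scratch tmpfs rw,relatime 0 0
"""

sample_result = tmpfs_mounts(SAMPLE)
print(sample_result)

try:
    with open("/proc/mounts") as f:
        live_result = tmpfs_mounts(f.read())
except OSError:
    live_result = []   # not on Linux, or /proc is not mounted
print(live_result)
```

In the sample, one tmpfs mount uses "tmpfs" as its device text and the other uses "nodev"; neither refers to an actual device.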