Does LVM eat my disk space, or does df lie?


Let us do some research. I have noticed that difference before, but never checked in detail what to attribute the losses to. Have a look at my scenario for comparison: fdisk shows the following partition:

/dev/sda3       35657728 1000214527 964556800  460G 83 Linux

There will be some losses as my filesystem lives in a LUKS container, but that should only amount to a few MiB. df shows:

Filesystem      Size  Used Avail Use% Mounted on
/dev/dm-1       453G  373G   58G  87% /

(The LUKS container is also why /dev/sda3 does not match /dev/dm-1, but they are really the same device, with encryption in between and no LVM. This also shows that LVM is not responsible for your losses; I have them too.)

Now let's ask the filesystem itself about the matter. Calling tune2fs -l, which outputs a lot of interesting information about ext-family filesystems, we get:

root@altair ~ › tune2fs -l /dev/dm-1
tune2fs 1.42.12 (29-Aug-2014)
Filesystem volume name:   <none>
Last mounted on:          /
Filesystem UUID:          0de04278-5eb0-44b1-9258-e4d7cd978768
Filesystem magic number:  0xEF53
Filesystem revision #:    1 (dynamic)
Filesystem features:      has_journal ext_attr resize_inode dir_index filetype needs_recovery extent flex_bg sparse_super large_file huge_file uninit_bg dir_nlink extra_isize
Filesystem flags:         signed_directory_hash 
Default mount options:    user_xattr acl
Filesystem state:         clean
Errors behavior:          Continue
Filesystem OS type:       Linux
Inode count:              30146560
Block count:              120569088
Reserved block count:     6028454
Free blocks:              23349192
Free inodes:              28532579
First block:              0
Block size:               4096
Fragment size:            4096
Reserved GDT blocks:      995
Blocks per group:         32768
Fragments per group:      32768
Inodes per group:         8192
Inode blocks per group:   512
Flex block group size:    16
Filesystem created:       Wed Oct 14 09:27:52 2015
Last mount time:          Sun Mar 13 12:25:50 2016
Last write time:          Sun Mar 13 12:25:48 2016
Mount count:              23
Maximum mount count:      -1
Last checked:             Wed Oct 14 09:27:52 2015
Check interval:           0 (<none>)
Lifetime writes:          1426 GB
Reserved blocks uid:      0 (user root)
Reserved blocks gid:      0 (group root)
First inode:              11
Inode size:           256
Required extra isize:     28
Desired extra isize:      28
Journal inode:            8
First orphan inode:       26747912
Default directory hash:   half_md4
Directory Hash Seed:      4723240b-9056-4f5f-8de2-d8536e35d183
Journal backup:           inode blocks

Glancing over it, the first thing that should catch your eye is the Reserved block count. Multiplying that by the Block size (also from the output), we get the difference between the df Used+Avail and Size:

453GiB - (373GiB+58GiB) = 22 GiB
6028454*4096 Bytes = 24692547584 Bytes ~= 23 GiB

Close enough, especially considering that df rounds (using df without -h and repeating the calculation leaves only 16 MiB of the difference between Used+Avail and Size unexplained). The tune2fs output also tells us for whom those blocks are reserved: root. This is a safety net to ensure that non-root users cannot make the system entirely unusable by filling the disk, and keeping a few percent of disk space unused also helps against fragmentation.
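
The arithmetic can be double-checked quickly. A minimal sketch, using Python merely as a calculator on the figures from the tune2fs and df output above:

```python
GiB = 1024 ** 3

# Figures from the tune2fs output above
reserved_blocks = 6028454
block_size = 4096  # bytes

reserved_bytes = reserved_blocks * block_size
print(reserved_bytes)                     # 24692547584
print(f"{reserved_bytes / GiB:.1f} GiB")  # 23.0 GiB

# df -h's Size - (Used + Avail), in GiB as rounded by df
print(453 - (373 + 58))                   # 22
```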

Now for the difference between the size reported by df and the size of the partition. This can be explained by taking a look at the inodes. ext4 preallocates inodes, so that space is unusable for file data. Multiply the Inode count by the Inode size, and you get:

30146560*256 Bytes = 7717519360 Bytes ~= 7 GiB
453 GiB + 7 GiB = 460 GiB
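
The same sanity check for the inode overhead, again with Python as a calculator on the figures above:

```python
GiB = 1024 ** 3

# Figures from the tune2fs output above
inode_count = 30146560
inode_size = 256  # bytes

inode_bytes = inode_count * inode_size
print(inode_bytes)                     # 7717519360
print(f"{inode_bytes / GiB:.1f} GiB")  # 7.2 GiB

# Adding the inode overhead back to df's size recovers the partition size
print(453 + 7)                         # 460
```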

Inodes are essentially the per-file metadata entries that directory entries point to. Let us ask mkfs.ext4 about the details (from man mkfs.ext4):

-i bytes-per-inode

Specify the bytes/inode ratio. mke2fs creates an inode for every bytes-per-inode bytes of space on the disk. The larger the bytes-per-inode ratio, the fewer inodes will be created. This value generally shouldn't be smaller than the blocksize of the filesystem, since in that case more inodes would be made than can ever be used. Be warned that it is not possible to change this ratio on a filesystem after it is created, so be careful deciding the correct value for this parameter. Note that resizing a filesystem changes the number of inodes to maintain this ratio.

There are different presets for different scenarios. On a file server with lots of Linux distribution images, it makes sense to pass e.g. -T largefile or even -T largefile4. What -T means is defined in /etc/mke2fs.conf; in those examples and on my system:

largefile = {
    inode_ratio = 1048576
}
largefile4 = {
    inode_ratio = 4194304
}

So with -T largefile4, the number of inodes is much lower than with the default (the default ratio is 16384 in my /etc/mke2fs.conf). This means less space reserved for inodes, and more space for data. When you run out of inodes, you cannot create new files. Increasing the number of inodes in an existing filesystem does not seem to be possible. Thus, the default number of inodes is chosen rather conservatively, to ensure that the average user does not run out of inodes prematurely.
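
To get a feeling for how much the ratio matters, here is a small sketch comparing the presets on a partition of my size. The sector count comes from the fdisk line at the top; the inode_ratio values are the ones from my /etc/mke2fs.conf, so treat them as an example rather than universal defaults:

```python
# Partition size from fdisk: 964556800 sectors of 512 bytes each
fs_bytes = 964556800 * 512

presets = {
    "default": 16384,        # inode_ratio values from my /etc/mke2fs.conf
    "largefile": 1048576,
    "largefile4": 4194304,
}

for name, ratio in presets.items():
    inodes = fs_bytes // ratio    # one inode per `ratio` bytes of disk space
    table_bytes = inodes * 256    # 256 bytes per inode, as reported by tune2fs
    print(f"{name:10s} {inodes:>10} inodes, {table_bytes / 1024**2:8.1f} MiB of inode tables")
```

With the default ratio this lands within a fraction of a percent of the Inode count of 30146560 that tune2fs reported above; with -T largefile4 the inode tables would shrink to under 30 MiB.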

I just figured this out by poking at my numbers; let me know if it does (not) work for you ☺.


Author: humkins

Updated on September 18, 2022

Comments

  • humkins (over 1 year)

    Please look at the output below:

    bob ~ # df -h
    Filesystem                 Size  Used Avail Use% Mounted on
    udev                       5,7G  4,0K  5,7G   1% /dev
    tmpfs                      1,2G  1,5M  1,2G   1% /run
    /dev/mapper/mint--vg-root  218G   66G  142G  32% /
    none                       4,0K     0  4,0K   0% /sys/fs/cgroup
    tmpfs                      5,7G  528M  5,2G  10% /tmp
    none                       5,0M     0  5,0M   0% /run/lock
    none                       5,7G   99M  5,6G   2% /run/shm
    none                       100M   48K  100M   1% /run/user
    tmpfs                      5,7G   44K  5,7G   1% /var/tmp
    /dev/sda1                  236M  132M   93M  59% /boot
    

    df reports that the LVM partition has 218G, whereas it should be 250G (232G if recalculated with 1024). So where are those 14G? But even 218-66=152, not 142! That is 10 more gigabytes which are also nowhere to be found.

    Other utils output:

    bob ~ # pvs
      PV         VG      Fmt  Attr PSize   PFree
      /dev/sda5  mint-vg lvm2 a--  232,64g    0 
    
    bob ~ # pvdisplay
      --- Physical volume ---
      PV Name               /dev/sda5
      VG Name               mint-vg
      PV Size               232,65 GiB / not usable 2,00 MiB
      Allocatable           yes (but full)
      PE Size               4,00 MiB
      Total PE              59557
      Free PE               0
      Allocated PE          59557
      PV UUID               3FA5KG-Dtp4-Kfyf-STAZ-K6Qe-ojkB-Tagr83
    
    bob ~ # fdisk -l /dev/sda
    
    Disk /dev/sda: 250.1 GB, 250059350016 bytes
    255 heads, 63 sectors/track, 30401 cylinders, total 488397168 sectors
    Units = sectors of 1 * 512 = 512 bytes
    Sector size (logical/physical): 512 bytes / 512 bytes
    I/O size (minimum/optimal): 512 bytes / 512 bytes
    Disk identifier: 0x00097b2a
    
       Device Boot      Start         End      Blocks   Id  System
    /dev/sda1   *        2048      499711      248832   83  Linux
    /dev/sda2          501758   488396799   243947521    5  Extended
    /dev/sda5          501760   488396799   243947520   8e  Linux LVM
    
    # sfdisk -l -uM
    
    Disk /dev/sda: 30401 cylinders, 255 heads, 63 sectors/track
    Warning: extended partition does not start at a cylinder boundary.
    DOS and Linux will interpret the contents differently.
    Units = mebibytes of 1048576 bytes, blocks of 1024 bytes, counting from 0
    
       Device Boot Start   End    MiB    #blocks   Id  System
    /dev/sda1   *     1    243    243     248832   83  Linux
    /dev/sda2       244+ 238474  238231- 243947521    5  Extended
    /dev/sda3         0      -      0          0    0  Empty
    /dev/sda4         0      -      0          0    0  Empty
    /dev/sda5       245  238474  238230  243947520   8e  Linux LVM
    
    Disk /dev/mapper/mint--vg-root: 30369 cylinders, 255 heads, 63 sectors/track
    
    sfdisk: ERROR: sector 0 does not have an msdos signature
     /dev/mapper/mint--vg-root: unrecognized partition table type
    No partitions found
    

    Linux Mint 17.3

    UPDATE

    # lvdisplay
      --- Logical volume ---
      LV Path                /dev/mint-vg/root
      LV Name                root
      VG Name                mint-vg
      LV UUID                ew9fDY-oykM-Nekj-icXn-FQ1T-fiaC-0Jw2v6
      LV Write Access        read/write
      LV Creation host, time mint, 2016-02-18 14:52:15 +0200
      LV Status              available
      # open                 1
      LV Size                232,64 GiB
      Current LE             59557
      Segments               1
      Allocation             inherit
      Read ahead sectors     auto
      - currently set to     256
      Block device           252:0
    

    Regarding swap: initially it was there, in LVM. Then I removed it and extended the root partition with the space that had been used by the swap (about 12G).

    UPDATE2

    # tune2fs -l /dev/mapper/mint--vg-root
    tune2fs 1.42.9 (4-Feb-2014)
    Filesystem volume name:   <none>
    Last mounted on:          /
    Filesystem UUID:          0b5ecf9b-a763-4371-b4e7-01c36c47b5cc
    Filesystem magic number:  0xEF53
    Filesystem revision #:    1 (dynamic)
    Filesystem features:      has_journal ext_attr resize_inode dir_index filetype needs_recovery extent flex_bg sparse_super large_file huge_file uninit_bg dir_nlink extra_isize
    Filesystem flags:         signed_directory_hash 
    Default mount options:    user_xattr acl
    Filesystem state:         clean
    Errors behavior:          Continue
    Filesystem OS type:       Linux
    Inode count:              14491648
    Block count:              57952256
    Reserved block count:     2897612
    Free blocks:              40041861
    Free inodes:              13997980
    First block:              0
    Block size:               4096
    Fragment size:            4096
    Reserved GDT blocks:      1010
    Blocks per group:         32768
    Fragments per group:      32768
    Inodes per group:         8192
    Inode blocks per group:   512
    Flex block group size:    16
    Filesystem created:       Thu Feb 18 14:52:49 2016
    Last mount time:          Sun Mar 13 16:49:48 2016
    Last write time:          Sun Mar 13 16:49:48 2016
    Mount count:              22
    Maximum mount count:      -1
    Last checked:             Thu Feb 18 14:52:49 2016
    Check interval:           0 (<none>)
    Lifetime writes:          774 GB
    Reserved blocks uid:      0 (user root)
    Reserved blocks gid:      0 (group root)
    First inode:              11
    Inode size:           256
    Required extra isize:     28
    Desired extra isize:      28
    Journal inode:            8
    First orphan inode:       6160636
    Default directory hash:   half_md4
    Directory Hash Seed:      51743315-0555-474b-8a5a-bbf470e3ca9f
    Journal backup:           inode blocks
    

    UPDATE3 (Final)

    Thanks to Jonas, the lost space has been found

    # df -h
    Filesystem                 Size  Used Avail Use% Mounted on
    /dev/mapper/mint--vg-root  218G   65G  142G  32% /
    
    
    # resize2fs /dev/mapper/mint--vg-root
    resize2fs 1.42.9 (4-Feb-2014)
    Filesystem at /dev/mapper/mint--vg-root is mounted on /; on-line resizing required
    old_desc_blocks = 14, new_desc_blocks = 15
    The filesystem on /dev/mapper/mint--vg-root is now 60986368 blocks long.
    
    # df -h
    Filesystem                 Size  Used Avail Use% Mounted on
    /dev/mapper/mint--vg-root  229G   65G  153G  30% /
    

    and this is a diff of the tune2fs output before and after running resize2fs:

    # diff /tmp/tune2fs_before_resize2fs /tmp/tune2fs2_after_resize2fs
    13,17c13,17
    < Inode count:              14491648
    < Block count:              57952256
    < Reserved block count:     2897612
    < Free blocks:              40041861
    < Free inodes:              13997980
    ---
    > Inode count:              15253504
    > Block count:              60986368
    > Reserved block count:     3018400
    > Free blocks:              43028171
    > Free inodes:              14759836
    21c21
    < Reserved GDT blocks:      1010
    ---
    > Reserved GDT blocks:      1009
    38c38
    < Inode size:           256
    ---
    > Inode size:             256
    42c42
    < First orphan inode:       6160636
    ---
    > First orphan inode:       5904187
    
    • Jarrod (about 8 years)
      My guess would be swap space: grep swap /etc/fstab. Can you also paste the output of lvdisplay?
    • humkins (about 8 years)
      Hello Jarrod, there is no swap partition. Please see the "UPDATE" section of the question.
  • humkins (about 8 years)
    Hello, Thomas, I've updated my question with lvdisplay info, but it hasn't helped me understand why df reports much less free space than there should be (-24GB).
  • humkins (about 8 years)
    Hello, Jonas! First of all, thank you for your efforts in doing this research! Though I've accepted your answer, there is still some gap (I've updated my question with tune2fs info). Reserved blocks: 2897612 * 4096 = ~11G (expected 218-(66+142)=10G) - OK. Inodes: 14491648 * 256 = ~3.5G (expected 232-218=14G) - NOT OK, 10.5G are still missing. But I'm sure the tune2fs output has information explaining that. I'll try to analyze it more closely later.
  • Jonas Schäfer (about 8 years)
    @gumkins You mentioned that you resized the root LVM volume. Did you also run resize2fs? It is safe to run resize2fs /dev/mapper/mint--vg-root; it will detect the size of the volume and act accordingly (i.e. if you already did that in the past, it will just tell you "Nothing to do", otherwise it will grow the ext4 to the volume's size). Growing an ext4 filesystem works in place and online.
  • Jonas Schäfer (about 8 years)
    @gumkins Or see whether your Block Count times Block Size is approximately equal to the size of the logical volume you are using. Here, it is equal up to 4 kiB (which I’d attribute to LUKS and which matches the Payload Offset value in the LUKS header). If the Block Count times Block Size is not equal and resize2fs does not do anything, I’m really out of ideas, because I’d assume that the Block Count would cover everything the ext4 knows about.
  • Jonas Schäfer (about 8 years)
    @gumkins (Sorry for the comment spam) I only just now realised you have added your tune2fs output to your question. The block count really indicates that ~10 GiB are missing (it evaluates to ~221 GiB). Definitely try resize2fs.
  • humkins (about 8 years)
    You were right! Please see UPDATE3. Thank you very much!
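
For reference, the block-count check suggested in the comments can be reproduced from the tune2fs figures in the question, before and after resize2fs (Python used as a calculator):

```python
GiB = 1024 ** 3

# Block counts from the question's tune2fs diff, times the 4096-byte block size
before = 57952256 * 4096   # before resize2fs
after = 60986368 * 4096    # after resize2fs

print(f"{before / GiB:.2f} GiB")  # 221.07 GiB, ~11.6 GiB short of the 232.64 GiB LV
print(f"{after / GiB:.2f} GiB")   # 232.64 GiB, matching the LV size
```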