Why is MemAvailable a lot less than MemFree+Buffers+Cached?

Solution 1

Are there any tools I can run to further investigate why this happens?

The discrepancy could be because you are using the wrong calculation. The answer you linked to does not highlight this, but look at the linked commit message:

[People] generally do this by adding up "free" and "cached", which was fine ten years ago, but is pretty much guaranteed to be wrong today. It is wrong because Cached includes memory that is not freeable as page cache, for example shared memory segments, tmpfs, and ramfs.

The part of Cached which is not freeable as page cache (sigh) is counted as Shmem in /proc/meminfo.

You can also run free, and look in the "shared" column.

Often this is caused by a mounted tmpfs. Check df -h -t tmpfs.
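
As a quick sanity check (a minimal sketch; the exact numbers and tmpfs mounts will of course differ per system), compare Cached against Shmem and see which tmpfs mounts are holding the pages:

    $ grep -E '^(Cached|Shmem):' /proc/meminfo   # Shmem is the part of Cached that is not reclaimable page cache
    $ free -h                                    # the "shared" column comes from the same Shmem number
    $ df -h -t tmpfs                             # tmpfs usage is counted in Cached but cannot simply be dropped

If Shmem is large, MemFree+Buffers+Cached overstates what is actually reclaimable, while MemAvailable does not.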

Solution 2

As you saw in the article you referenced, the whole set of calculations around MemAvailable is built around estimating how much memory is free to use without causing any swapping. You can see in the actual patch that introduced the MemAvailable number that, roughly, MemAvailable = MemFree - LowWaterMark + (PageCache - min(PageCache / 2, LowWaterMark))

This formula suggests that your system's MemAvailable is low because your low watermark, the amount of free memory your system thinks it needs as its working space, is likely very high. This makes sense in a swapless environment, where the system is much more concerned about running out of memory. You can check your current low watermark with:

    $ cat /proc/sys/vm/min_free_kbytes

I suspect in your case this is quite high.
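
If you want to sanity-check the formula against your own numbers, here is a rough sketch. It treats min_free_kbytes as the low watermark and Cached as the page cache, and it ignores the reclaimable-slab term the kernel also adds, so it will not match MemAvailable exactly, but it shows how a large watermark pulls the estimate down:

    $ awk -v wmark="$(cat /proc/sys/vm/min_free_kbytes)" '
        /^MemFree:/ { free = $2 }
        /^Cached:/  { cache = $2 }
        END {
            m = (cache / 2 < wmark) ? cache / 2 : wmark
            printf "rough MemAvailable estimate: %d kB\n", free - wmark + (cache - m)
        }' /proc/meminfo

Note that in the worst case the watermark is subtracted twice (once from MemFree and up to the same amount from the page cache), which is why a high min_free_kbytes on a swapless box can drag MemAvailable well below MemFree.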

Almost all the heuristics in Linux's memory management assume you will be operating with some swap space.

Comments

  • Mikko Rantalainen over 1 year

    I'm running a Linux workstation without swap and I have installed the earlyoom daemon to automatically kill some processes if I'm running out of RAM. earlyoom works by monitoring the kernel's MemAvailable value, and if the available memory gets low enough, it kills less important processes.

    This has worked fine for a long time, but now I'm suddenly running into a situation where MemAvailable is really low compared to the other numbers in /proc/meminfo. For example:

    $ grep -E '^(MemTotal|MemFree|MemAvailable|Buffers|Cached):' /proc/meminfo 
    MemTotal:       32362500 kB
    MemFree:         5983300 kB
    MemAvailable:    2141000 kB
    Buffers:          665208 kB
    Cached:          4228632 kB
    

    Note how MemAvailable is much lower than MemFree+Buffers+Cached.

    Are there any tools I can run to further investigate why this happens? I feel that system performance is a bit worse than normal, and I had to stop the earlyoom service because its logic will not work unless MemAvailable is reliable (that is, unless it correctly describes the memory available to user-mode processes).

    According to https://superuser.com/a/980821/100154, MemAvailable is an estimate of how much memory is available for starting new applications, without swapping. As I have no swap, what is this supposed to mean? Is it supposed to mean the amount of memory a new process can acquire before the OOM Killer is triggered (because that would logically correspond to the "swap is full" situation)?

    I had assumed that MemAvailable >= MemFree would always be true. Not here.

    Additional info:

    Searching around the internet suggests that the cause may be open files that are not backed by the filesystem and, as a result, cannot be freed from memory. The command sudo lsof | wc -l outputs 653100, so I definitely cannot go through that list manually.

    The top of the sudo slabtop output says

     Active / Total Objects (% used)    : 10323895 / 10898372 (94.7%)
     Active / Total Slabs (% used)      : 404046 / 404046 (100.0%)
     Active / Total Caches (% used)     : 104 / 136 (76.5%)
     Active / Total Size (% used)       : 6213407.66K / 6293208.07K (98.7%)
     Minimum / Average / Maximum Object : 0.01K / 0.58K / 23.88K
    
      OBJS ACTIVE  USE OBJ SIZE  SLABS OBJ/SLAB CACHE SIZE NAME                   
    4593690 4593656  99%    1.06K 153123       30   4899936K ext4_inode_cache
    3833235 3828157  99%    0.19K 182535       21    730140K dentry
    860224 551785  64%    0.06K  13441       64     53764K kmalloc-64
    515688 510872  99%    0.66K  21487       24    343792K proc_inode_cache
    168140 123577  73%    0.20K   8407       20     33628K vm_area_struct
    136832 108023  78%    0.06K   2138       64      8552K pid
    ...
    

    which looks normal to me.

    Creating a rough summary of lsof

    $ sudo lsof | awk '{ print $2 }' | sort | uniq -c | sort -h | tail
       6516 1118
       7194 2603
       7884 18727
       8673 19951
      25193 28026
      29637 31798
      38631 15482
      41067 3684
      46800 3626
      75744 17776
    

    points me to PID 17776, which is a VirtualBox instance. (Other processes with lots of open files are Chrome, Opera and Thunderbird.) So I wouldn't be overly surprised to later figure out that the major cause of this problem is VirtualBox, because that's the only thing that really messes with the kernel.

    However, the problem does not go away even if I shut down VirtualBox and kill Chrome, Opera and Thunderbird.

    • Mikko Rantalainen over 5 years
      I originally asked this at Stack Overflow (stackoverflow.com/q/52739906) but was told that I should use Server Fault or Super User instead.
    • Mikko Rantalainen almost 4 years
      Note that with a modern Linux kernel, Cached contains all shared memory, too. For example, if you're running Postgres with a huge shared_buffers setting, it will be included in the Cached count even though the kernel cannot re-use that memory without killing all the Postgres processes first.
    • Mikko Rantalainen almost 4 years
      Always check Committed_AS in /proc/meminfo, too. If your committed memory is very high, the kernel will start reporting MemAvailable near zero even if you currently have high MemFree. This is because the kernel overcommits memory with a copy-on-write (COW) implementation instead of actually having all the memory it has promised to user-mode processes, and it assumes that the processes already running will eventually modify the memory they have been promised, so that memory is considered gone.
    • Kyle Hailey over 2 years
      The strange part is that MemAvailable is so much less than MemFree. MemAvailable at its lowest should be MemFree - 200 MB, or > 5 GB, whereas it is about 2 GB.
    • Mikko Rantalainen over 2 years
      I nowadays assume that the real problem was high memory fragmentation and zero swap. Older Linux kernels especially (pre 4.0) had major problems automatically defragmenting memory without swap. Memory fragmentation can be inspected via /proc/buddyinfo. MemAvailable may be low in these cases because the kernel "knows" that it cannot directly use free memory that is highly fragmented.
  • Mikko Rantalainen over 5 years
    min_free_kbytes is 200000, or about 200 MB. Note that even MemFree is 3.8 GB higher than MemAvailable. In addition, the system has nearly 5 GB in buffers and cache. I have since restarted the workstation and added 2 GB of swap, and the issue still continues. Even now MemFree is 1 GB higher than MemAvailable. I think the system behavior changed between kernel versions 4.13 and 4.15, so this might be either a kernel regression or MemAvailable no longer meaning what it used to mean.
  • Mikko Rantalainen over 5 years
    Thinking about this problem a bit more, one change that might have triggered the difference is that I used to run VirtualBox 5.1.x and have upgraded to the VirtualBox 5.2.x series. VirtualBox adds a custom kernel module which could cause this issue, at least in theory.
  • Mikko Rantalainen about 5 years
    I'm currently running VirtualBox 6.0.4 with kernel 4.15.0-46-lowlatency and the problem has not re-appeared even though the system uptime is around 40 days. I guess the problem was caused by a VirtualBox or kernel bug.
  • Mikko Rantalainen over 2 years
    Another thing to investigate for similar issues is /proc/buddyinfo (see the sketch below). Sometimes memory is highly fragmented and the kernel will reflect this in MemAvailable because it often cannot directly use the free RAM without defragmenting it first.
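
    A minimal way to eyeball this (assuming the usual /proc/buddyinfo layout, where each numeric column k counts the free blocks of 2^k contiguous pages per node and zone):

    $ # Plenty of free blocks only in the low-order (leftmost) columns, with zeros
    $ # at the high orders, means the free RAM is heavily fragmented.
    $ cat /proc/buddyinfo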