Is ZFS on Ubuntu 20.04 using a ton of memory?

Solution 1

ZFS caches data and metadata, so if there is a lot of free memory, ZFS will use it. When memory pressure occurs (for example, loading programs that require lots of pages), the cached data is evicted. In short, free memory is used as cache until it is needed elsewhere.

You can use the arc_summary tool to see the resources used by the ZFS ARC (adaptive replacement cache).
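For example, a minimal invocation (the -s flag selects a section of the report; check arc_summary --help on your version, and note the tool ships in the zfsutils-linux package on Ubuntu):

```shell
# Show just the ARC section of the report; fall back gracefully
# if the tool is not installed on this machine.
arc_summary -s arc 2>/dev/null || echo "arc_summary not available"
```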

Solution 2

By default, up to 50% of system RAM is allocated to ZFS on Ubuntu 20.04. ZFS releases RAM if other processes need it, but that takes some time. For example, I experienced freezes on VirtualBox VMs because of this.

If you can't upgrade your RAM, an easy solution is to limit the RAM used by ZFS. First, check how much is currently allocated with arc_summary:

  • Min size (hard limit) = the minimum allocated
  • Max size (high water) = the maximum allocated

Values are expressed in bytes, but you can convert them at http://www.matisse.net/bitcalc/?input_amount=3&input_units=gigabits&notation=legacy
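The conversion is simple arithmetic, since zfs_arc_max takes a byte count. A quick sanity check in Python:

```python
# zfs_arc_max is a byte count: 3 GiB = 3 * 1024^3 bytes
gib = 1024 ** 3
limit = 3 * gib
print(limit)  # 3221225472
```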

Then:

  • sudo nano /etc/modprobe.d/zfs.conf (it's normal if the file doesn't exist yet)
  • add options zfs zfs_arc_max=3221225472 (to set a 3 GiB limit, for example)
  • save and exit
  • sudo update-initramfs -u
  • sudo reboot
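After these steps the file should contain a single options line; the value below is the 3 GiB example from above:

```
# /etc/modprobe.d/zfs.conf
options zfs zfs_arc_max=3221225472
```

After the reboot, the active limit can be checked with cat /sys/module/zfs/parameters/zfs_arc_max.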

Hope it helps!

Solution 3

ZFS's cache, called the ARC, is counted as application memory (green bar in htop)
instead of as cache (yellow bar in htop).

This is a known bug https://github.com/openzfs/zfs/issues/10255
caused by difficulties in the early implementation.

Even though it looks like application memory, it behaves like a cache
(it will go down in case of memory pressure),
so there is no need to worry about it.
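To see this yourself, the ARC's current size can be read directly from the kernel's stats file (path assumed: /proc/spl/kstat/zfs/arcstats, which exists when the zfs module is loaded):

```shell
# Print the ARC's current size in bytes, if the zfs module is loaded
awk '$1 == "size" {print $3}' /proc/spl/kstat/zfs/arcstats 2>/dev/null \
  || echo "ZFS module not loaded"
```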


Author: Thomas Browne

Updated on September 18, 2022

Comments

  • Thomas Browne
    Thomas Browne over 1 year

    I have 64GB installed but htop shows 20GB in use:

    (screenshot: htop memory meter showing ~20 GB in use)

    Running ps aux | awk '{print $6/1024 " MB\t\t" $11}' | sort -n gives me the largest processes using only a few hundred megabytes, and adding up the whole output only gets me to 2.8GB (ps aux | awk '{print $6/1024}' | paste -s -d+ - | bc). This is more or less what I was getting with Ubuntu 19.04, from which I upgraded yesterday - 3 to 4GB used when no applications are running. So why does htop show 20GB in use?
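    The same accounting can be reproduced without ps; a rough sketch in Python that sums VmRSS over /proc (Linux-only; prints 0 elsewhere):

```python
# Sum the resident set size (VmRSS) of every process via /proc.
# Like the ps pipeline above, this only counts userspace memory,
# so ZFS's ARC will not appear in the total.
import glob

total_kb = 0
for status in glob.glob("/proc/[0-9]*/status"):
    try:
        with open(status) as f:
            for line in f:
                if line.startswith("VmRSS:"):
                    total_kb += int(line.split()[1])  # value is in kB
                    break
    except OSError:
        pass  # process exited while we were scanning

print(f"{total_kb / 1024:.1f} MB total RSS")
```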

    Now it's true that I have installed ZFS (1.5 TB of SSD drives in total, in 3 pools, one of which is compressed), and I've been moving some pretty big files around, so I could understand if there was some cache allocation. The htop Mem bar is mostly green, which means "memory in use" as opposed to buffer (blue) or cache (orange), so it's quite concerning.

    Is this ZFS using up a lot of RAM, and if so, will it release some if other applications need it?

    EDIT

    Here is the output of smem:

    tbrowne@RyVe:~$ smem -tw
    Area                           Used      Cache   Noncache 
    firmware/hardware                 0          0          0 
    kernel image                      0          0          0 
    kernel dynamic memory      20762532     435044   20327488 
    userspace memory            2290448     519736    1770712 
    free memory                42823220   42823220          0 
    ----------------------------------------------------------
                               65876200   43778000   22098200 
    

    So it's "kernel dynamic memory" that's the culprit. Why so much?

    EDIT 2 --> seems to be linked to huge file creation

    I rebooted and RAM usage was circa 5GB. Even after running a bunch of tabs in Firefox and a few VMs, taking RAM up to 20GB, closing all the applications dropped it back to 5GB. Then I created a big file in Python (a 1.8GB CSV of random numbers), then concatenated it onto itself 40 times to produce a 72GB file:

    tbrowne@RyVe:~$ python3
    Python 3.8.2 (default, Mar 13 2020, 10:14:16) 
    [GCC 9.3.0] on linux
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import numpy as np
    >>> import pandas as pd
    >>> pd.DataFrame(np.random.rand(10000, 10000)).to_csv("bigrand.csv")
    >>> quit()
    tbrowne@RyVe:~$ for i in {1..40}; do cat bigrand.csv >> biggest.csv; done
    

    Now after it's all done, and nothing running on the machine, I have 34G in use by the kernel!

    (screenshot: htop showing ~34 GB of kernel memory in use)

    FINAL EDIT (to test the answer)

    This Python 3 script (you need to pip3 install numpy) will allocate about 1GB at a time until it fails. As per the answer below, as soon as you run it, kernel memory gets freed, so I was able to allocate 64GB before it got killed (I have very little swap). In other words, it confirms that ZFS will release memory when it's needed.

    import sys
    import numpy as np

    # a 10000 x 12500 array of float64 is about 1 GB
    xx = np.random.rand(10000, 12500)
    sys.getsizeof(xx)
    # 1000000112
    # that's about 1 GB

    # keep appending ~1 GB arrays until allocation fails
    ll = []
    index = 1
    while True:
        print(index)
        ll.append(np.random.rand(10000, 12500))
        index = index + 1
    
    • Josh
      Josh about 4 years
      The answer to "is ZFS (anywhere) using a lot of memory?" is always a very bold yes 🙂
    • Wilf
      Wilf about 4 years
      It is pretty much inherent to ZFS, yep... sounds like linuxatemyram.com otherwise (in that case it's normal disk caching, which is freed up relatively quickly if needed). I have seen multiple mentions of a 'rule of thumb' of 1GB per terabyte on top of base memory, which then multiplies when using certain features.
    • Michael Hampton
      Michael Hampton about 4 years
      Yes, ZFS's usage always shows up as memory used, rather than in buffers/cache. It's a bit confusing.
    • Thomas Browne
      Thomas Browne about 4 years
      @MichaelHampton I'm not a kernel programmer, and I'm going to assume this wouldn't be easy, but really ZFS should be showing its memory as cache or buffer as you suggest. Ubuntu is supposed to be "Linux for humans" ie for the masses. And even though it's "experimental" let's be honest ZFS on Ubuntu is fantastic, easy to find in the installer, and is going to be used by a large proportion of said masses, who will then go on to be confused about what's happening on htop, potentially hurting the Ubuntu and ZFS reputation. It's almost worth opening a github issue on which I might do.
    • Michael Hampton
      Michael Hampton about 4 years
      I imagine that if it were easy it would already have been done. But ZFS on Linux has been around for a decade now, and it hasn't been done.
  • Thomas Browne
    Thomas Browne about 4 years
    Very interesting indeed. ARC size (current) from the arc_summary output is showing min size 2.0 GiB, max size 31.4 GiB. And my Target size (adaptive) is showing the same 31.4GB. Look, I'm very happy for it to use as much of my big chunk of free RAM as it wants to speed things up - that's a positive, as long as I know it will give it back later! So yes, this arc_summary tool answers my question. I was panicking for a few hours that I had pushed the envelope too far in using "experimental" ZFS everywhere, including boot. That said, I've already merged two SSDs into a pool and I love it!
  • Austin Hemmelgarn
    Austin Hemmelgarn about 4 years
    Note that this isn't actually much different from normal filesystems. The discrepancy lies in the fact that ZFS uses its own cache and not the one provided by the Linux VFS layer, which means it gets reported differently by most tools on Linux.
  • Thomas Browne
    Thomas Browne about 4 years
    @AustinHemmelgarn noted and unsurprising. That said as ZFS usage becomes widespread via this Ubuntu LTS version, its reputation could be harmed as it risks being labelled a RAM hog, which is why I've opened this issue: github.com/openzfs/zfs/issues/10251
  • Austin Hemmelgarn
    Austin Hemmelgarn about 4 years
    @ThomasBrowne Oh, it's already got a reputation for hogging RAM because of people not looking into why the numbers are the way they are (and of course actually needing very large amounts of RAM in certain configurations). Unfortunately, this isn't something that can realistically be resolved, because doing so would require ZFS on Linux being merged into the mainline kernel repository, which cannot legally be done for licensing reasons (the CDDL is not GPL compatible).
  • Thomas Browne
    Thomas Browne about 4 years
    @AustinHemmelgarn Wouldn't it be great if Larry saw the light. Seems Nvidia is finally shifting recently, maybe, just maybe, Microsoft doing so (ish) also. Oracle might gift us something? Here's hoping.
  • myaut
    myaut about 4 years
    This is incorrect. You're speaking about the page cache, which is common to Unix filesystems. Even htop in the OP's screenshot recognizes that it is easily reclaimable and draws it with brown bars. The exception is ZFS, which implements its own caching mechanism, the ARC, which uses kernel memory directly (so it is shown as green bars) and reacts differently to VM pressure (requests from the OS/applications to reclaim memory).
  • Flaiming
    Flaiming almost 3 years
    Tried that and it did not help. I have 32GB RAM, set zfs_arc_max to 8GB, and after 2 days zsysd still ate the entire RAM and swap.