Why drop caches in Linux?


Solution 1

You are correct. It is not good practice to drop caches just to "free up" RAM. This is likely an example of cargo-cult system administration.

Solution 2

Yes, clearing the cache will free RAM, but it forces the kernel to read files from disk rather than from the cache, which can cause performance problems.

Normally the kernel reclaims cache automatically when available RAM runs low, and it regularly writes dirty pages back to disk via its flusher threads (historically pdflush).
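As a quick sanity check, the kernel's own accounting already treats most of the cache as reclaimable; `free` and `/proc/meminfo` show this directly:

```shell
# "available" estimates how much memory applications can get without
# swapping; most of "buff/cache" is already counted as reclaimable.
free -h

# The same figures straight from the kernel, in kB:
grep -E '^(MemAvailable|Cached|Dirty):' /proc/meminfo
```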

Solution 3

Dropping caches like this exists for one reason only: benchmarking disk performance.

When running an I/O-intensive benchmark, you want to be sure that the various settings you try are all actually doing disk I/O, so Linux allows you to drop caches rather than do a full reboot.
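A minimal sketch of that benchmarking workflow (the test file path is a placeholder, and writing to drop_caches requires root):

```shell
# Flush dirty pages first so the drop is complete and no data is lost.
sync
# 3 = drop pagecache + dentries + inodes (1 and 2 drop them separately).
echo 3 > /proc/sys/vm/drop_caches
# Now the read below really hits the disk, so the timing is honest.
time cat /path/to/testfile > /dev/null
```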

To quote from the documentation:

This file is not a means to control the growth of the various kernel caches (inodes, dentries, pagecache, etc...) These objects are automatically reclaimed by the kernel when memory is needed elsewhere on the system.

Use of this file can cause performance problems. Since it discards cached objects, it may cost a significant amount of I/O and CPU to recreate the dropped objects, especially if they were under heavy use. Because of this, use outside of a testing or debugging environment is not recommended.

Solution 4

The basic idea here is probably not that bad (just very naive and misleading): there may be cached files that are very unlikely to be accessed in the near future, for example logfiles. These "eat up" RAM that the OS will later have to free one way or another.

Depending on your swappiness setting, file access patterns, memory allocation patterns and many other unpredictable things, it may happen that if you don't free these caches, the memory they occupy has to be reclaimed later, which takes a little more time than allocating from the pool of unused memory. In the worst case the swappiness setting will cause program memory to be swapped out, because Linux thinks those files are more likely to be used in the near future than the program memory.
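That bias between reclaiming cache and swapping out program memory is exposed as a single tunable; a quick look (the values mentioned are typical defaults, not recommendations):

```shell
# Higher values make the kernel more willing to swap out anonymous
# (program) memory in order to keep file cache around.
cat /proc/sys/vm/swappiness    # commonly 60 by default

# Lowering it biases reclaim toward dropping cache instead (needs root):
# sysctl -w vm.swappiness=10
```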

In my environment, Linux often guesses wrong, and at the opening of most European stock exchanges (around 09:00 local time) servers start doing things they do only once per day, and have to swap in memory that was swapped out overnight because writing logfiles, compressing them, copying them etc. filled the cache to the point where other things had to be swapped out.

But is dropping caches the solution to this problem? Definitely not. The solution here is to tell Linux what it doesn't know: that these files will likely not be used anymore. The writing application can do this with posix_fadvise(), or you can use a command-line tool like vmtouch (which can also inspect what is cached, as well as evict files from the cache).
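For a dependency-free sketch of the same hint: GNU dd can issue posix_fadvise(POSIX_FADV_DONTNEED) for a single file, evicting just that file's pages while leaving the rest of the cache alone (the logfile path is a placeholder):

```shell
LOG=/var/log/app/app.log.1   # hypothetical rotated logfile

# count=0 reads no data; iflag=nocache makes dd advise the kernel to
# drop the cached pages for the whole file.
dd if="$LOG" of=/dev/null iflag=nocache count=0
```

vmtouch offers the same eviction with `vmtouch -e FILE`, and `vmtouch FILE` shows how much of a file is currently resident in the cache.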

That way you can remove data that is no longer needed from the caches while keeping the data that should stay cached. When you drop all caches instead, a lot of data has to be reread from disk, and at the worst possible moment: when it is needed, causing delays in your application that are noticeable and often unacceptable.

What you should have in place is a system that monitors your memory usage patterns (e.g. whether something is swapping), then analyzes and acts accordingly. The solution might be to evict some big files at the end of the day using vmtouch; it might also be to add more RAM, because the daily peak usage of the server simply requires it.
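A rough monitoring sketch using only the kernel's cumulative counters: sample the swap-in counter twice and diff; a sustained nonzero rate during the day is a signal to investigate (the interval here is an arbitrary example):

```shell
# pswpin in /proc/vmstat counts pages swapped in since boot.
a=$(awk '$1 == "pswpin" { print $2 }' /proc/vmstat)
sleep 2
b=$(awk '$1 == "pswpin" { print $2 }' /proc/vmstat)
echo "pages swapped in over 2s: $((b - a))"
```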

Solution 5

I have seen dropping caches be useful when starting up a bunch of virtual machines, or anything else that uses Huge Pages, such as some database servers.

Allocating Huge Pages in Linux often requires the kernel to defragment RAM in order to find 2 MB of contiguous physical memory for each page. Freeing all of the file cache makes this process much easier.
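A gentler alternative worth knowing about: the kernel exposes an explicit compaction knob, so you can defragment physical memory without throwing the whole cache away (requires root; the page count is an arbitrary example):

```shell
sync
echo 1 > /proc/sys/vm/compact_memory   # ask the kernel to defragment RAM
sysctl -w vm.nr_hugepages=512          # then try to reserve 512 x 2MB pages
grep HugePages_Total /proc/meminfo     # check how many were actually reserved
```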

But I agree with most of the other answers in that there is not a generally good reason to drop the file cache every night.



Author: ivcode

Updated on September 18, 2022

Comments

  • ivcode
    ivcode over 1 year

    In our servers we have a habit of dropping caches at midnight.

    sync; echo 3 > /proc/sys/vm/drop_caches
    

When I run the command it seems to free up lots of RAM, but do I really need to do that? Isn't free RAM a waste?

    • Michael Hampton
      Michael Hampton almost 10 years
      Find the person who put this in and ask him why he did it. As you correctly guessed, there is no obvious good reason for it.
    • ivcode
      ivcode almost 10 years
      That person is no longer employed. So I can't ask him. When I ask others, they said it is good practice to free up ram but I don't see the point. What are the cases that I should use above code anyway?
    • Michael Hampton
      Michael Hampton almost 10 years
      Debugging the kernel. That's about it. This doesn't actually free up any RAM; it drops caches, as the name suggests, and thus reduces performance.
    • ivcode
      ivcode almost 10 years
BTW we have a server on VMware that doesn't have a lot of memory, and we have a cronjob monitoring its RAM with vmstat 2 3|tail -1|awk '{print $4}'; when the value drops below a certain amount it drops caches, otherwise the server will hang
    • ivcode
      ivcode almost 10 years
Thank you so much @David for your clear explanations. This made me take the matter to the software developer rather than look for quick fixes
    • EkriirkE
      EkriirkE almost 10 years
      I can guess only that perhaps it was a measure to cut data losses, maybe because of frequent crashing/panics/power loss
    • Drunix
      Drunix almost 10 years
      Related thedailywtf.com/Articles/Modern-Memory-Management.aspx Strongly arguing it's a bad idea.
    • Ruslan
      Ruslan almost 10 years
      @EkriirkE to cut data losses only sync would be sufficient, dropping caches is a no-op for this purpose.
    • Bill Weiss
      Bill Weiss almost 10 years
      Related, and a useful description of the "problem": linuxatemyram.com
    • Max
      Max almost 10 years
      sudo killall -r .* also frees a lot of memory
    • Colin Pickard
      Colin Pickard almost 10 years
      perhaps the person who put that in is Patrick R: serverfault.com/questions/105606/deleting-linux-cached-ram
    • Nathan C
      Nathan C almost 10 years
      @Max sudo killall -s KILL -r .* ;)
    • Scott Leadley
      Scott Leadley almost 10 years
      It's probably "system guano". The person who put it there may not remember why it's there, or if it works, or why it works if it works. Maybe nobody knows why it's there. It remains because "if it works, don't break it". In systems with poor configuration control this crap accumulates. The long-term answer is to improve configuration/change/revision management for your systems. A configuration management system like CFEngine, Chef or Puppet won't stop you from doing some stupid things, but you'll have to be consistently stupid, which (we hope) is more likely to be caught and dealt with.
  • Tonny
    Tonny almost 10 years
    +1 for mentioning Cargo Cult System Administration. Any sysadmin who doesn't know that term and what it means should be fired.
  • Ogre Psalm33
    Ogre Psalm33 almost 10 years
    +1 for explaining why it's a bad idea.
  • PlasmaHH
    PlasmaHH almost 10 years
    @Tonny: We would be left without sysadmin department then :(
  • Tonny
    Tonny almost 10 years
@PlasmaHH Unfortunately that is the situation most of us find ourselves in... The one thing worse than a sysadmin who doesn't know about CargoCult is an ICT manager who doesn't know AND who runs his department like CargoCult. I worked for such a one once upon a time. I left that place after 3 months and I swore NEVER to do that again. (They went bankrupt 7 months later... Partly due to ICT mismanagement breaking their Sales system beyond repair.)
  • ivcode
    ivcode almost 10 years
All the apps on my server are running under nohup. Maybe nohup.out is being cached and eating up memory?
  • PlasmaHH
    PlasmaHH almost 10 years
    @ivcode: This could be a reason, check how big nohup.out is. Maybe use vmtouch to figure out how much of it is cached.
  • ivcode
    ivcode almost 10 years
I have a cron job to cat /dev/null > path/nohup.out every 15 minutes, as nohup.out is growing rapidly. Maybe Linux is caching nohup.out even though I'm clearing it
  • David Wilkins
    David Wilkins almost 10 years
    @ivcode If you don't need the output from nohup you should re-direct it to /dev/null. It sounds like you had some very inexperienced sysadmins working on your systems at some point. See stackoverflow.com/questions/10408816/… for how to direct nohup's output to /dev/null
  • user
    user almost 10 years
    Of course, depending on what you are trying to do, even a full reboot might not sufficiently clear the disk cache.
  • ivcode
    ivcode almost 10 years
Although nohup.out is cleared at 15-minute intervals, if the app's process gets killed for some reason, nohup.out is automatically backed up by another script. I tried vmtouch; it's a very good tool indeed
  • Noah Spurrier
    Noah Spurrier almost 10 years
I upvoted for pointing out second-order prejudice in responses to dropping caches.
  • Michael Hampton
    Michael Hampton almost 10 years
    This is a prime example of the "cargo cult" system administration: rather than locating and solving the problem, you are simply masking it.
  • Aaron Hall
    Aaron Hall almost 10 years
    Like most of humanity, I love terse brash assertions with lots of approval, but a cite or reasoning would earn my superego's +1.
  • Aaron Hall
    Aaron Hall almost 10 years
    Explain the cargo-cult administration, as well as the above, if you don't mind. Maybe in a follow-on edit? I'm still withholding my +1... :P
  • qris
    qris almost 10 years
    @Tonny "I left that place after 3 months and I swore NEVER to do that again" sounds like cargo-cult employer selection to me :)
  • Tonny
    Tonny almost 10 years
    @qris Yes, maybe. In my defense: It was my first job out of university and I didn't know any better. I was just glad to have a job and initially it looked good. It was a very frustrating experience at the time, but in hindsight it was highly educational.
  • Dan Pritts
    Dan Pritts over 9 years
    Sometimes the expedient solution is the right one. It might just be putting off resolving the real problem, or it might be as much solution as is required in the circumstances. Even if it's bad practice, it's still not "cargo cult." There's a demonstrated cause and effect: drop caches and disk performance improves.
  • Dan Pritts
    Dan Pritts over 9 years
    "these objects are automatically reclaimed by the kernel when memory is needed" is the design goal but it might not always be the actual behavior.
  • Dan Pritts
    Dan Pritts over 9 years
    +1 for a more in-depth explanation.
  • gparent
    gparent over 9 years
    Do you have any source for this? This sounds like something that should be fixed in the kernel if it's such an issue.
  • Dan Pritts
    Dan Pritts over 9 years
    I have personal experience with the pauses with transparent hugepages. RHEL6, Dell R810, 4CPUs, 64GB RAM. Disabling transparent hugepages (there's a /proc file to do so) immediately fixed the pauses. I didn't try the cache drop technique at the time; instead I reconfigured our java apps to use non-transparent hugepages, and left transparent hugepages disabled. IIRC, we looked into the situation enough to realize that we weren't the only people affected, and that Red Hat knew about the issue.
  • Joe
    Joe over 9 years
    @DanPritts What precisely makes you think it's not so?
  • Dan Pritts
    Dan Pritts over 9 years
    The obvious case is when you want to clear out RAM to allow the allocation of more (non-trnsparent) hugepages; another case is transparent hugepage garbage collection pause bugs (see my answer/comments elsewhere on this question). But my comment was intended for the general case. Sometimes the people who are operating the system know better than the people who designed/implemented it. Often, not - that's what their comment is trying to protect against. I'm just glad that the
  • mirabilos
    mirabilos almost 9 years
    @Ben see this thread (this message and a couple of followups, one of which includes a guess where it could come from)
  • underscore_d
    underscore_d over 8 years
    Part of the original definition of CCSA was a tendency to mistake correlation for causation, and here we are. Masking a problem by addressing a correlated but not causal entity is suboptimal problem-solving, which is what the concept of CCSA is trying to warn against.
  • Fernando
    Fernando over 8 years
    I'm experiencing a similar issue ( although it's x86_64 ) and the only solution at this moment is to drop caches serverfault.com/questions/740790/…
  • mirabilos
    mirabilos over 8 years
    @Fernando I have a “drop caches” cronjob on the m68k box as well ☹
  • Dan Pritts
    Dan Pritts about 8 years
    SUSE recommends this method to try to deal with memory pressure. suse.com/communities/blog/…
  • David Schwartz
    David Schwartz about 8 years
    @DanPritts That's pretty depressing. Even more depressing, they don't explain what the "issue" is that they claim it deals with.
  • Dan Pritts
    Dan Pritts about 8 years
    "its possible that though your application may not be using these RAM but Linux is caching aggressively into its memory and even though the application needs memory it wont free some of these cache but would rather start swapping." Not very specific. In practice, memory management isn't perfect, and having a knob to turn when that imperfection shows up is a good thing.
  • David Schwartz
    David Schwartz about 8 years
    @DanPritts That's not an imperfection. That's a huge win. That way, if you do run into memory pressure, you don't have to write out pages then, you can just discard them.
  • Dan Pritts
    Dan Pritts about 8 years
    Caching aggressively is a win. Caching so aggressively that your application starts to swap...not so much.
  • Dan Pritts
    Dan Pritts over 7 years
    access.redhat.com/solutions/46111 describes it. You can disable transparent hugepages to see if that is the problem in your case.
  • user1649948
    user1649948 about 7 years
    Also, in HPC applications on high-memory nodes (1Tb), reading in a few large files results in a large amount of memory cached. Because many HPC applications perform malloc's of hundreds of GB, the system can stall for hours as migration processes move tiny chunks of fragmented memory fruitlessly across NUMA nodes once the system reaches the cached memory "border". Worse, nothing you can do in userland to free the caches except trick the system into allocating all the tiny 2MB blocks it can at once then releasing, letting hugepaged defrag and the apps run normally.
  • Aleksandr Dubinsky
    Aleksandr Dubinsky almost 7 years
    +1 The command to create large pages (sysctl -w vm.nr_hugepages=...) refuses to even work unless I first drop caches (Arch linux).
  • Motivated
    Motivated over 4 years
    @ananthan - A post on rsync suggests dropping caches - unix.stackexchange.com/a/510800
  • krad
    krad about 4 years
The world is messier now; how does all of this sit in the hyper-converged world, where you have ballooning VMs sitting on top of heavily cached file systems and block storage?
  • P.Péter
    P.Péter about 4 years
@Motivated And it makes some sense if you do not trust your memory fully (i.e. non-ECC RAM may have a flipped bit in the cached segments), not for speeding things up, but to minimize the chance that memory errors change your rsync results. On a server with ECC memory, the chances of that happening are so astronomically low that you should not bother.
  • Alex G
    Alex G almost 3 years
Try using Ubuntu and see how nice it is when your RAM cache is at 20 GB and an application hangs because there is not enough free RAM.
  • David Schwartz
    David Schwartz about 2 years
    @DanPritts That's incorrect. The earlier you start swapping the better because earlier on, the extra I/O has no effect on performance because you aren't I/O limited. By the time you actually need to swap, you are I/O limited. So it's a huge win to have written stuff out already so you can discard it from RAM without having to write it out when I/O is precious.
  • Dan Pritts
    Dan Pritts about 2 years
    Interesting point - having it written to swap but still "cached swap" is certainly reasonable. That isn't what I meant, though. An imbalance between application memory and disk cache is a bad thing, as you surely understand.