How do I tell what process is causing kswapd to be in use?

Solution 1

kswapd is the kernel thread that reclaims memory pages, swapping them out when necessary, in response to memory demand greater than what is physically available for all processes.

It is process agnostic: it is only interested in which pages are accessed and when (it is more complex than this, of course, but to keep things simple we may as well view it this way).

So the real question is "which processes place the greatest burden on memory and cause kswapd to need to page all the time?"

That is most easily answered using 'top' and switching to memory usage sort mode (press Shift+M while top is running).
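
As a scriptable counterpart to sorting in top, the sketch below ranks processes by resident memory. It assumes input lines of the form "pid rss command", such as headerless ps output; `top_by_rss` is a hypothetical helper name, not part of any existing tool.

```shell
# Print the N largest memory consumers from "pid rss_kib command" lines on stdin.
# top_by_rss is a hypothetical helper name used only for illustration.
top_by_rss() {
    limit="${1:-10}"
    # Sort numerically on column 2 (resident set size), largest first,
    # then keep only the first $limit lines.
    sort -k2,2nr | head -n "$limit"
}
```

For example, `ps -e -o pid= -o rss= -o comm= | top_by_rss 5` should print the five processes with the largest resident set size, which ps reports in KiB.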

Solution 2

If you're on Ubuntu 15.10 or greater, this may actually be the result of a bug, especially if your system is a virtual machine lacking a swap partition (e.g., AWS EC2). The problem exists on other distributions, but, as of writing, it's unclear if the same fix works universally.

A temporary workaround:

sudo ln -s /dev/null /etc/udev/rules.d/40-vm-hotadd.rules
sudo reboot

Note that this will disable hotadding RAM/CPUs for Xen and Hyper-V virtual machines.
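
Since the workaround is temporary, it may help to check later whether it is still in place. The following is a minimal sketch that assumes the symlink was created exactly as shown above; `rule_is_masked` is a hypothetical helper name.

```shell
# Report whether a udev rule file has been masked by symlinking it to /dev/null.
# rule_is_masked is a hypothetical helper name; the default path is the one
# used by the workaround above.
rule_is_masked() {
    rule="${1:-/etc/udev/rules.d/40-vm-hotadd.rules}"
    # True only if the path is a symlink whose target is /dev/null.
    [ -L "$rule" ] && [ "$(readlink "$rule")" = "/dev/null" ]
}

if rule_is_masked; then
    echo "hot-add rule is masked"
else
    echo "hot-add rule is not masked"
fi
```

Removing the symlink with `sudo rm /etc/udev/rules.d/40-vm-hotadd.rules` and rebooting should undo the workaround.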

Solution 3

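You can script it, but you can also do it via top.

Run top, then press O followed by p, then Enter. (On newer versions of top, where O opens a filter prompt instead, press f, move the cursor to the SWAP field, press s to sort by it, then q to return.)

Now all the processes are sorted by swap usage, and you can see which ones are using it.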
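
Scripting it could look like the sketch below, which reads the VmSwap field that Linux exposes in /proc/<pid>/status. `swap_by_process` is a hypothetical helper name, and its optional argument (an alternative proc root) is an assumption added only to make the function easy to test.

```shell
# List processes by swap usage, largest first.
# Output columns: swap in kB, PID, process name (tab separated).
# swap_by_process is a hypothetical helper name used only for illustration.
swap_by_process() {
    proc_root="${1:-/proc}"
    for status in "$proc_root"/[0-9]*/status; do
        # Skip entries that are unreadable or that vanished meanwhile.
        [ -r "$status" ] || continue
        awk '
            /^Name:/   { name = $2 }
            /^Pid:/    { pid = $2 }
            /^VmSwap:/ { swap = $2 }
            # Kernel threads have no VmSwap line, so swap stays 0 for them.
            END        { if (swap > 0) printf "%d\t%d\t%s\n", swap, pid, name }
        ' "$status" 2>/dev/null
    done | sort -k1,1nr
}
```

Running `swap_by_process | head` shows the heaviest swap users. Only the first word of each process name is printed, and reading other users' processes may require root.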

Solution 4

There also seems to be a bug in kswapd somewhere, hopefully only on older kernels.

Nearly every day now, kswapd randomly goes berserk on some machines in a larger cluster (running a non-current kernel, though): 100% CPU on both kswapd processes, no other running processes (except an ssh shell), plenty of free RAM (more than 700 MB), and no swap used at all. No swap-in and no swap-out either.

Nothing yet explains why one particular machine is hit and another is not. It does not seem to be completely random, because it usually hits more than one machine within a short time span. Machines that are idle, as well as machines that are under high pressure, appear to be less(!) likely to be hit. So it seems to be related to the workload, striking only when a machine is neither idle nor very busy.

If the problem strikes, nothing helps anymore. Killing all processes (those that did not become unkillable), unmounting all filesystems: nothing. kswapd still stays at 100% CPU. I suspect some spinlock race in SMP kernels, but it's also likely that I am wrong.
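
One way to confirm that no swap-in or swap-out is happening while kswapd is busy is to sample the cumulative pswpin and pswpout counters in /proc/vmstat twice. This is a minimal sketch, not the method used on the cluster described here; `swap_counters` and `swap_activity` are hypothetical helper names.

```shell
# Print "pswpin pswpout" (cumulative pages swapped in and out since boot)
# from a vmstat-format file, /proc/vmstat by default.
swap_counters() {
    awk '
        BEGIN             { i = 0; o = 0 }
        $1 == "pswpin"    { i = $2 }
        $1 == "pswpout"   { o = $2 }
        END               { print i, o }
    ' "${1:-/proc/vmstat}"
}

# Report how many pages were swapped in and out over an interval (seconds).
swap_activity() {
    interval="${1:-5}"
    set -- $(swap_counters)
    in_before=$1
    out_before=$2
    sleep "$interval"
    set -- $(swap_counters)
    echo "swapped in: $(( $1 - in_before )) pages, swapped out: $(( $2 - out_before )) pages"
}
```

Calling `swap_activity 10` while kswapd is at 100% CPU and seeing zero for both values would match the situation described above.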

Perhaps see my answer serverfault.com/questions/316995/#493257

Notes:

  • Rebooting affected machines often fails because the shutdown process starts hanging somewhere.
  • There is no direct connection to the Internet. Foreign causes are unlikely.
  • It seems to depend on the type of workload the machines process, in terms of load, because we have machines which have never been affected (yet).
  • Sorry, I cannot be more specific on what we do and why.
  • Yes, I am speculating. Because it's an extremely puzzling effect, today.

Author: Deshawn

Updated on September 18, 2022

Comments

  • Deshawn
    Deshawn over 1 year

    I see kswapd using 100% CPU... how can I tell on which process's behalf kswapd is being used so much?

  • Deshawn
    Deshawn over 12 years
    Thanks! Does kswapd kick in ONLY when the actual pages touched exceed physical memory, or does it kick in even when a process has allocated the memory or mapped the SHM region but not used it? That is, does it act only when the problem happens, or does it do bookkeeping and swap things in and out even though there is physical memory available, just because some process has been idle, etc.?
  • Paul
    Paul over 12 years
    As I understand it, kswapd will under normal circumstances remove any pages from main memory that don't need to be there, because any page that is freed is one that can be used for caching or other processes. I.e., it is better to have an old unused page already on disk than to incur the slowness cost of moving it in response to a request for memory from another process.
  • Zaz
    Zaz almost 9 years
    Even if a machine needs to use a lot of swap space, it shouldn't take 100% CPU to do it. Something is odd.
  • Paul
    Paul almost 9 years
    @Zaz It isn't so much that it is using CPU processing power to do swapping, it is that the CPU is 100% used due to IOWAIT. Each time memory needs to be swapped in from disk, the CPU has to sit there and wait for it - IOWAIT, and isn't doing anything else (on average).
  • Zaz
    Zaz almost 9 years
    @Paul: Are you sure? top is telling me that no time is being spent in IO wait, and almost 100% time is being spent in system. More info: kswapd often uses 100% CPU when swap is in use
  • Paul
    Paul almost 9 years
    @Zaz I am sure that kswapd when using swap leads to high apparent CPU through IOWAIT. It looks like you are having a different problem however.
  • Tino
    Tino about 8 years
    This is historic. RedHat confirmed: It was an issue of kernel 2.6.18-194.el5 in combination with NFS client. It was fixed in 2012 already. See the linked answer in my text for a little more information. If you hit this today, it is likely some other cause.
  • trueCamelType
    trueCamelType almost 8 years
    This is still a problem in some places. I've seen tons of these pop up. here, and here are some examples.
  • jeteon
    jeteon over 7 years
    Had this come out of nowhere on my system on Kubuntu 16.10 with the workaround already enabled a while ago.
  • Zenexer
    Zenexer over 7 years
    @jeteon There are multiple issues that can cause this behavior; this just happens to be a particularly common one.
  • jeteon
    jeteon over 7 years
    Yeah. I've found that echo 3 > /proc/sys/vm/drop_caches alleviates it once it starts happening. I pre-emptively have the command on a cron job now and it seems to help, or at least limit the duration of the OOM massacre when I'm away from the computer.
  • Shadow
    Shadow over 5 years
    O brings up filter options for me, pressing p then enter gives me "'include' filter delimiter is missing"