How to test oom-killer from command line

Solution 1

The key to triggering the OOM killer quickly is to avoid getting bogged down by disk accesses. So:

  1. Avoid swapping, unless your goal is specifically to test the behavior of OOM when swap is used. You can disable swap before the test, then re-enable it afterwards. swapon -s tells you what swaps are currently enabled. sudo swapoff -a disables all swaps; sudo swapon -a is usually sufficient to re-enable them. (A minimal wrapper for this is sketched just after this list.)

  2. Avoid interspersing memory accesses with non-swap disk accesses. That globbing-based method eventually uses up your available memory (given enough entries in your filesystem), but the reason it needs so much memory is to store information that it obtains by accessing your filesystem. Even with an SSD, it's likely that much of the time is spent reading from disk, even if swap is turned off. If your goal is specifically to test OOM behavior for memory accesses that are interspersed with disk accesses, that method is reasonable, perhaps even ideal. Otherwise, you can achieve your goal much faster.
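
For point 1, a minimal sketch using only the commands from the list above (tail /dev/zero is just a placeholder for whichever OOM test you choose to run):

swapon -s              # note what swap is currently enabled
sudo swapoff -a        # disable all swap for the duration of the test
tail /dev/zero         # run your OOM test of choice here
sudo swapon -a         # re-enable swap afterwards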

Once you've disabled swap, any method that seldom reads from a physical disk should be quite fast. This includes tail /dev/zero (found by falstaff, mentioned in a comment above by Doug Smythies). Although it reads from the character device /dev/zero, that "device" just generates null bytes (i.e., bytes of all zeros) and doesn't involve any physical disk access once the device node has been opened. That method works because tail looks for trailing lines in its input, but a stream of zeros contains no newline character, so tail never gets any complete lines to discard and must buffer everything it reads.
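
Both properties are easy to verify from the shell without using much memory:

head -c 8 /dev/zero | od -An -tx1   # prints eight null bytes: 00 00 00 00 00 00 00 00
head -c 1M /dev/zero | wc -l        # prints 0: no newlines in a whole MiB of input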

If you're looking for a one-liner in an interpreted language that allocates and populates the memory algorithmically, you're in luck. In just about any general-purpose interpreted language, it's easy to allocate lots of memory and write to it without otherwise using it. Here's a Perl one-liner that seems to be about as fast as tail /dev/zero (though I haven't benchmarked it extensively):

perl -wE 'my @xs; for (1..2**20) { push @xs, q{a} x 2**20 }; say scalar @xs;'

With swap turned off on an old machine with 4 GiB of RAM, both that and tail /dev/zero took about ten seconds each time I ran them. Both should still work fine on newer machines with much more RAM than that. You can make that perl command much shorter, if your goal is brevity.
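
For instance, this variant (a sketch of the idea, which drops the final count output) just keeps appending megabyte strings until something gives:

perl -e 'push @xs, q{a} x 2**20 while 1'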

That Perl one-liner generates up to 2**20 separate moderately long strings (q{a} x 2**20, about a million characters each) and keeps them all around by storing them in an array (@xs). You can adjust the numbers for testing. If you don't use all available memory, the one-liner outputs the total number of strings created. Assuming the OOM killer does kill perl--with the exact command shown above and no resource quotas to get in the way, I believe in practice it always will--then your shell should show you Killed. Then, as in any OOM situation, dmesg has the details.
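
To check from the shell, something like this works (the exact kernel messages vary by kernel version, so the grep pattern here is just a reasonable guess at useful keywords):

dmesg | grep -iE 'out of memory|oom-kill|killed process' | tail -n 5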

Although I like that method, a C program--like the one in Doug Smythies's answer--does illustrate something useful: allocating memory and accessing the memory don't feel like separate things in high-level interpreted languages, but in C you can notice and, if you choose, investigate those details separately.


Finally, you should always check that the OOM killer is actually what killed your program. One way to check is to inspect dmesg, as shown earlier. Contrary to popular belief, it is actually possible for an attempt to allocate memory to fail fast, even on Linux. It's easy to make that happen on purpose with huge allocations that will obviously fail, but it can also happen unexpectedly, and seemingly reasonable allocations may fail fast too. For example, on my test machine, perl -wE 'say length q{a} x 3_100_000_000;' succeeds, while perl -wE 'say length q{a} x 3_200_000_000;' prints:

Out of memory!
panic: fold_constants JMPENV_PUSH returned 2 at -e line 1.

Neither triggered the OOM killer. Speaking more generally:

  • If your program precomputes how much memory is needed and asks for it in a single allocation, the allocation may succeed (and if it does, the OOM killer may or may not kill the program when enough of the memory is used), or the allocation may simply fail.
  • Expanding an array to enormous length by adding many, many elements to it often triggers the OOM killer in actual practice, but making it do that reliably in testing is surprisingly tricky. The way this is almost always done--because it is the most efficient way to do it--is to make each new buffer with a capacity x times the capacity of the old buffer. Common values for x include 1.5 and 2 (and the technique is often called "table doubling"). This sometimes bridges the gap between how much memory can actually be allocated and used and how much the kernel knows is too much to even bother pretending to hand out. (A bash sketch of this growth pattern follows this list.)
  • Memory allocations can fail for reasons that have little to do with the kernel or how much memory is actually available, and that doesn't trigger the OOM killer either. In particular, a program may fail fast on an allocation of any size after successfully performing a very large number of tiny allocations. This failure happens in the bookkeeping that is carried out by the program itself--usually through a library facility like malloc(). I suspect this is what happened to me today when, during testing with bash arrays (which are actually implemented as doubly linked lists), bash quit with an error message saying an allocation of 9 bytes failed.
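
Here's a minimal bash sketch of that geometric growth pattern (my own illustration, not taken from any answer here; it doubles a string rather than a real dynamic array): each round doubles the memory held, so the final request that fails can leap far past what is actually available.

a=x
while true; do
    a=$a$a                # double the buffer, i.e. growth factor x = 2
    echo "length: ${#a}"  # progress, so you can watch the doubling
done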

The OOM killer is much easier to trigger accidentally than to trigger intentionally.

In attempting to deliberately trigger the OOM killer, one way around these problems is to start by requesting too much memory, and go gradually smaller, as Doug Smythies's C program does. Another way is to allocate a whole bunch of moderately sized chunks of memory, which is what the Perl one-liner shown above does: none of the millionish-character strings (plus a bit of additional memory usage behind the scenes) is particularly taxing, but taken together, all the one-megabyte purchases add up.

Solution 2

This answer uses a C program to allocate as much memory as possible, then gradually actually uses it, resulting in "Killed" from the OOM killer.

/*****************************************************************************
*
* bla.c 2019.11.11 Smythies
*       attempt to invoke OOM by asking for a ridiculous amount of memory
*       see: https://askubuntu.com/questions/1188024/how-to-test-oom-killer-from-command-line
*       still do it slowly, in chunks, so it can be monitored.
*       However simplify the original testm.c, for this example.
*
* testm.cpp 2013.01.06 Smythies
*           added a couple more sleeps, in attempts to observe stuff on linux.
*
* testm.cpp 2010.12.14 Smythies
*           attempt to compile on Ubuntu Linux.
*
* testm.cpp 2009:03:18 Smythies
*           This is not the first edit, but I am just adding the history
*           header.
*           How much memory can this one program ask for and successfully get?
*           Done in two calls, to more accurately simulate the program I
*           am wondering about.
*           This edit is a simple change to print the total.
*           the sleep calls have changed (again) for MS C version 2008.
*           Now they are more like they used to be (getting annoying).
*                                                                     Smythies
*****************************************************************************/

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>  /* needed for sleep() */

#define CR 13

int main(){
   char *fptr;
   long i, k;

   i = 50000000000L;

   do{
      if(( fptr = (char *)malloc(i)) == NULL){
         i = i - 1000;
      }
   }
   while (( fptr == NULL) && (i > 0));

   sleep(15);  /* for time to observe */
   for(k = 0; k < i; k++){   /* so that the memory really gets allocated and not just reserved */
      fptr[k] = (char) (k & 255);
   } /* endfor */
   sleep(60);  /* O.K. now you have 1 minute */
   free(fptr); /* clean up, if we get here */
   return(0);
}
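
To build and run it (assuming gcc, though any C compiler should do):

gcc -o bla bla.c
./bla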

The result:

doug@s15:~/c$ ./bla
Killed
doug@s15:~/c$ journalctl -xe | grep oom
Nov 11 16:08:24 s15 kernel: mysqld invoked oom-killer: gfp_mask=0x100cca(GFP_HIGHUSER_MOVABLE), order=0, oom_score_adj=0
Nov 11 16:08:25 s15 kernel:  oom_kill_process+0xeb/0x140
Nov 11 16:08:27 s15 kernel: [  pid  ]   uid  tgid total_vm      rss pgtables_bytes swapents oom_score_adj name
Nov 11 16:08:27 s15 kernel: oom-kill:constraint=CONSTRAINT_NONE,nodemask=(null),cpuset=/,mems_allowed=0,global_oom,task_memcg=/user/doug/0,task=bla,pid=24349,uid=1000
Nov 11 16:08:27 s15 kernel: Out of memory: Killed process 24349 (bla) total-vm:32638768kB, anon-rss:15430324kB, file-rss:952kB, shmem-rss:0kB, UID:1000 pgtables:61218816kB oom_score_adj:0
Nov 11 16:08:27 s15 kernel: oom_reaper: reaped process 24349 (bla), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB

It still takes a while to run, but only on the order of minutes.
Using mlock in the C program might help, but I didn't try it.

My test computer is a server, so I use watch -d free -m to monitor progress.

Readers: Messing with OOM is somewhat dangerous. If you read all these answers and comments, you will notice some collateral damage and inconsistencies. We cannot control when other tasks might ask for a bit more memory, which could well be at just the wrong time. Proceed with caution, and consider rebooting the computer after these types of tests.
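
One way to limit the collateral damage (my own suggestion, not from the answers here; it assumes systemd is available) is to run the test in a transient scope with a memory cap, so the kernel OOM-kills only within that cgroup:

# cap the test at 1 GiB with no swap; only processes inside this
# scope are candidates when the cap is hit
sudo systemd-run --scope -p MemoryMax=1G -p MemorySwapMax=0 ./bla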

Solution 3

In a terminal, type "python" (with Python 3, use range instead of xrange below, as noted in the comments).

Then copy and paste this code and press Enter:

var = []
for x in xrange(99999999999):  # Python 3: range
    var.append(str(x))

Then check the kernel log with cat /var/log/messages (on Ubuntu, which no longer has /var/log/messages, use journalctl -k or /var/log/kern.log instead, as noted in the comments). You'll find something like:
Nov 12 11:48:05 TestVM kernel: Out of memory: Kill process 1314 (python) score 769 or sacrifice child
Nov 12 11:48:05 TestVM kernel: Killed process 1314 (python) total-vm:1001264kB, anon-rss:802972kB, file-rss:60kB, shmem-rss:0kB
Nov 12 11:48:49 TestVM kernel: python[1337]: segfault at 24 ip 00007f2ad140c0da sp 00007ffee8c11820 error 6 in libpython2.7.so.1.0[7f2ad1382000+17e000]

Solution 4

Revised answer

My initial answer took 1/2 hour to execute and has been dropped in this revision:

ls -d /*/*/*/*/*/*/*/*/*/*/*/*/*/*/*/*/*/*/*/*/*/*/*/*/*

I'll accept someone else's answer as a faster way of invoking oom-killer from the command line. As a revised answer I'll explain how to get relevant oom-killer details from journalctl and what they mean.


This more efficient method, from mjoao, uses up RAM:

logger --tag="kernel" "Start for oom-killer"; a=""; for b in {0..99999999}; do a=$b$a$a$a$a$a$a; done

The logger command was prepended to give a timestamp in journalctl for when the RAM-eating process starts.

After oom-killer is finished, open a new terminal and type oomlog (script contents later on):

$ oomlog
Nov 12 12:29:23 alien kernel[19202]: Start for oom-killer
Nov 12 12:30:02 alien kernel: 31981 total pagecache pages
Nov 12 12:30:02 alien kernel: 11627 pages in swap cache
Nov 12 12:30:02 alien kernel: Swap cache stats: add 10739122, delete 10727632, find 8444277/9983565
Nov 12 12:30:02 alien kernel: Free swap  = 0kB
Nov 12 12:30:02 alien kernel: Total swap = 8252412kB
Nov 12 12:30:02 alien kernel: 2062044 pages RAM
Nov 12 12:30:02 alien kernel: 0 pages HighMem/MovableOnly
Nov 12 12:30:02 alien kernel: 56052 pages reserved
Nov 12 12:30:02 alien kernel: 0 pages cma reserved
Nov 12 12:30:02 alien kernel: 0 pages hwpoisoned
Nov 12 12:30:02 alien kernel: [ pid ]   uid  tgid total_vm      rss nr_ptes nr_pmds swapents oom_score_adj name
Nov 12 12:30:02 alien kernel: [ 4358]  1000  4358  2853387  1773446    5578      13  1074744             0 bash
Nov 12 12:30:02 alien kernel: Out of memory: Kill process 4358 (bash) score 701 or sacrifice child
Nov 12 12:30:02 alien kernel: Killed process 4358 (bash) total-vm:11413548kB, anon-rss:7093784kB, file-rss:0kB, shmem-rss:0kB
Nov 12 12:30:03 alien kernel: oom_reaper: reaped process 4358 (bash), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB

The better method takes about 30 seconds to use up RAM, which is neither too fast (like tail /dev/zero) nor too slow (like my original answer).

The oomlog script condenses many pages of journalctl output into 16 lines.

The oom-killer [ pid ] fields are explained here:

  • pid The process ID.
  • uid User ID.
  • tgid Thread group ID.
  • total_vm Virtual memory use (in 4 kB pages)
  • rss Resident memory use (in 4 kB pages)
  • nr_ptes Page table entries
  • nr_pmds Page middle directory (second-level page table) entries
  • swapents Swap entries
  • oom_score_adj Usually 0; a lower number indicates the process will be less likely to die when the OOM killer is invoked.

oomlog bash script

#!/bin/bash

# NAME: oomlog
# PATH: $HOME/askubuntu/
# DESC: For: https://askubuntu.com/questions/1188024/how-to-test-oom-killer-from-command-line
# DATE: November 12, 2019.
# PARM: Parameter 1 can be journalctl boot sequence, eg -b-2 for two boots ago.
#       Defaults to -b-0 (current boot).

BootNo="-b-0"
[[ $1 != "" ]] && BootNo="$1"

# Get time stamp if recorded with `logger` command:
journalctl "$BootNo" | grep 'Start for oom-killer' | tail -n1
# Print headings for last oom-killer
journalctl "$BootNo" | grep '\[ pid ]' -B10 | tail -n11
# Get last oom_reaper entry's PID
PID=$(journalctl "$BootNo" | grep oom_reaper | tail -n1 | cut -d' ' -f9)
# Print pid information
journalctl "$BootNo" | grep "$PID"']' | tail -n1
# Print summary information
journalctl "$BootNo" | grep oom_reaper -B2 | tail -n3

Solution 5

If you just want to trigger the oom-killer, just keep growing "$a" in a loop, like so:

bash -c "for b in {0..99999999}; do a=$b$a; done"

If you want to monitor it live, you just need a nested loop like:

for x in {1..200}; do echo "Round $x"; bash -c "for b in {0..99999999}; do a=$b$a; done"; done

There is no need to compile anything. Bash can do it on its own.

Expected Results:

kernel: Out of memory: Kill process 1439 (bash) score 777 or sacrifice child
kernel: Killed process 1439 (bash)

Note: Unfortunately I don't have score to post this as comment.


Comments

  • WinEunuuchs2Unix
    WinEunuuchs2Unix over 1 year

The OOM Killer or Out Of Memory Killer is a process that the Linux kernel employs when the system is critically low on memory. ... This maximises the use of system memory by ensuring that the memory that is allocated to processes is being actively used.

    This self-answered question asks:

    • How to test oom-killer from the command line?

    A quicker method than the 1/2 hour it takes in the self-answer would be accepted.

    • Doug Smythies
      Doug Smythies over 4 years
with credit to here, this looks as though it will work: yes | tr \\n x | grep n, but it is going to take a long time--though less time than your solution (I think; I don't actually have time to let it finish).
    • Doug Smythies
      Doug Smythies over 4 years
      Gee, this one is really fast, relative to anything else I have tried: tail /dev/zero. credit here.
    • WinEunuuchs2Unix
      WinEunuuchs2Unix over 4 years
      @DougSmythies Oh yes tail /dev/zero is the best answer. As a bonus it's a good way of cleaning up RAM down to 830 MB used and 6.84 GB free. Please post as a new answer (your C program still warrants a good separate answer for the things it does I think).
    • Doug Smythies
      Doug Smythies over 4 years
      The preferred way to clean up RAM is (as sudo): sync followed by echo 3 > /proc/sys/vm/drop_caches. I'll post a new answer shortly, even though it is based on knowledge I got elsewhere.
    • WinEunuuchs2Unix
      WinEunuuchs2Unix over 4 years
@DougSmythies That echo didn't work for someone before: askubuntu.com/questions/609226/…. BTW, after tail /dev/zero not only were firefox tabs crashing, but so were new instances of chrome, where all opened chrome windows & tabs were nuked.
    • WinEunuuchs2Unix
      WinEunuuchs2Unix over 4 years
@DougSmythies Sorry, I have to withdraw my compliments on tail /dev/zero; it causes all kinds of problems with web browsers, and even shutdown failed afterwards.
  • Doug Smythies
    Doug Smythies over 4 years
    I think that 1/2 hour number you mentioned is specific to your computer. It seems it is going to take about 9 hours on mine.
  • WinEunuuchs2Unix
    WinEunuuchs2Unix over 4 years
    @DougSmythies It may not break if you have too much RAM (I have 8GB). It may not break if you don't pass a subdirectory level at the bottom level (25 levels deep in my case). To "quickly" find your bottom subdirectory level use: How to quickly find the deepest subdirectory.
  • WinEunuuchs2Unix
    WinEunuuchs2Unix over 4 years
+1 but I was hoping for bash/shell interpreter commands people could run without compiling a binary first. I did find another .c program reference before asking my question here: stackoverflow.com/questions/1911741/c-use-up-almost-all-memory and this one-liner: for (object[] o = null;; o = new[] { o }); amongst other answers that may interest you. I was thinking tonight another option would be dd if=/dev/zero and have the output file in RAM of=??? answer probably here: unix.stackexchange.com/questions/188536/…
  • WinEunuuchs2Unix
    WinEunuuchs2Unix over 4 years
    Your perl command perl -wE 'my @xs; for (1..2**20) { push @xs, q{a} x 2**20 }; say scalar @xs;' did generate oom-killer on my machine: Nov 11 23:03:19 alien kernel: Out of memory: Kill process 22635 (perl) score 554 or sacrifice child. Does that push command refer to a stack which is relieved with a pop command? If so you don't see that often anymore these days.
  • Eliah Kagan
    Eliah Kagan over 4 years
    @WinEunuuchs2Unix push @xs, q{a} x 2**20 operates on the array @xs. It adds the scalar q{a} x 2**20 (a string of ~1 million as) to the end. pop @xs would remove and return the last element. So when the preceding operation was a push, pop will undo it; Perl arrays can be used as stacks. But they support more than push and pop; in that way, they're not stacks. Perl arrays are similar to Python lists: xs.append(x) is like push @xs, $x and xs.pop() is like pop @xs.
  • Martin Bonner supports Monica
    Martin Bonner supports Monica over 4 years
In the loop to initialize the memory so it is actually allocated, can't you speed things up with for (k=0; k < i; k+= 4096) {fptr[k] = 1;} ? The idea is to only write to one byte/cache-line in each page. (Or even better, use the actual page size by calling sysconf(_SC_PAGESIZE))
  • Doug Smythies
    Doug Smythies over 4 years
    @MartinBonnersupportsMonica : Great input, thanks. Actually, and for my own original use of testm.c, I wanted it to run slow enough that I could observe progress via watch -d free -m.
  • mjoao
    mjoao over 4 years
    Never use "tail /dev/zero". You'll end up with a tainted kernel and a frozen system!! Example: for x in {1..100}; do echo "Round $x"; tail /dev/zero; done
  • WinEunuuchs2Unix
    WinEunuuchs2Unix over 4 years
    I agree I had all kinds of problems after using tail /dev/zero. Even a shutdown failed.
  • Doug Smythies
    Doug Smythies over 4 years
There hasn't been a /var/log/messages file in Ubuntu for many years now. Suggest one of the other methods referenced in other answers, or /var/log/kern.log. But yes, it does work fine.
  • Doug Smythies
    Doug Smythies over 4 years
@mjoao : I do not end up with a tainted kernel using tail /dev/zero. For your solution, I do not get an oom exit, but rather it nuked my ssh session.
  • mjoao
    mjoao over 4 years
    @DougSmythies: If you don't want your ssh session nuked, wrap/launch it inside a bash -c, like: bash -c "a=""; for b in {0..99999999}; do a=$b$a$a; done" Regarding the tainted kernel, although I've seen it once, it's not easy to reproduce and I don't have enough time or will to do a kernel dump analysis.
  • WinEunuuchs2Unix
    WinEunuuchs2Unix over 4 years
FYI, the nested loop revision didn't work for me. The inner loop is a separate process that gets killed, and the outer loop just starts a new process all over again.
  • zvi
    zvi over 3 years
    on python3 it's range instead of xrange
  • mchid
    mchid over 2 years
    This doesn't do anything.