Understanding OOM killer logs
The OOM killer decided to kill a process other than the one it selected. The message says so itself:
Kill process 20911 .... or sacrifice child
Rather than killing process 20911, it chose to sacrifice one of its children and killed the child with PID 20977, a shell (sh) that had been spawned by that process.
If you want Linux to always kill the task that caused the out-of-memory condition, set the sysctl vm.oom_kill_allocating_task to 1.
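For example, assuming root privileges (the drop-in filename below is only an illustration):

# Kill the allocating task directly instead of scanning the task list (takes effect immediately)
sysctl -w vm.oom_kill_allocating_task=1

# Persist the setting across reboots, e.g. in a sysctl.d drop-in
echo "vm.oom_kill_allocating_task = 1" > /etc/sysctl.d/90-oom.conf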
From the kernel documentation:
This enables or disables killing the OOM-triggering task in out-of-memory situations.
If this is set to zero, the OOM killer will scan through the entire tasklist and select a task based on heuristics to kill. This normally selects a rogue memory-hogging task that frees up a large amount of memory when killed.
If this is set to non-zero, the OOM killer simply kills the task that triggered the out-of-memory condition. This avoids the expensive tasklist scan.
If panic_on_oom is selected, it takes precedence over whatever value is used in oom_kill_allocating_task.
The default value is 0.
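The heuristic mentioned above works from a per-process badness score that you can inspect, or bias, through /proc. A small sketch, using beam.smp's PID 20879 from the log quoted in the comment below (substitute whatever PID matters on your system):

# Current badness score; a higher value means this task is a more likely OOM victim
cat /proc/20879/oom_score

# Adjust the score: -1000 exempts the task from OOM killing, +1000 makes it the preferred victim
echo -500 > /proc/20879/oom_score_adj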
Comments
sergeyz, almost 2 years ago
I run some processes inside a Docker container and set a memory limit on that container. Sometimes processes inside the container get killed by the OOM killer, and I see this in the syslog file:
beam.smp invoked oom-killer: gfp_mask=0xd0, order=0, oom_score_adj=0
beam.smp cpuset=/ mems_allowed=0
CPU: 0 PID: 20908 Comm: beam.smp Not tainted 3.13.0-36-generic #63~precise1-Ubuntu
Hardware name: Xen HVM domU, BIOS 4.2.amazon 05/23/2014
 ffff880192ca6c00 ffff880117ebfbe8 ffffffff817557fe 0000000000000007
 ffff8800ea1e9800 ffff880117ebfc38 ffffffff8174b5b9 ffff880100000000
 000000d08137dd08 ffff880117ebfc38 ffff88010c05e000 0000000000000000
Call Trace:
 [<ffffffff817557fe>] dump_stack+0x46/0x58
 [<ffffffff8174b5b9>] dump_header+0x7e/0xbd
 [<ffffffff8174b64f>] oom_kill_process.part.5+0x57/0x2d4
 [<ffffffff81075295>] ? has_ns_capability_noaudit+0x15/0x20
 [<ffffffff8115b709>] ? oom_badness.part.4+0xa9/0x140
 [<ffffffff8115ba27>] oom_kill_process+0x47/0x50
 [<ffffffff811bee4c>] mem_cgroup_out_of_memory+0x28c/0x2b0
 [<ffffffff811c122b>] mem_cgroup_oom_synchronize+0x23b/0x270
 [<ffffffff811c0ac0>] ? memcg_charge_kmem+0xf0/0xf0
 [<ffffffff8115be08>] pagefault_out_of_memory+0x18/0x90
 [<ffffffff81747e91>] mm_fault_error+0xb9/0xd3
 [<ffffffff81766267>] ? __do_page_fault+0x317/0x570
 [<ffffffff81766495>] __do_page_fault+0x545/0x570
 [<ffffffff8101361d>] ? __switch_to+0x16d/0x4d0
 [<ffffffff810a5d3d>] ? set_next_entity+0xad/0xd0
 [<ffffffff8175df1e>] ? __schedule+0x38e/0x700
 [<ffffffff817664da>] do_page_fault+0x1a/0x70
 [<ffffffff81762648>] page_fault+0x28/0x30
Task in /docker/a4d47fb7bbc8a2bbc172bd26085c4509364b1b7eec61439669e08e281b181a0b killed as a result of limit of /docker/a4d47fb7bbc8a2bbc172bd26085c4509364b1b7eec61439669e08e281b181a0b
memory: usage 229600kB, limit 262144kB, failcnt 5148
memory+swap: usage 524288kB, limit 524288kB, failcnt 19118
kmem: usage 0kB, limit 18014398509481983kB, failcnt 0
Memory cgroup stats for /docker/a4d47fb7bbc8a2bbc172bd26085c4509364b1b7eec61439669e08e281b181a0b: cache:0KB rss:229600KB rss_huge:8192KB mapped_file:0KB writeback:3336KB swap:294688KB inactive_anon:114980KB active_anon:114620KB inactive_file:0KB active_file:0KB unevictable:0KB
[ pid ] uid tgid total_vm rss nr_ptes swapents oom_score_adj name
[ 9537] 0 9537 8740 712 21 1041 0 my_init
[13097] 0 13097 48 3 3 16 0 runsvdir
[13098] 0 13098 42 4 3 19 0 runsv
[13100] 0 13100 42 4 3 38 0 runsv
[13101] 0 13101 42 4 3 17 0 runsv
[13102] 0 13102 42 4 3 4 0 runsv
[13103] 0 13103 42 4 3 39 0 runsv
[13104] 0 13104 4779 243 15 60 0 cron
[13105] 0 13105 8591 601 22 1129 0 ruby
[13107] 0 13107 20478 756 43 560 0 syslog-ng
[13108] 0 13108 11991 642 28 1422 0 ruby
[20826] 0 20826 4467 249 14 63 0 run
[20827] 0 20827 1101 144 8 29 0 huobi
[20878] 0 20878 3708 172 13 48 0 run_erl
[20879] 0 20879 249481 57945 321 72955 0 beam.smp
[20969] 0 20969 1846 83 9 27 0 inet_gethost
[20970] 0 20970 3431 173 12 33 0 inet_gethost
[20977] 0 20977 1101 127 8 25 0 sh
[20978] 0 20978 1074 125 8 23 0 memsup
[20979] 0 20979 1074 68 7 23 0 cpu_sup
[ 5446] 0 5446 8462 217 22 81 0 cron
[ 5451] 0 5451 1101 127 8 26 0 sh
[ 5453] 0 5453 1078 68 8 22 0 sleep
[10898] 0 10898 8462 217 22 81 0 cron
[10899] 0 10899 8462 216 22 80 0 cron
[10900] 0 10900 1101 127 7 26 0 sh
[10901] 0 10901 1101 127 8 25 0 sh
[10902] 0 10902 1078 68 7 22 0 sleep
[10903] 0 10903 1078 68 8 22 0 sleep
Memory cgroup out of memory: Kill process 20911 (beam.smp) score 1001 or sacrifice child
Killed process 20977 (sh) total-vm:4404kB, anon-rss:0kB, file-rss:508kB
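(For reference, the memory: limit of 262144kB and memory+swap: limit of 524288kB shown above correspond to container options roughly like the following; the image name is only a placeholder:)

docker run -m 256m --memory-swap 512m my-erlang-image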
I know that the beam.smp process consumes memory very aggressively, so the very first line of the log, beam.smp invoked oom-killer, makes sense. But I'm confused by the last two lines. They say Kill process 20911 (beam.smp), yet no process with PID 20911 exists inside this cgroup (the process list is dumped to the log as well). The last line then says Killed process 20977 (sh), and that PID is present in the cgroup. So the kernel was about to kill beam.smp but ended up killing sh. What does this mean?