Why does my server freeze every day at the same time?

Solution 1

A solution was reported here: http://www.hskupin.info/2010/06/17/how-to-fix-the-oom-killer-crashe-under-linux/

So what happened? The reason can be explained briefly: by default, the Linux kernel grants memory whenever an application asks for it, without really checking whether enough memory is available. Because of that, applications can allocate more memory than actually exists. At some point this can lead to an out-of-memory situation. As a result, the OOM killer is invoked and kills a process:

Jun 11 11:35:21 vsrv03 kernel: [378878.356858] php-cgi invoked oom-killer: gfp_mask=0x1280d2, order=0, oomkilladj=0
Jun 11 11:36:11 vsrv03 kernel: [378878.356880] Pid: 8490, comm: php-cgi Not tainted 2.6.26-2-xen-amd64 #1

The downside of this action is that all other running processes are also affected. As a result, the entire VM stopped working and needed a restart.

To fix this problem, the kernel's behavior has to be changed so that it no longer overcommits memory for application requests. I added the following values to /etc/sysctl.conf so that they are applied automatically at start-up:

vm.overcommit_memory = 2
vm.overcommit_ratio = 80

(Reboot, or run sysctl -p as root, to apply the changes.)
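
To see what these two values mean in practice, here is a rough sketch of the limit the kernel enforces in strict mode (vm.overcommit_memory = 2): swap plus overcommit_ratio percent of RAM, ignoring huge pages. The RAM and swap figures are taken from the OOM report in the question; the 4 kB page size and the helper name commit_limit_kb are assumptions made for illustration, and the real limit will be slightly lower because the kernel excludes reserved pages.

```python
# Rough estimate of the commit limit under vm.overcommit_memory = 2.
# Formula (ignoring huge pages):
#   CommitLimit = swap + RAM * overcommit_ratio / 100

PAGE_KB = 4  # assumption: 4 kB pages, as is usual on x86-64


def commit_limit_kb(ram_kb: int, swap_kb: int, overcommit_ratio: int) -> int:
    """Return the approximate commit limit in kB (hypothetical helper)."""
    return swap_kb + ram_kb * overcommit_ratio // 100


# Figures from the question's OOM report:
# "515824 pages RAM" and "Total swap = 0kB".
ram_kb = 515824 * PAGE_KB
limit_kb = commit_limit_kb(ram_kb, swap_kb=0, overcommit_ratio=80)

print(f"RAM:          {ram_kb} kB")
print(f"Commit limit: {limit_kb} kB (~{limit_kb / 1024:.0f} MB)")
```

On a running system, the actual values can be compared against the CommitLimit and Committed_AS fields in /proc/meminfo. With no swap and a ratio of 80, roughly a fifth of RAM can never be promised to applications, so an allocation beyond the limit fails in the requesting program instead of waking the OOM killer later.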

More about overcommit: http://www.win.tue.nl/~aeb/linux/lk/lk-9.html#ss9.6

Solution 2

What happens is simple; the why is not (more information is needed).

php5-cgi begins to use a LOT of memory at that time (possibly a memory leak or a side effect of something else), so much that the system runs out of memory. So the kernel kills it (oom-killer is the kernel's out-of-memory killer) to maintain system stability.

This looks like a VPS -- is it? What kind? OOM errors are usually rare on physical machines with sufficient (1 GB+) RAM and swap space (2x RAM at least).
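
To get a first idea of where the memory went, the process table at the end of the OOM report in the question can be tallied by process name. The following is a minimal sketch, assuming 4 kB pages and the column layout shown in that report; only a few rows are copied in as sample data, and the function name rss_by_name is made up for illustration.

```python
import re
from collections import defaultdict

PAGE_KB = 4  # assumption: 4 kB pages

# A few rows copied from the process table in the question's OOM report.
SAMPLE = """\
kernel: [ pid ]   uid  tgid total_vm      rss cpu oom_adj oom_score_adj name
kernel: [ 2708]   104  2708    63627    11103   0       0             0 mysqld
kernel: [ 3922]   106  3922    40087    38746   0       0             0 drwebd.real
kernel: [24625]     0 24625    28299    11913   0       0             0 spamd
kernel: [18991]   106 18991    40087    38745   0       0             0 drwebd.real
kernel: [19165]    33 19165    68995     3216   0       0             0 apache2
"""

# Columns: [pid] uid tgid total_vm rss cpu oom_adj oom_score_adj name
ROW = re.compile(
    r"\[\s*(\d+)\]\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)"
    r"\s+(-?\d+)\s+(-?\d+)\s+(\S+)"
)


def rss_by_name(report: str) -> dict:
    """Sum the rss column (in pages) per process name."""
    totals = defaultdict(int)
    for line in report.splitlines():
        m = ROW.search(line)
        if m:  # the header row has no numeric pid, so it is skipped
            totals[m.group(9)] += int(m.group(5))
    return dict(totals)


totals = rss_by_name(SAMPLE)
for name, pages in sorted(totals.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:<14} {pages:>8} pages  {pages * PAGE_KB / 1024:8.1f} MB")
```

The sums are only a rough guide: rss counts shared pages once per process, so several instances of the same daemon can look larger together than they really are. Feeding the full report into rss_by_name instead of SAMPLE gives the complete picture for the processes the kernel listed.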

Author by

Zbyněk Nedoma

Updated on September 18, 2022

Comments

  • Zbyněk Nedoma
    Zbyněk Nedoma over 1 year

    My server freezes every day at 00:50. I do not know what could be causing it. In the log files I found only this suspicious entry, but I do not know what it means. Could someone help me?

    kernel: php5-cgi invoked oom-killer: gfp_mask=0x84d0, order=0, oom_adj=0, oom_score_adj=0
    kernel: php5-cgi cpuset=/ mems_allowed=0
    kernel: Pid: 20316, comm: php5-cgi Not tainted 2.6.38.2-xxxx-std-ipv6-64 #2
    kernel: Call Trace:
    kernel: [<ffffffff810de9e8>] ? dump_header+0x88/0x1d0
    kernel: [<ffffffff810aa6e3>] ? ktime_get_ts+0xb3/0xe0
    kernel: [<ffffffff810de931>] ? oom_unkillable_task+0x91/0xc0
    kernel: [<ffffffff8150c7e5>] ? ___ratelimit+0xa5/0x120
    kernel: [<ffffffff810def0c>] ? oom_kill_process+0x8c/0x2e0
    kernel: [<ffffffff810dedd3>] ? select_bad_process+0x93/0x140
    kernel: [<ffffffff810df398>] ? out_of_memory+0x238/0x3e0
    kernel: [<ffffffff810e45bd>] ? __alloc_pages_nodemask+0x86d/0x8a0
    kernel: [<ffffffff8110ff4a>] ? alloc_pages_current+0xaa/0x120
    kernel: [<ffffffff81069d46>] ? pte_alloc_one+0x16/0x40
    kernel: [<ffffffff810fa159>] ? __pte_alloc+0x29/0xd0
    kernel: [<ffffffff810fa363>] ? handle_mm_fault+0x163/0x200
    kernel: [<ffffffff81066077>] ? do_page_fault+0x197/0x410
    kernel: [<ffffffff81100556>] ? do_brk+0x286/0x390
    kernel: [<ffffffff81a7419f>] ? page_fault+0x1f/0x30
    kernel: Mem-Info:
    kernel: Node 0 DMA per-cpu:
    kernel: CPU    0: hi:    0, btch:   1 usd:   0
    kernel: Node 0 DMA32 per-cpu:
    kernel: CPU    0: hi:  186, btch:  31 usd: 156
    kernel: active_anon:468626 inactive_anon:383 isolated_anon:0
    kernel: active_file:66 inactive_file:101 isolated_file:64
    kernel: unevictable:0 dirty:0 writeback:0 unstable:0
    kernel: free:3426 slab_reclaimable:1691 slab_unreclaimable:13557
    kernel: mapped:380 shmem:404 pagetables:10150 bounce:0
    kernel: Node 0 DMA free:7932kB min:44kB low:52kB high:64kB active_anon:7056kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15684kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:388kB kernel_stack:16kB pagetables:488kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no 
    kernel: lowmem_reserve[]: 0 1967 1967 1967
    kernel: Node 0 DMA32 free:5772kB min:5648kB low:7060kB high:8472kB active_anon:1867448kB inactive_anon:1532kB active_file:264kB inactive_file:404kB unevictable:0kB isolated(anon):0kB isolated(file):256kB present:2014316kB mlocked:0kB dirty:0kB writeback:0kB mapped:1520kB shmem:1616kB slab_reclaimable:6764kB slab_unreclaimable:53840kB kernel_stack:1912kB pagetables:40112kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:1423 all_unreclaimable? no
    kernel: lowmem_reserve[]: 0 0 0 0
    kernel: Node 0 DMA: 45*4kB 56*8kB 29*16kB 8*32kB 7*64kB 4*128kB 2*256kB 0*512kB 1*1024kB 0*2048kB 1*4096kB = 7940kB
    kernel: Node 0 DMA32: 149*4kB 1*8kB 1*16kB 1*32kB 0*64kB 0*128kB 0*256kB 0*512kB 1*1024kB 0*2048kB 1*4096kB = 5772kB
    kernel: 646 total pagecache pages
    kernel: 0 pages in swap cache
    kernel: Swap cache stats: add 0, delete 0, find 0/0
    kernel: Free swap  = 0kB
    kernel: Total swap = 0kB
    kernel: 515824 pages RAM
    kernel: 12393 pages reserved
    kernel: 310091 pages shared
    kernel: 436769 pages non-shared
    kernel: [ pid ]   uid  tgid total_vm      rss cpu oom_adj oom_score_adj name
    kernel: [ 1562]     0  1562     4238       40   0       0             0 upstart-udev-br
    kernel: [ 1564]     0  1564     4230       74   0     -17         -1000 udevd
    kernel: [ 1656]     0  1656     4229       75   0     -17         -1000 udevd
    kernel: [ 1661]     0  1661     4229       73   0     -17         -1000 udevd
    kernel: [ 2620]     0  2620    12328      141   0     -17         -1000 sshd
    kernel: [ 2631]   101  2631    13713      221   0       0             0 rsyslogd
    kernel: [ 2648]     0  2648     1532       30   0       0             0 getty
    kernel: [ 2652]     0  2652     1532       29   0       0             0 getty
    kernel: [ 2655]     0  2655     1532       29   0       0             0 getty
    kernel: [ 2656]     0  2656     1532       29   0       0             0 getty
    kernel: [ 2659]     0  2659     1532       30   0       0             0 getty
    kernel: [ 2666]     0  2666     5281       65   0       0             0 cron
    kernel: [ 2701]   102  2701    32660     5339   0       0             0 named
    kernel: [ 2708]   104  2708    63627    11103   0       0             0 mysqld
    kernel: [ 2723]     0  2723     3482       38   0       0             0 couriertcpd
    kernel: [ 2725]     0  2725      980       17   0       0             0 courierlogger
    kernel: [ 2735]     0  2735     3482       38   0       0             0 couriertcpd
    kernel: [ 2737]     0  2737      980       17   0       0             0 courierlogger
    kernel: [ 2744]     0  2744     3482       41   0       0             0 couriertcpd
    kernel: [ 2747]     0  2747     1013       26   0       0             0 courierlogger
    kernel: [ 2754]     0  2754     3482       38   0       0             0 couriertcpd
    kernel: [ 2757]     0  2757      980       18   0       0             0 courierlogger
    kernel: [ 3313] 65534  3313    15729       79   0       0             0 memcached
    kernel: [ 3383]  1002  3383    12937     1590   0       0             0 sw-cp-serverd
    kernel: [ 3393]     0  3393     4894       58   0       0             0 xinetd
    kernel: [ 3535]  2522  3535     1027       28   0       0             0 qmail-send
    kernel: [ 3536]  2022  3536     1015       26   0       0             0 splogger
    kernel: [ 3537]     0  3537     1025       33   0       0             0 qmail-lspawn
    kernel: [ 3538]  2521  3538     1025       17   0       0             0 qmail-rspawn
    kernel: [ 3539]  2520  3539     1014       22   0       0             0 qmail-clean
    kernel: [ 3621]     0  3621    68797     3424   0       0             0 apache2
    kernel: [ 3622]     0  3622    40565     1745   0       0             0 apache2
    kernel: [ 3922]   106  3922    40087    38746   0       0             0 drwebd.real
    kernel: [ 3985]     0  3985     3163       37   0       0             0 mdadm
    kernel: [ 4024]     0  4024     1532       30   0       0             0 getty
    kernel: [24625]     0 24625    28299    11913   0       0             0 spamd
    kernel: [24626]   110 24626    28299    11912   0       0             0 spamd
    kernel: [24628]   110 24628    28299    11912   0       0             0 spamd
    kernel: [12008]    33 12008    68960     3226   0       0             0 apache2
    kernel: [12016]    33 12016    68946     3232   0       0             0 apache2
    kernel: [12568]    33 12568    68952     3229   0       0             0 apache2
    kernel: [13362]    33 13362    68933     3220   0       0             0 apache2
    kernel: [16894]    33 16894    68946     3204   0       0             0 apache2
    kernel: [16895]    33 16895    68902     3189   0       0             0 apache2
    kernel: [18991]   106 18991    40087    38745   0       0             0 drwebd.real
    kernel: [18992]   106 18992    40087    38745   0       0             0 drwebd.real
    kernel: [18993]   106 18993    40087    38745   0       0             0 drwebd.real
    kernel: [18994]   106 18994    40087    38745   0       0             0 drwebd.real
    kernel: [19165]    33 19165    68995     3216   0       0             0 apache2
    kernel: [19178]    33 19178    68947     3225   0       0             0 apache2
    kernel: [19918]    33 19918    68961     3218   0       0             0 apache2
    

    If you need more information, write me.

    Domaneni

    • Mohammed Mounir
      Mohammed Mounir almost 12 years
      When the server freezes, can you still ping the machine and SSH into it remotely?
    • nanofarad
      nanofarad almost 12 years
      Can you set up a cronjob to dump ps -A|grep "php5" at 0:49:59? Also, does PHP do anything, to your knowledge, at that time? You could consider migrating to ServerFault.se if few answers arrive here.
    • Zbyněk Nedoma
      Zbyněk Nedoma almost 12 years
      @CeltaWeb: I can ping the server, but SSH does not work.
  • Zbyněk Nedoma
    Zbyněk Nedoma almost 12 years
    It is a dedicated server, a Kimsufi 2G: kimsufi.co.uk. It has 2 GB of RAM and an Intel Atom 1.20+ GHz processor.
  • ish
    ish almost 12 years
    @ZbyněkNedoma: thank you, I will look at your log in detail in a few hours and try to figure more out. What is php5-cgi doing at or just before 00:50? Something is triggering it...
  • Zbyněk Nedoma
    Zbyněk Nedoma almost 12 years
    In syslog I found this:
    drwebd.real: Loading /var/drweb/bases/drwnasty.vdb - Ok, virus records: 28348
    drwebd.real: Total virus records: 3473166
    drwebd.real: Key file: /opt/drweb/drweb32.key - Key file was not found! (No such file or directory)
    drwebd.real: A path to a valid license key file was not specified.
    drwebd.real: Daemon is enabled for protecting 0 e-mails:
    drwebd.real: Daemon is installed, active interfaces: /var/drweb/run/.daemon 127.0.0.1:3000
    CRON[20081]: (root) CMD (/usr/local/rtm/bin/rtm 8 > /dev/null 2> /dev/null)
    ..... Repeat this about 20 times
  • Zbyněk Nedoma
    Zbyněk Nedoma almost 12 years
    So will these two lines solve the problem?
  • macrobook
    macrobook almost 12 years
    I haven't tested it, I was hoping you would and let us know. :)
  • Zbyněk Nedoma
    Zbyněk Nedoma almost 12 years
    I tried it, and so far the server has not gone down. Thanks.