High System Load, Low CPU/RAM Utilization on Ubuntu 15.04

This turned out to be what I believe is a kernel bug. After updating to 4.0.0-040000-generic #201504121935, my CPU I/O wait has been normal and the system load stays under 0.10 in most cases, unless something is happening on the hosted servers.

I followed this guide: http://ubuntuhandbook.org/index.php/2015/04/upgrade-to-linux-kernel-4-0-in-ubuntu/

To comply with the site rules, here is what I ran as root before rebooting the machine:

wget http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.0-vivid/linux-headers-4.0.0-040000_4.0.0-040000.201504121935_all.deb
wget http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.0-vivid/linux-image-4.0.0-040000-generic_4.0.0-040000.201504121935_amd64.deb
wget http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.0-vivid/linux-headers-4.0.0-040000-generic_4.0.0-040000.201504121935_amd64.deb
dpkg -i linux-headers-4.0.0*.deb linux-image-4.0.0*.deb
update-grub
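
After the reboot it is worth confirming that the machine actually booted the new kernel rather than the old 3.19 one. A minimal sketch, assuming `uname -r` reports the release as 4.0.0-040000-generic; the helper name `check_kernel` is made up for illustration:

```shell
# Hypothetical helper: compare the running kernel release with the expected one.
check_kernel() {
    running="$1"
    expected="$2"
    if [ "$running" = "$expected" ]; then
        echo "kernel OK: $running"
    else
        echo "kernel mismatch: running $running, expected $expected"
    fi
}

# Compare what is running now against the release installed above.
check_kernel "$(uname -r)" "4.0.0-040000-generic"
```

If this reports a mismatch, the old kernel is probably still the default GRUB entry.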

As for how I got here: after reading through countless forums, newsgroups, and mailing lists and getting nowhere (I tried changing BIOS settings and boot options, setting commit=60, disabling services, changing the physical server location, etc.), I decided to either downgrade or update the kernel. Since 15.04 is new, I updated. I'm still unsure of the root cause, as I haven't seen any other reports of this issue. My assumption is that a faulty driver or kernel file was copied over when I rsync'd from my old 14.10 system. Why 4.0.0 fixes this is beyond me, but at least there is no more kworker writing to kern.log and my hard drives every 5 seconds.

Author by

eric

Updated on September 18, 2022

Comments

  • eric
    eric over 1 year

    Not really a system administrator here, just trying to set up a server (a rented VDS, really) for some friends.

    I recently transferred game servers, MySQL, and web sites from one VPS to another. While there haven't been any issues on the new one, I keep seeing my system load spike and take up both processors; the previous server's load averaged about 0.3-0.5. The previous server was on Ubuntu 14. I exported the list of packages I had installed there and apt-get installed them on the new server; I also rsync'd most of the files over from the old server (I'm thinking I copied over something bad that's messing with my kernel...)

    Anyway, here are the results of uname -a:

     Linux ophq 3.19.0-18-generic #18-Ubuntu SMP Tue May 19 18:31:35 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux
    

    And the results of landscape-sysinfo (the login screen):

      Welcome to Ubuntu 15.04 (GNU/Linux 3.19.0-18-generic x86_64)
      System load:  2.13                Processes:           11
      Usage of /:   22.6% of 196.64GB   Users logged in:     1
      Memory usage: 32%                 IP address for eth0: 123.123.123.123
      Swap usage:   0%
    

    (currently one game server is in use hence the memory usage - I have to reduce how much RAM is allocated to Minecraft from the default values)

    Result of top: http://ericbarber.me/serverproblem/top.png

    To add to this: if I hit F in top, then hit S on 'Process Status' to re-sort the list, I have 2 commands listed in state 'D': kworker/u30:0 and kworker/u30:1, which leads me to my kernel assumption...
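
For context, Linux counts tasks in uninterruptible sleep (state D) toward the load average, so two kworker threads stuck in D could account for a load of about 2 even with mostly idle CPUs. A minimal sketch for listing such tasks without going through top's sort menu, assuming the procps version of `ps`; the helper name `d_state_tasks` is made up for illustration:

```shell
# Hypothetical helper: keep only lines whose first field (process state) starts with D.
# Expects "STATE PID COMMAND" lines on stdin.
d_state_tasks() {
    awk '$1 ~ /^D/ { print }'
}

# List every process with its state, PID, and command name (no header row),
# then keep the ones in uninterruptible sleep.
ps -e -o state= -o pid= -o comm= | d_state_tasks
```

Running this a few times in a row shows whether the same kworker threads stay in state D or only pass through it briefly.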

    I'm totally stumped on why load average is so high - I had my users test on both MC and the CS:GO servers and they aren't experiencing any lag - I also tested the web servers and they're delivering pages extremely fast (in comparison to the old server.)

    I thought it may be an interrupt issue, so here's the results of cat /proc/interrupts:

    http://ericbarber.me/serverproblem/interrupts2.png

    Along with this, another question suggested running grep . -r /sys/firmware/acpi/interrupts/ and disabling any interrupts with values above 0. Unfortunately, all my values are 0.

    same url as above serverproblem/interrupts.png

    I installed perf and did a quick 30 second report - but I don't understand this output too much:

    same url as above serverproblem/perf.png

    I'll omit the CPU info, but it is an Intel Xeon CPU E5-2690, 2 cores, 2 GB RAM, and I believe about a 500 GB hard drive. My apologies if this is a dumb question or has been asked before. I've been working on this for a few hours now and I'm running into dead ends with Google, short of starting over from scratch, which I would prefer to avoid.

    Apologies for the links; new user limitations.

    Edit: To add, the results of mpstat:

    Linux 3.19.0-18-generic (ophq)  06/05/2015  _x86_64_    (2 CPU)
    
    02:10:35 PM  CPU    %usr   %nice    %sys %iowait    %irq   %soft  %steal       %guest  %gnice   %idle
    02:10:35 PM  all    7.28    0.00    1.72   47.13    0.00    0.09    0.53        0.00    0.00   43.24
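
Note that mpstat run without an interval reports averages since boot, so the 47.13% iowait above is a long-run figure, not a momentary spike. A rough sketch that derives the same since-boot iowait share directly from /proc/stat, assuming the standard Linux field order (user, nice, system, idle, iowait, irq, softirq, steal); the function name `iowait_percent` is hypothetical:

```shell
# Hypothetical helper: print iowait as a percentage of total CPU time.
# Expects the contents of /proc/stat (or just its aggregate "cpu" line) on stdin.
iowait_percent() {
    awk '$1 == "cpu" {
        total = 0
        # Fields 2..9 are user, nice, system, idle, iowait, irq, softirq, steal.
        # Guest time is already counted inside user, so it is not added again.
        for (i = 2; i <= 9; i++) total += $i
        if (total > 0) printf "%.2f\n", 100 * $6 / total
    }'
}

# Only meaningful on Linux, where /proc/stat exists.
if [ -r /proc/stat ]; then
    iowait_percent < /proc/stat
fi
```

For a current reading instead of a since-boot average, `mpstat 1 5` prints one sample per second for five seconds.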
    
  • eric
    eric almost 9 years
    Actually, after sitting on the iotop screen for a few minutes, kworker eventually did bubble up to 99.99% IO on both CPU cores (I'm assuming both cores; [kworker/u30:0] and [kworker/u31:0] are the culprits). I exited and ran another perf, saw your post, checked iotop and it was gone, haha.
  • eric
    eric almost 9 years
    Unfortunately my host moved to a new box and we still see the same issue. However, there were no performance issues over the weekend. It really seems like a 2 is just being added to the actual load, but that doesn't make sense, as kworker still sits at 99.99% occasionally. I think this may have to do with the fact that I originally rsync'd from another server (excluding most of the OS-specific directories, however); maybe a driver conflict somewhere along the way...