High Linux loads on low CPU/memory usage

Solution 1

Intuitively, I'd suspect a disk issue as the most direct cause, but that doesn't mean your disks are too slow. Your iowait % from iostat doesn't indicate that any user processes are spending a lot of time waiting for disk I/O. However, your CPU time on kswapd gives me cause for concern:

root       493  0.1  0.0      0     0 ?        S<    2010  94:48 [kswapd1]

The 242MB of swap you're using may not seem like a lot, but to hit that kind of CPU time on a system that's only been up for 42 days, you've either got a lot of swap activity happening or it's taking forever to finish once it starts because of other disk contention. Whether or not this is the source of your problem, it's something I would definitely look into.

Can you run sar -W and post the swap statistics for your system?
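
If sar is not installed, roughly the same swap-in and swap-out rates can be derived from the pswpin and pswpout counters in /proc/vmstat. This is a minimal sketch, assuming a Linux system; the function name swap_rate and the snapshot file paths are made up for illustration.

```shell
# Print pages swapped in/out per second between two /proc/vmstat snapshots.
# Usage: swap_rate BEFORE_FILE AFTER_FILE INTERVAL_SECONDS
swap_rate() {
    awk -v secs="$3" '
        # First file: remember the starting counters.
        FNR == NR && ($1 == "pswpin" || $1 == "pswpout") { before[$1] = $2; next }
        # Second file: remember the ending counters.
        FNR != NR && ($1 == "pswpin" || $1 == "pswpout") { after[$1] = $2 }
        END {
            printf "pswpin/s %.2f\n",  (after["pswpin"]  - before["pswpin"])  / secs
            printf "pswpout/s %.2f\n", (after["pswpout"] - before["pswpout"]) / secs
        }
    ' "$1" "$2"
}

# Example on a live system, using a 5-second window (paths are placeholders):
#   cat /proc/vmstat > /tmp/vm.before; sleep 5; cat /proc/vmstat > /tmp/vm.after
#   swap_rate /tmp/vm.before /tmp/vm.after 5
```

Rates that stay well above zero while the load is high would point toward the swap activity suspected above; rates near zero would point elsewhere.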

Solution 2

The most common cause of high load is slow drives. Try running the following:

sar -d 5 0

and looking at the %util field. If that number is over 70% for any of your drives, that drive will be slow to handle I/O requests, which drives the load up.

Edit: It might run OK at 70%, but that's the point where you'll probably start to see performance degradation. The higher you go, the worse it'll get.
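
To avoid eyeballing the output, the %util check can be scripted. The sketch below assumes sar -d prints a header line containing the DEV and %util column names (their position differs between sysstat versions, so the columns are located by name); busy_disks is a hypothetical helper name and 70 is simply the threshold mentioned above.

```shell
# Print "device %util" for every sample whose %util exceeds the limit.
# Usage: sar -d 5 3 | busy_disks 70
busy_disks() {
    awk -v limit="${1:-70}" '
        # A header line names the columns; remember where DEV and %util sit.
        /%util/ {
            for (i = 1; i <= NF; i++) {
                if ($i == "DEV")   dev = i
                if ($i == "%util") util = i
            }
            next
        }
        # Data line: compare the %util field numerically against the limit.
        dev && util && NF >= util && $util + 0 > limit + 0 { print $dev, $util }
    '
}
```

A device that shows up repeatedly is worth a closer look, but as the comments point out, a busy disk can be a symptom of something else (swapping, for example) rather than the root cause.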

Solution 3

Lots of system processes are in the "S<" state. On my machine they're listed as just "S". From man ps: "<" means high-priority (not nice to other users). Something looks really screwed up. Try updating your kernel, if that can be done, and reboot.
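
To see which processes carry the "<" flag on a given machine, the STAT column of ps can be filtered. This is only a sketch: high_prio_procs is a made-up helper name, and it prints just the first word of the command name. Note that a commenter below considers this state normal for those kernel threads, so the list is for comparison between machines, not proof of a fault.

```shell
# Print "pid stat command" for processes whose STAT field contains "<".
# Usage: ps -eo pid,stat,comm | high_prio_procs
high_prio_procs() {
    # NR > 1 skips the header line that ps prints.
    awk 'NR > 1 && index($2, "<") > 0 { print $1, $2, $3 }'
}
```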

Solution 4

What kind of network connection does your server have? I have seen loads skyrocket in situations where the connection to the switch was supposed to be 100 Mbit/s full duplex, but for some reason was negotiated as 100 Mbit/s half duplex. After I forced the 100M-FD mode with ethtool, loads dropped below 1 and network transfer speeds returned to normal.
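
A quick way to check the negotiated link is to pull the Speed and Duplex lines out of the ethtool output. The sketch below assumes the interface is called eth0 (adjust to your system); link_summary is a hypothetical helper that returns a non-zero status when the link is not full duplex.

```shell
# Summarize negotiated speed and duplex from `ethtool IFACE` output.
# Usage: ethtool eth0 | link_summary
link_summary() {
    awk '
        $1 == "Speed:"  { speed = $2 }
        $1 == "Duplex:" { duplex = $2 }
        END {
            printf "speed=%s duplex=%s\n", speed, duplex
            if (duplex != "Full") exit 1   # signal a half-duplex or unknown link
        }
    '
}

# Forcing 100 Mbit/s full duplex, as described above, would look roughly like
# this (run as root; eth0 is an assumption):
#   ethtool -s eth0 speed 100 duplex full autoneg off
```

Forcing a mode on only one end of the link can itself cause a duplex mismatch, so the switch port should be set to match.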

Author: Basel

Updated on September 17, 2022

Comments

  • Basel
    Basel almost 2 years

    I have an XML file with following structure:

    <xml>
       <category cat_id='1234'>
          <matches matche_id='123456'>
             <odds/>
          </matches>
          <matches matche_id='123456'>
             <odds>
                   <bets><bets/>
             <odds/>
          </matches>
       </category>
    </xml>
    

    I use the following code to check whether the odds tag exists (I mean <odds>, not <odds/>):

    $xml_check = 'my_file.xml';
    $flag = 0 ;
    foreach($xml_check->category as $category)  
    {
        foreach($category->match as $match) 
        {
    
                if (property_exists($match, 'odds'))
                {
                    $flag = 1;
                }
    
         }
    }
    echo $flag ;
    

    It always echoes 1 (flag = 1): property_exists treats the empty <odds/> tag as a match. How can I solve this? I only want to detect the non-empty tag.

    • alvosu
      alvosu over 13 years
      add "ps aux" output
    • SyRenity
      SyRenity over 13 years
      Output was added.
  • BillThor
    BillThor over 13 years
    Look at the queue size and service times. If these increase then you have a problem.
  • phemmer
    phemmer over 13 years
    Ya, these can help indicate problems, but I didn't mention them because I've seen them spike without any adverse effects.
  • jgoldschrafe
    jgoldschrafe over 13 years
    Using disk performance as a first stop on a root-cause analysis can be dangerous, though. One of the reasons that disk I/O is one of the last places I'll look for a problem, even though it symptomatically tends to be the cause of a lot of issues, is that there are a lot of funny things that can go wrong on a system that will cause it to start hammering the disk with things other than the intended workload, and trying to fix the issue by adding more or faster spindles may not always be the best approach.
  • poige
    poige over 13 years
    "where I should look for more diagnostic information?" -- try looking into logs, check dmesg output.
  • phemmer
    phemmer over 13 years
    This is normal for those processes.
  • SyRenity
    SyRenity over 13 years
    sar -W results added to question above.
  • SyRenity
    SyRenity over 13 years
    I'm running the version 2.6.18-194.el5, any sense in updating to latest available version?
  • SyRenity
    SyRenity over 13 years
    Also, I see an OOM kill that happened last week, not sure if this is related.
  • poige
    poige over 13 years
    If I'm not mistaken, kernel-2.6.18-238 is out in the wild, so updating certainly makes sense.
  • SyRenity
    SyRenity over 13 years
    Ethtool shows a connection of Speed: 1000Mb/s, Duplex: Full.
  • Janne Pikkarainen
    Janne Pikkarainen over 13 years
    OK, probably not a network issue, then. Next thing for you to check: number of context switches and interrupts. Do that with sar -w (for context switches) and sar -I SUM (for interrupts).
  • SyRenity
    SyRenity over 13 years
    I added both outputs to the description above.
  • Janne Pikkarainen
    Janne Pikkarainen over 13 years
    The context switches are on the high side; I usually see somewhere between a few hundred and a couple of thousand switches/s, even on loaded servers. Maybe those Java processes are doing something odd?
  • SyRenity
    SyRenity over 13 years
    The Java apps are heavily multi-threaded; perhaps this is the cause of the context switching?
  • SyRenity
    SyRenity over 13 years
    The OOM kill has removed the offending process, which was taking most of the memory, so why is a reboot required?
  • Giraffe
    Giraffe almost 11 years
    Was it because you were using swap? See also serverfault.com/a/524818/27813