CPU 100% idle but still showing load average

Load average doesn't mean what you think it means. It's not about instant CPU usage, but rather how many processes are waiting to run. Usually that's because of lots of things wanting CPU, but not always. A common culprit is a process waiting for IO - disk or network.
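
On Linux, the load average counts tasks that are runnable (R) plus tasks in uninterruptible sleep (D), so a quick sanity check is to count those states and compare against /proc/loadavg. A minimal sketch, assuming a Linux box with procps ps (note that ps itself will show up as R while it runs):

    # Count tasks in R or D state - the states that feed the load average
    ps -eo stat= | awk '$1 ~ /^[RD]/ {n++} END {print n+0}'
    # Compare with the kernel's own figures
    cat /proc/loadavg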

Try running ps -e v and looking for process state flags.

state    The state is given by a sequence of characters, for example, "RWNA". The first character indicates the run state of the process:
D    Marks a process in disk (or other short term, uninterruptible) wait.
I    Marks a process that is idle (sleeping for longer than about 20 seconds).  
L    Marks a process that is waiting to acquire a lock.
R    Marks a runnable process.
S    Marks a process that is sleeping for less than about 20 seconds.
T    Marks a stopped process.
W    Marks an idle interrupt thread.
Z    Marks a dead process (a "zombie").

This is from the ps manpage, so you can find more detail there - the R and D processes are probably of particular interest.
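
To see only those, one option (a sketch, assuming procps ps; the wchan column shows which kernel function a D-state process is stuck in):

    # PID, state, kernel wait channel and command line for R/D tasks only
    ps -eo pid,stat,wchan:32,args | awk '$2 ~ /^[RD]/'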

Your top output contains:

Tasks: 534 total,   1 running, 533 sleeping,   0 stopped,   0 zombie

That 1 running process is the cause of your load average. Find it and figure out what it's up to. (Edit: as mentioned in the comments, that running process is probably top itself, so ignore it.)
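
If it isn't just top, a couple of ways to inspect a suspicious process (12345 is a placeholder PID; /proc/<pid>/stack needs root and a kernel that exposes it, which CentOS 6 does):

    cat /proc/12345/stack   # kernel call stack - useful for D-state hangs
    strace -p 12345         # attach and trace the system calls it is making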

Comments

  • haroon_aut almost 2 years

    I have a Blade Server with CentOS 6.4.

    When idle, it shows a constant load average of more than 1. However, another machine I prepared with the same hardware and CentOS version stays at a load average of around 0 when idle.

    The output of top is as follows:

    top - 10:23:04 up 156 days, 18:15,  1 user,  load average: 1.08, 1.35, 1.31
    Tasks: 534 total,   1 running, 533 sleeping,   0 stopped,   0 zombie
    Cpu(s):  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
    Mem:  65959040k total, 10021484k used, 55937556k free,   167092k buffers
    Swap: 32767992k total,    13884k used, 32754108k free,  7084024k cached
    
      PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
    20951 root      20   0 15396 1608  952 R  0.3  0.0   0:01.52 top
        1 root      20   0 19352  684  472 S  0.0  0.0   0:01.64 init
        2 root      20   0     0    0    0 S  0.0  0.0   0:00.03 kthreadd
        3 root      RT   0     0    0    0 S  0.0  0.0   0:15.31 migration/0
        4 root      20   0     0    0    0 S  0.0  0.0   0:12.32 ksoftirqd/0
        5 root      RT   0     0    0    0 S  0.0  0.0   0:00.00 migration/0
        6 root      RT   0     0    0    0 S  0.0  0.0   0:17.45 watchdog/0
        7 root      RT   0     0    0    0 S  0.0  0.0   0:16.26 migration/1
        8 root      RT   0     0    0    0 S  0.0  0.0   0:00.00 migration/1
        9 root      20   0     0    0    0 S  0.0  0.0   0:18.51 ksoftirqd/1
    

    Which process is causing the load average to be > 1 while the system is totally idle?

  • Ouki about 10 years
    While I totally agree with your analysis, I am not sure it has to do with the running process shown in top. If you look carefully, it is the top command itself.
  • haroon_aut about 10 years
    Thank you for helping to identify the issue. From ps -e v I found that updatedb was getting stuck. Running it in verbose mode showed it was hanging on a mount that is no longer available. I did a lazy unmount, after which updatedb started working fine. The load average is now almost 0 when the system is idle, i.e. the issue is resolved. (A sketch of those commands follows after this comment thread.)
  • haroon_aut about 10 years
    Yes, the running process at that time was top. ps -e v helped me identify the stuck process, updatedb, whose state was D.