How is runq-sz counted in sar?


Solution 1

This man page has a more detailed explanation of this property:

runq-sz

The number of kernel threads in memory that are waiting for a CPU to run. Typically, this value should be less than 2. Consistently higher values mean that the system might be CPU-bound.

Interpreting results

As with many indicators, you have to use this one in combination with others to tell whether there is a performance issue. This particular indicator shows whether your system is starved for CPU time.

The load averages (load1, load5, load15), by contrast, count processes that are in the run queue but are being forced to wait for time to run. They show the general trend of the system: whether many processes are piling up (load ramping up) or the load is trending down. Processes counted in the load averages can be waiting for a variety of things, though; when the load averages are high, it is typically I/O that is blocking.

With runq-sz, you're waiting for time on a CPU.
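The trend reading described above can be sketched as a small shell function. This is only an illustration: `load_trend` is a hypothetical helper, and the rule it applies (strictly rising or strictly falling across the three averages) is an assumption, not an official definition.

```shell
# load_trend: classify the trend from the three load averages.
# Arguments: the 1-, 5- and 15-minute load averages
# (for example the first three fields of /proc/loadavg on Linux).
load_trend() {
  awk -v a="$1" -v b="$2" -v c="$3" 'BEGIN {
    # "+ 0" forces numeric comparison
    if (a + 0 > b + 0 && b + 0 > c + 0)      print "ramping up"
    else if (a + 0 < b + 0 && b + 0 < c + 0) print "trending down"
    else                                     print "mixed"
  }'
}

# Sample values: a 1-minute average well above the 5- and 15-minute ones.
load_trend 7.03 3.15 2.58
```

A short-term average above the long-term ones suggests load is building; the reverse suggests it is draining.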


Solution 2

This post is the first result that turns up in Google, and the answer above is ticked as accepted.

That answer references, and quotes the text of, a Solaris man page. The OP's question, however, was about RHEL 7. Solaris and Linux treat the reporting of runnable processes differently.

  • Solaris tends to use load average / queue as an indicator of how many processes are waiting to run.

  • Linux tends to use load average / queue as an indicator of how many processes are running + how many processes are waiting to run.

  • The Linux representation of runq-sz in sar -q is more likely to indicate the number of current running processes + the number of queued processes.
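On Linux, the fourth field of /proc/loadavg has the form running/total: the number of currently runnable scheduling entities over the total number that exist. The sketch below pulls those two numbers out of one such line. It assumes that sar -q derives runq-sz and plist-sz from this field and that the sampling process itself is discounted; that is an assumption about sysstat's behavior, not something stated above. `parse_loadavg` is a hypothetical helper, and the sample line is made up for illustration.

```shell
# parse_loadavg: extract run-queue figures from a /proc/loadavg-style line.
# $1: one line in the format "ld1 ld5 ld15 running/total last_pid".
parse_loadavg() {
  printf '%s\n' "$1" | awk '{
    split($4, rq, "/")          # rq[1] = runnable now, rq[2] = total tasks
    running = rq[1] + 0
    if (running > 0) running--  # assumption: do not count the sampler itself
    printf "ldavg-1=%s runq-sz=%d plist-sz=%d\n", $1, running, rq[2]
  }'
}

# Illustrative line; on a live Linux system one could instead run:
#   parse_loadavg "$(cat /proc/loadavg)"
parse_loadavg "7.40 4.64 3.20 4/371 12345"
```

Because the first number counts tasks that are on a CPU as well as tasks queued behind them, it fits the "running + queued" reading described in the list above.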

To reference the OP's original example of an 8-thread instance: a runq-sz of less than 8 indicates optimal performance in this regard.
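That rule of thumb can be checked against saved sar -q output. The following is a minimal sketch, assuming data lines in the 12-hour layout shown in the question (time, AM/PM, then runq-sz as the third field); `saturated_samples` is a hypothetical helper name.

```shell
# saturated_samples: print samples whose runq-sz exceeds the CPU thread count.
# $1: number of CPU threads; stdin: sar -q data lines (12-hour time format assumed).
saturated_samples() {
  awk -v ncpu="$1" '
    # Skip headers and anything that is not "HH:MM:SS AM|PM <number> ..."
    $2 ~ /^(AM|PM)$/ && $3 ~ /^[0-9]+$/ && $3 + 0 > ncpu + 0 {
      print $1, $2, "runq-sz=" $3
    }'
}

# Two rows taken from the question; with 8 threads neither is flagged.
printf '%s\n' \
  '05:14:01 PM         3       371      7.40      4.64      3.20         1' \
  '05:15:01 PM         2       370      7.57      5.26      3.51         1' |
  saturated_samples 8
```

On a live system the input could come from `sar -q`, with the thread count taken from `nproc` or /proc/cpuinfo.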

I would agree that the low runq-sz versus high loadavg probably indicates some sort of blocked or sleeping processes. You can partially see that in the OP's example sar output, in the blocked column.
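On Linux, the load average counts tasks in uninterruptible sleep (state D in ps) as well as runnable ones (state R), so a quick tally of process states can show where the gap comes from. A minimal sketch, with `count_states` as a hypothetical helper that reads one state string per line:

```shell
# count_states: tally runnable (R) and uninterruptible-sleep (D) entries.
# stdin: one process state string per line, as printed by `ps -eo stat=`.
count_states() {
  awk '{ s = substr($1, 1, 1); n[s]++ }   # first character is the main state
       END { printf "R=%d D=%d\n", n["R"], n["D"] }'
}

# Illustrative input; on a live system: ps -eo stat= | count_states
printf '%s\n' 'R+' 'Ss' 'D' 'D<' 'S' | count_states
```

A D count that stays high while R stays low would be consistent with I/O-bound load rather than CPU contention.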


Author: Yu Watanabe


Updated on September 18, 2022

Comments

  • Yu Watanabe
    Yu Watanabe over 1 year

    I would like to ask a question about the output of sar -q. I would appreciate it if someone could help me understand runq-sz.

    I have a system with 8 CPU threads running RHEL 7.2.

    [ywatanabe@host2 ~]$ cat /proc/cpuinfo | grep processor | wc -l
    8
    

    Below is the sar -q output from my system; runq-sz seems low compared to ldavg-1.

                    runq-sz  plist-sz   ldavg-1   ldavg-5  ldavg-15   blocked
    05:10:01 PM         0       361      0.29      1.68      2.14         0
    05:11:01 PM         0       363      1.18      1.61      2.08         2
    05:12:01 PM         0       363      7.03      3.15      2.58         1
    05:13:01 PM         0       365      8.12      4.15      2.96         1
    05:14:01 PM         3       371      7.40      4.64      3.20         1
    05:15:01 PM         2       370      7.57      5.26      3.51         1
    05:16:01 PM         0       366      8.42      5.90      3.84         1
    05:17:01 PM         0       365      8.78      6.45      4.16         1
    05:18:01 PM         0       363      7.05      6.40      4.28         2
    05:19:02 PM         1       364      8.05      6.74      4.53         0
    05:20:01 PM         0       367      7.96      6.96      4.74         1
    05:21:01 PM         0       367      7.86      7.11      4.93         1
    05:22:01 PM         1       366      7.84      7.31      5.14         0
    

    From man sar, I understood runq-sz to be the number of tasks in the run queue whose state is TASK_RUNNING, which corresponds to the R state in ps.

              runq-sz
                     Run queue length (number of tasks waiting for run time).
    

    What does runq-sz actually represent?