Flush-0:n processes causing massive bottleneck
Solution 1
Your system is being overloaded with disk write requests, and your "dirty ratio" configuration is not optimal for your environment.
You can tune two virtual memory parameters, dirty_background_ratio and dirty_ratio, both found in /proc/sys/vm/.
Each parameter is expressed as a percentage of memory.
If you set a low value for dirty_ratio, writes reach the disk sooner, so you get more disk load but less RAM is used to hold dirty pages.
The dirty_background_ratio is the percentage of memory filled with dirty pages at which the kernel flusher threads start writing dirty data to disk in the background. The dirty_ratio is the higher limit at which a process that is generating writes must itself start writing dirty data out.
This means that you must find the best compromise between how much dirty data is allowed to accumulate before it is flushed and the point at which the system starts writing it out in the background.
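One way to watch that compromise in practice is to check how much dirty data is waiting at a given moment. The following is a minimal sketch, assuming a Linux /proc/meminfo with the usual Dirty and Writeback lines; the function name dirty_summary and its optional file argument are hypothetical, added only so the snippet can also be pointed at a saved copy of the file:

```shell
# Print the Dirty and Writeback counters from a meminfo-style file.
# The file argument is an assumption for illustration; it defaults to /proc/meminfo.
dirty_summary() {
    awk '$1 == "Dirty:" || $1 == "Writeback:" { print $1, $2, $3 }' "${1:-/proc/meminfo}"
}

dirty_summary
```

If Dirty stays large for long periods while the flush threads are busy, the thresholds are probably worth revisiting.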
A relationship that favors performance could be:
dirty_ratio 90%
dirty_background_ratio 5%
A more moderate relationship:
dirty_ratio 40~50%
dirty_background_ratio 10~20%
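To get a feel for what these percentages mean in absolute terms, the arithmetic can be sketched as below. The 8049708 kB figure is the total memory shown in the top output from the question, and the function name dirty_threshold_kb is hypothetical. Note the assumption: the kernel applies the percentage to available memory (free plus reclaimable pages), not to total RAM, so these results should be read as upper bounds.

```shell
# Rough upper bound, in kB, for a dirty threshold given total memory and a percentage.
# Assumption: uses total RAM, while the kernel uses available (free + reclaimable) memory.
dirty_threshold_kb() {
    total_kb="$1"
    percent="$2"
    echo $(( total_kb * percent / 100 ))
}

# Values from the "moderate" pair above, on the 8 GB machine from the question.
echo "dirty_ratio 40%:            $(dirty_threshold_kb 8049708 40) kB"
echo "dirty_background_ratio 10%: $(dirty_threshold_kb 8049708 10) kB"
```

With these numbers, up to roughly 3 GB of dirty data could pile up before writers are throttled, which helps explain why a flush can take a long time on a slow or remote disk.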
This imbalance in your system can have several causes. Among the most common is an insufficient amount of RAM for the installed services; other times it may simply be due to degraded performance of the memory installed in your server, with causes ranging from poor ventilation to a faulty power supply.
Although most problems take the form of software bugs, it is not widely known that many of these errors are due to poor configuration of the hardware in relation to the installed services, especially in the case of rented machines.
To help those less familiar with Linux machines, the parameters mentioned above can be set in this way:
Permanent mode:
(run these two commands only once; otherwise edit the file with your favorite editor)
# echo "vm.dirty_ratio = 40" >> /etc/sysctl.conf
# echo "vm.dirty_background_ratio = 10" >> /etc/sysctl.conf
Temporary mode:
# echo "40" > /proc/sys/vm/dirty_ratio
# echo "10" > /proc/sys/vm/dirty_background_ratio
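To confirm which values are actually in effect, something like the following could be used. It is only a sketch: the function name show_dirty_settings and its optional directory argument are hypothetical, and it assumes the two files are readable under /proc/sys/vm.

```shell
# Print the two writeback thresholds found under a sysctl directory.
# The directory argument is an assumption for illustration; it defaults to /proc/sys/vm.
show_dirty_settings() {
    dir="${1:-/proc/sys/vm}"
    for name in dirty_ratio dirty_background_ratio; do
        if [ -r "$dir/$name" ]; then
            printf 'vm.%s = %s\n' "$name" "$(cat "$dir/$name")"
        else
            printf 'vm.%s is not readable under %s\n' "$name" "$dir" >&2
        fi
    done
}

show_dirty_settings
```

Lines appended to /etc/sysctl.conf are not applied until the file is reloaded; running sysctl -p as root loads them without a reboot.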
You can find more information about these settings at this link
Solution 2
I found the following link with a similar discussion:
0005972: Top and uptime displays wrong load average value - CentOS Bug Tracker
The last post says:
The high load average issue is resolved in a newer version of the hpvsa driver (1.2.4-7) that is now released by HP. Contact HP Support to obtain a copy of the new driver.
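Before contacting HP, it may be worth checking whether the hpvsa driver is loaded at all. A minimal sketch, assuming a Linux /proc/modules; the function name module_loaded and its optional file argument are hypothetical:

```shell
# Return success if the named module appears in a /proc/modules-style file.
# The second argument is an assumption for illustration; it defaults to /proc/modules.
module_loaded() {
    awk -v m="$1" '$1 == m { found = 1 } END { exit !found }' "${2:-/proc/modules}"
}

if module_loaded hpvsa; then
    # modinfo -F prints a single field of the module metadata.
    echo "hpvsa version: $(modinfo -F version hpvsa)"
else
    echo "hpvsa is not loaded, so this driver fix probably does not apply"
fi
```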
Tom
Updated on September 18, 2022
Comments
-
Tom over 1 year
I have a LAMP cluster that shares files via NFS and occasionally one of them will be stricken for a while when mysterious flush processes start appearing.
Can anyone help me? The only way to resolve this is to reboot - killing the processes only spawns new ones.
top - 19:43:43 up 104 days,  4:52,  1 user,  load average: 27.15, 56.72, 33.31
Tasks: 301 total,   9 running, 292 sleeping,   0 stopped,   0 zombie
Cpu(s): 15.6%us, 77.0%sy,  0.0%ni,  4.2%id,  2.0%wa,  0.0%hi,  1.2%si,  0.0%st
Mem:   8049708k total,  7060492k used,   989216k free,   157156k buffers
Swap:  4194296k total,   483228k used,  3711068k free,   928768k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM     TIME+ COMMAND
  840 root      20   0     0    0    0 R 98.0  0.0   6:45.83 flush-0:24
  843 root      20   0     0    0    0 R 97.6  0.0   5:50.32 flush-0:25
  835 root      20   0     0    0    0 R 96.0  0.0   6:42.44 flush-0:22
  836 root      20   0     0    0    0 R 95.0  0.0   6:51.56 flush-0:27
  833 root      20   0     0    0    0 R 94.3  0.0   6:27.21 flush-0:23
  841 root      20   0     0    0    0 R 93.7  0.0   6:46.97 flush-0:26
 2305 apache    20   0  772m  31m  25m S 23.6  0.4   0:07.60 httpd
 2298 apache    20   0  772m  31m  25m S 13.6  0.4   0:08.98 httpd
26771 apache    20   0  775m  47m  41m S 10.3  0.6   4:07.97 httpd
 2315 apache    20   0  770m  29m  25m S  9.0  0.4   0:07.44 httpd
24370 memcache  20   0  457m 123m  608 S  8.6  1.6  66:20.28 memcached
 1191 apache    20   0  770m  30m  26m S  8.3  0.4   0:13.54 httpd
 2253 apache    20   0  771m  32m  27m S  8.3  0.4   0:11.75 httpd
 3476 varnish   20   0 52.9g 2.0g  20m S  8.0 25.6   0:15.30 varnishd
17234 apache    20   0  775m  50m  45m S  7.0  0.6   9:22.09 httpd
23161 apache    20   0  780m  54m  43m S  7.0  0.7   6:33.40 httpd
Thanks
-
ramruma over 11 years
Is it the NFS server or client that has this problem? Which filesystem are you using? Which version of CentOS and which kernel?
-
R. S. over 11 years
Is there anything in dmesg or /var/log/messages that hints at a cause?
-
John Siu over 11 years
Does the problem always start at the same time? Does it have a pattern?
-
-
fuero over 11 years
Your suggestion is a long shot; the OP never hinted that he or she is on HP hardware. I'd rather refer to this: serverfault.com/questions/341123/…