Linux Interrupts Issue
Solution 1
The highest count of interrupts you have still averages to:
1872714173 interrupts / 83014987.85 seconds ≈ 23 intr/s
which is not fearsome at all. As they are, these statistics are perfectly acceptable. A peak of 7500 intr/s is also acceptable on a busy system.
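The arithmetic above can be reproduced with a short script. This is a minimal sketch, not a monitoring tool: the function names are made up for illustration, the sample text is a two-CPU excerpt of the question's output, and the uptime value is the one used in the calculation above. On a live system the uptime would normally come from the first field of /proc/uptime.

```python
def max_interrupt_count(proc_interrupts):
    """Return (irq label, largest single per-CPU count) from /proc/interrupts text."""
    best_label, best_count = "", 0
    for line in proc_interrupts.splitlines()[1:]:  # skip the CPU header row
        label, _, rest = line.strip().partition(":")
        for field in rest.split():
            if not field.isdigit():
                break  # reached the controller / device name columns
            if int(field) > best_count:
                best_label, best_count = label, int(field)
    return best_label, best_count


def average_rate(count, uptime_seconds):
    """Average interrupts per second since boot."""
    return count / uptime_seconds


# Hypothetical two-CPU excerpt of the question's output, for illustration only.
SAMPLE = """\
            CPU0        CPU1
 82:  1054378386           0  PCI-MSI        eth0
169:        8343  1872714173  IO-APIC-level  megasas
"""

if __name__ == "__main__":
    label, count = max_interrupt_count(SAMPLE)
    rate = average_rate(count, 83014987.85)
    print(f"IRQ {label}: {count} interrupts, about {rate:.0f} intr/s on average")
```

An average since boot hides short bursts, so it only shows that the lifetime totals are unremarkable; it says nothing about what happened during a specific slow period.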
Whatever has led you to the conclusion that interrupts are a relevant metric, I would take a step back and reconsider. Interrupts are more often an effect of a problem (a completely different problem) than its sole cause. The only situation that comes to mind would be some rogue device on a bus.
If you have sar reports, look for some other metric that peaks at the time of poor performance (run queue? paging? disk I/O?) and re-start your analysis from there.
Solution 2
Interrupt handling is one of the causes of high system CPU usage. If you don't see high %irq values in the mpstat output, it should be fine.
If you are concerned that the interrupts are distributed unevenly among CPUs, you need to enable the irqbalance daemon or tune the affinity manually via /proc/irq/*/smp_affinity.
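If mpstat is not at hand, a comparable figure can be estimated from two samples of the aggregate "cpu" line in /proc/stat, whose first fields are user, nice, system, idle, iowait, irq, softirq and steal (in clock ticks). The sketch below is an approximation under that assumption, not a reimplementation of mpstat; the function names and the sample lines are invented for illustration.

```python
FIELDS = ("user", "nice", "system", "idle", "iowait", "irq", "softirq", "steal")


def parse_cpu_line(line):
    """Map the leading numeric fields of a /proc/stat 'cpu' line to their names."""
    parts = line.split()
    return dict(zip(FIELDS, map(int, parts[1:1 + len(FIELDS)])))


def irq_percent(before, after):
    """Return (hard irq %, soft irq %) of CPU time spent between two samples."""
    common = before.keys() & after.keys()  # older kernels may lack some fields
    deltas = {name: after[name] - before[name] for name in common}
    total = sum(deltas.values())
    if total <= 0:
        return 0.0, 0.0
    return (100.0 * deltas.get("irq", 0) / total,
            100.0 * deltas.get("softirq", 0) / total)


if __name__ == "__main__":
    # Invented sample lines; on a real system read /proc/stat twice,
    # a few seconds apart, and pass the line that starts with "cpu ".
    first = parse_cpu_line("cpu 100 0 50 800 10 5 5 0")
    second = parse_cpu_line("cpu 200 0 100 1600 20 40 40 0")
    hard, soft = irq_percent(first, second)
    print(f"irq: {hard:.1f}%  softirq: {soft:.1f}%")
```

Sampling during the slow period and again during a quiet one gives two numbers that can be compared directly.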
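The value in smp_affinity is a hexadecimal bitmask in which bit N stands for CPU N. A small helper can translate between CPU lists and masks. This is a sketch with hypothetical function names; it only computes the strings and does not write to /proc. Writing a mask requires root, some interrupts cannot be moved, and a running irqbalance may later change a manual setting.

```python
def affinity_mask(cpus):
    """Build the hex mask string for an iterable of CPU numbers."""
    mask = 0
    for cpu in cpus:
        if cpu < 0:
            raise ValueError("CPU numbers must be non-negative")
        mask |= 1 << cpu  # bit N selects CPU N
    return format(mask, "x")


def cpus_from_mask(mask_hex):
    """Decode a mask as shown by the kernel, which may contain comma-separated
    32-bit groups on machines with many CPUs, into a sorted CPU list."""
    value = int(mask_hex.replace(",", ""), 16)
    return [cpu for cpu in range(value.bit_length()) if (value >> cpu) & 1]


if __name__ == "__main__":
    print(affinity_mask([1, 2]))   # mask selecting CPU1 and CPU2
    print(cpus_from_mask("ff"))    # all eight CPUs of the machine in the question
```

In the question's output, for example, the timer and eth0 interrupts are all counted on CPU0, while megasas is spread over several CPUs. Decoding the current mask of an IRQ shows which CPUs it is allowed to use.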
More on:
http://honglus.blogspot.com/2010/01/troubleshooting-high-system-cpu-usage.html
http://honglus.blogspot.com/2011/03/tune-interrupt-and-process-cpu-affinity.html
user739866
Updated on September 18, 2022
Comments
-
user739866 almost 2 years
Is there a simple way to determine if interrupts are a performance issue? I have the following from cat /proc/interrupts but really don't have a history of this server so I don't know if this could be causing any issues. I found the definition of each column at http://www.centos.org/docs/5/html/5.1/Deployment_Guide/s2-proc-interrupts.html but don't seem to find any guidelines on whether or not the results are acceptable.
            CPU0        CPU1        CPU2        CPU3        CPU4        CPU5        CPU6        CPU7
  0:  1408788887           0           0           0           0           0           0           0  IO-APIC-edge   timer
  1:           3           0           0           0           0           0           0           0  IO-APIC-edge   i8042
  8:           1           0           0           0           0           0           0           0  IO-APIC-edge   rtc
  9:           0           0           0           0           0           0           0           0  IO-APIC-level  acpi
 12:           4           0           0           0           0           0           0           0  IO-APIC-edge   i8042
 14:         476    92736034   560949599    89233642           0           0           0           0  IO-APIC-edge   ide0
 66:          81           0           0           0           0           0           0           0  IO-APIC-level  ehci_hcd:usb1, uhci_hcd:usb2, uhci_hcd:usb4
 74:         153    62468419   147960075    25257462           0           0           0           0  IO-APIC-level  uhci_hcd:usb3, uhci_hcd:usb5
 82:  1054378386           0           0           0           0           0           0           0  PCI-MSI        eth0
169:        8343  1516025027   954152248     6501060           0   757271678  1872714173     2565826  IO-APIC-level  megasas
NMI:    28336831    18526902    35866900    13915052    25165724    26928152    21827791    19303613
LOC:  1408788527  1408756844  1408788059  1408788084  1408788124  1408787843  1408787972  1408787711
ERR:           0
MIS:           0
-
FooBee over 12 years
These numbers are meaningless without at least the uptime of the machine. It makes a difference whether something happens 100 times in 10 seconds or in 10 hours. Also, on a more fundamental level: do you think you have a performance problem with this server?
-
user739866 over 12 years
According to sar, intr/s jumps from around 1100 to over 7500 at about the time the performance degrades.
-