Linux Interrupts Issue


Solution 1

The highest count of interrupts you have still averages to:

1872714173 interrupts / 83014987.85 seconds ≈ 23 intr/s

which is not fearsome at all. As they are, these statistics are perfectly acceptable. A peak of 7500 intr/s is also acceptable on a busy system.
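The arithmetic above can be reproduced with a short Python sketch. It uses only the two figures quoted in this answer; the function name `average_rate` is made up for illustration. Keep in mind that this is an average over the whole uptime, so it says nothing about short bursts.

```python
def average_rate(count: int, uptime_seconds: float) -> float:
    """Average number of events per second over the whole uptime."""
    if uptime_seconds <= 0:
        raise ValueError("uptime must be positive")
    return count / uptime_seconds


# Figures quoted above: the largest single counter in the table
# (megasas on CPU6) and the uptime in seconds.
highest_count = 1872714173
uptime_seconds = 83014987.85

rate = average_rate(highest_count, uptime_seconds)
print(f"{rate:.1f} intr/s")
```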

Whatever has led you to the conclusion that interrupts are a relevant metric, I would take a step back and reconsider. A high interrupt rate is more often an effect of some other, completely different problem than the sole cause of one. The only situation that comes to mind where it would be the cause is a rogue device on a bus.

If you have sar reports, look for some other metric that peaks at the time of poor performance (run queue? paging? disk I/O?) and re-start your analysis from there.

Solution 2

Interrupt load is one of the possible causes of high system CPU usage. If you don't see high %irq values in mpstat output, it should be fine.
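For a rough idea of where a figure like %irq comes from, here is a minimal Python sketch that derives the hard and soft interrupt share of CPU time from two samples of the aggregate `cpu` line in /proc/stat. The two sample lines are invented for illustration, and the function names are hypothetical. This is not mpstat's own code, only the same kind of delta calculation.

```python
def cpu_times(stat_line: str) -> list[int]:
    """Parse the numeric fields of a 'cpu ...' line from /proc/stat."""
    fields = stat_line.split()
    if not fields or not fields[0].startswith("cpu"):
        raise ValueError("not a cpu line")
    return [int(value) for value in fields[1:]]


def irq_percent(before: str, after: str) -> tuple[float, float]:
    """Return (hard IRQ %, softirq %) of CPU time between two samples."""
    old, new = cpu_times(before), cpu_times(after)
    if len(old) < 7 or len(new) < 7:
        raise ValueError("expected at least 7 time fields")
    deltas = [n - o for o, n in zip(old, new)]
    # Field order: user nice system idle iowait irq softirq steal ...
    # Only the first eight are summed; guest time, where the kernel
    # reports it, is already counted inside user and nice.
    total = sum(deltas[:8])
    if total <= 0:
        raise ValueError("samples are identical or out of order")
    return 100.0 * deltas[5] / total, 100.0 * deltas[6] / total


# Invented samples, taken some seconds apart on an imaginary machine.
before = "cpu 1000 0 500 8000 200 50 100 0"
after = "cpu 1100 0 560 8700 220 60 110 0"

irq, soft = irq_percent(before, after)
print(f"irq {irq:.2f}%  softirq {soft:.2f}%")
```

On a live system the two lines would be read from /proc/stat a few seconds apart. The per-CPU lines (`cpu0`, `cpu1`, ...) can be fed to the same function.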

If you are concerned that the interrupts are distributed unevenly among CPUs, enable the irqbalance daemon or tune the affinity manually through /proc/irq/*/smp_affinity.
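The smp_affinity files hold a hexadecimal bitmask in which bit N stands for CPU N. A small Python sketch for converting between CPU lists and masks follows. The helper names are hypothetical, and the code only computes values; writing a mask to /proc/irq/<N>/smp_affinity is a separate step that requires root.

```python
def affinity_mask(cpus: list[int]) -> str:
    """Build the hex bitmask for a list of CPU numbers."""
    mask = 0
    for cpu in cpus:
        if cpu < 0:
            raise ValueError("CPU numbers must be non-negative")
        mask |= 1 << cpu
    return format(mask, "x")


def cpus_in_mask(mask_text: str) -> list[int]:
    """Decode a mask such as 'ff' or '00000000,000000ff' into CPU numbers."""
    # On machines with many CPUs the kernel prints the mask in
    # comma-separated 32-bit groups, most significant group first.
    mask = int(mask_text.strip().replace(",", ""), 16)
    return [bit for bit in range(mask.bit_length()) if (mask >> bit) & 1]


print(affinity_mask([1, 2, 3]))  # mask selecting CPU1 to CPU3
print(cpus_in_mask("ff"))        # CPUs selected by mask ff
```

With irqbalance running, manually written masks may be changed again by the daemon, so the two approaches are usually not combined for the same IRQ.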

More on:
http://honglus.blogspot.com/2010/01/troubleshooting-high-system-cpu-usage.html
http://honglus.blogspot.com/2011/03/tune-interrupt-and-process-cpu-affinity.html


Author: user739866

Updated on September 18, 2022

Comments

  • user739866
    user739866 almost 2 years

    Is there a simple way to determine if interrupts are a performance issue? I have the following from cat /proc/interrupts, but I don't have any history for this server, so I don't know if this could be causing any issues. I found the definition of each column at http://www.centos.org/docs/5/html/5.1/Deployment_Guide/s2-proc-interrupts.html but can't find any guidelines on whether or not the results are acceptable.

           CPU0       CPU1       CPU2       CPU3       CPU4       CPU5       CPU6       CPU7
     0: 1408788887          0          0          0          0          0          0          0    IO-APIC-edge  timer
     1:          3          0          0          0          0          0          0          0    IO-APIC-edge  i8042
     8:          1          0          0          0          0          0          0          0    IO-APIC-edge  rtc
     9:          0          0          0          0          0          0          0          0   IO-APIC-level  acpi
    12:          4          0          0          0          0          0          0          0    IO-APIC-edge  i8042
    14:        476   92736034  560949599   89233642          0          0          0          0    IO-APIC-edge  ide0
    66:         81          0          0          0          0          0          0          0   IO-APIC-level  ehci_hcd:usb1, uhci_hcd:usb2, uhci_hcd:usb4
    74:        153   62468419  147960075   25257462          0          0          0          0   IO-APIC-level  uhci_hcd:usb3, uhci_hcd:usb5
    82: 1054378386          0          0          0          0          0          0          0         PCI-MSI  eth0
   169:       8343 1516025027  954152248    6501060          0  757271678 1872714173    2565826   IO-APIC-level  megasas
    NMI:   28336831   18526902   35866900   13915052   25165724   26928152   21827791   19303613
    LOC: 1408788527 1408756844 1408788059 1408788084 1408788124 1408787843 1408787972 1408787711
    ERR:          0
    MIS:          0
    
    • FooBee
      FooBee over 12 years
      These numbers are meaningless without at least the uptime of the machine. It makes a difference whether something happens 100 times in 10 seconds or in 10 hours. Also, on a more fundamental level: Do you think you have a performance problem with this server?
    • user739866
      user739866 over 12 years
      According to the SAR, intr/s is jumping from around 1100 to over 7500 about the time the performance degrades.