CPU0 is swamped with eth1 interrupts

Solution 1

Look in the /proc/irq/283 directory. There is a smp_affinity_list file which shows which CPUs will get the 283 interrupt. For you this file probably contains "0" (and smp_affinity probably contains "1").

You can write the CPU range to the smp_affinity_list file:

echo 0-7 | sudo tee /proc/irq/283/smp_affinity_list

Or you can write a bitmask, where each bit corresponds to a CPU, to smp_affinity:

printf %x $((2**8-1)) | sudo tee /proc/irq/283/smp_affinity
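
If you need a mask for an arbitrary set of CPUs rather than all eight, the conversion can be scripted. This is only a sketch: cpulist_to_mask is a made-up helper name, not an existing tool, and it relies on shell arithmetic, so it is limited to CPU numbers below 63.

```shell
# cpulist_to_mask (hypothetical helper): convert a CPU list such as
# "0,3,5-7" into the hex bitmask format expected by smp_affinity.
# The body runs in a subshell so IFS and the variables do not leak.
cpulist_to_mask() (
    mask=0
    IFS=,
    for part in $1; do
        lo=${part%-*}    # "5-7" -> 5, "3" -> 3
        hi=${part#*-}    # "5-7" -> 7, "3" -> 3
        cpu=$lo
        while [ "$cpu" -le "$hi" ]; do
            mask=$(( mask | (1 << cpu) ))
            cpu=$(( cpu + 1 ))
        done
    done
    printf '%x\n' "$mask"
)

# Examples:
#   cpulist_to_mask 0-7      prints ff
#   cpulist_to_mask 0,3,5-7  prints e9
```

The result can then be piped to sudo tee /proc/irq/283/smp_affinity exactly as in the printf command above.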

However, irqbalance has its own idea of what affinity each interrupt should have, and it may revert your changes. It is best to uninstall irqbalance completely, or at least to stop it and disable it from starting at boot.

If even without irqbalance you are getting odd smp_affinity for interrupt 283 after a reboot, you will have to manually update the CPU affinity in one of your startup scripts.
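
A startup script should probably not hard-code 283, since I would not count on the IRQ number staying the same across reboots. A minimal sketch that looks the number up by device name instead; irq_for_dev is a hypothetical helper, and it assumes the device name is the last field on its /proc/interrupts line, as it is in the output shown in the question:

```shell
# irq_for_dev FILE NAME (hypothetical helper): print the IRQ number(s)
# of the lines in a /proc/interrupts-style FILE whose last field is NAME.
irq_for_dev() {
    awk -v dev="$2" '$NF == dev { sub(/:$/, "", $1); print $1 }' "$1"
}

# Intended use in a startup script, run as root:
#   irq=$(irq_for_dev /proc/interrupts eth1)
#   echo 0-7 > "/proc/irq/$irq/smp_affinity_list"
```

If the device shares an IRQ line with others, or uses several queues with names like eth1-rx-0, the match on the last field would need adjusting.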

Solution 2

If you have the right model of Intel NIC you can improve performance significantly.

To quote the first paragraph:

Multicore processors and the newest Ethernet adapters (including the 82575, 82576, 82598, and 82599) allow TCP forwarding flows to be optimized by assigning execution flows to individual cores. By default, Linux automatically assigns interrupts to processor cores. Two methods currently exist for automatically assigning the interrupts, an inkernel IRQ balancer and the IRQ balance daemon in user space. Both offer tradeoffs that might lower CPU usage but do not maximize the IP forwarding rates. Optimal throughput can be obtained by manually pinning the queues of the Ethernet adapter to specific processor cores.

For IP forwarding, a transmit/receive queue pair should use the same processor core and reduce any cache synchronization between different cores. This can be performed by assigning transmit and receive interrupts to specific cores. Starting with Linux kernel 2.6.27, multiple queues can be used on the 82575, 82576, 82598, and 82599. Additionally, multiple transmit queues were enabled in Extended Messaging Signaled Interrupts (MSI-X). MSI-X supports a larger number of interrupts that can be used, allowing for finer-grained control and targeting of the interrupts to specific CPUs.

See: Assigning Interrupts to Processor Cores using an Intel® 82575/82576 or 82598/82599 Ethernet Controller

Solution 3

Actually, it is recommended, especially for repetitive work over short durations, that all interrupts generated by a device queue be handled by the same CPU rather than balanced across CPUs. You will therefore see better performance if a single CPU handles the eth1 interrupt (with one exception, described below).

The source, linked above, is from the Linux Symposium, and I do recommend you read through the couple of paragraphs on SMP IRQ Affinity, because they will convince you more effectively than this post.

Why?

Recall that each processor has its own cache in addition to its access to main memory (check out this diagram). When an interrupt is triggered, a CPU core has to fetch the instructions that handle the interrupt from main memory, which takes much longer than if the instructions were already in the cache. Once a core has executed a task, it has those instructions in its cache. If the same core handles the same interrupt almost all the time, the interrupt handler is unlikely to be evicted from that core's cache, which boosts kernel performance.

Alternatively, when IRQs are balanced, the interrupt may be handled by a different CPU each time. The new CPU core probably will not have the interrupt handler in its cache, and it will take a long time to fetch the handler from main memory.

Exception: if the eth1 interrupt fires only seldom, that is, data arrives over that interface intermittently with long gaps in between, enough time passes for the cache to be overwritten by other tasks. In that case you most likely will not see these benefits, because they only apply to work that runs at high frequency.

Conclusion

If your interrupt occurs very frequently, bind it to a specific CPU. This configuration lives at

/proc/irq/'IRQ number'/smp_affinity

See the last paragraph of the SMP IRQ Affinity section in the source linked above; it has instructions.
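
As a rough sketch of what binding looks like in practice: pin_irq below is a hypothetical wrapper, not an existing command, and the optional third argument exists only so the function can be tried against a scratch directory instead of the real /proc/irq.

```shell
# pin_irq IRQ CPULIST [BASE] (hypothetical helper): bind an interrupt to
# the given CPU(s) by writing to its smp_affinity_list file.
#   IRQ      interrupt number, e.g. 283
#   CPULIST  a single CPU ("0") or a list/range ("0,3,5-7")
#   BASE     directory holding the per-IRQ entries; defaults to /proc/irq
pin_irq() {
    printf '%s\n' "$2" > "${3:-/proc/irq}/$1/smp_affinity_list"
}

# Example, run as root: handle IRQ 283 on CPU 0 only.
#   pin_irq 283 0
```

Two caveats. On older kernels the smp_affinity_list file may not exist (a comment below suspects this for 2.6), in which case you would write a hex mask to smp_affinity instead. And the kernel may reject the write, for example with an Input/output error as reported in the comments.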

Alternatively

You can reduce how often the interrupt is raised in three ways: increase the MTU size (jumbo frames), if the network allows for it; raise the interrupt only after a larger number of packets has been received instead of on every packet; or change the timeout, so that the interrupt is raised after a certain amount of time. Be careful with the time option, because your buffer might fill up before the time runs out. This can be done using ethtool, as outlined in the linked source.
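
To make that concrete, here is a hedged sketch of the commands involved. The run and tune_interrupt_rate names are made up for this example, and the values 100, 32 and 9000 are purely illustrative, not recommendations. Whether ethtool -C works at all depends on the driver; a paravirtual NIC inside a VM may well not support coalescing.

```shell
# run: execute a command, or just print it when DRY_RUN=1 is set.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# tune_interrupt_rate DEV (hypothetical helper): apply the three knobs
# described above to the interface DEV. Needs root when not in dry-run mode.
tune_interrupt_rate() {
    dev=$1
    # Show the current coalescing settings first.
    run ethtool -c "$dev"
    # Raise an RX interrupt after 100 microseconds or 32 frames
    # (illustrative values; the driver may ignore or reject them).
    run ethtool -C "$dev" rx-usecs 100 rx-frames 32
    # Jumbo frames: only if every device on the network path supports them.
    run ip link set dev "$dev" mtu 9000
}

# Preview the commands without changing anything:
#   DRY_RUN=1; tune_interrupt_rate eth1
```

Checking the output of ethtool -c before and after is the easiest way to see whether the driver accepted the new values.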

This answer is approaching the length at which people won't read it, so I will not go into much detail, but depending on your situation there are many solutions... check the source :)

Author by

Alexander Gladysh

Updated on September 18, 2022

Comments

  • Alexander Gladysh
    Alexander Gladysh almost 2 years

    I've got an Ubuntu VM, running inside Ubuntu-based Xen XCP. It hosts a custom FCGI-based HTTP service, behind nginx.

    Under load from ab the first CPU core is saturated, and the rest are under-loaded.

    In /proc/interrupts I see that CPU0 serves an order of magnitude more interrupts than any other core. Most of them come from eth1.

    Is there anything I can do to improve performance of this VM? Is there a way to balance interrupts more evenly?


    Gory details:

    $ uname -a
    Linux MYHOST 2.6.38-15-virtual #59-Ubuntu SMP Fri Apr 27 16:40:18 UTC 2012 i686 i686 i386 GNU/Linux
    
    $ lsb_release -a
    No LSB modules are available.
    Distributor ID: Ubuntu
    Description:    Ubuntu 11.04
    Release:    11.04
    Codename:   natty
    
    $ cat /proc/interrupts 
               CPU0       CPU1       CPU2       CPU3       CPU4       CPU5       CPU6       CPU7       
    283:  113720624          0          0          0          0          0          0          0   xen-dyn-event     eth1
    284:          1          0          0          0          0          0          0          0   xen-dyn-event     eth0
    285:       2254          0          0    3873799          0          0          0          0   xen-dyn-event     blkif
    286:         23          0          0          0          0          0          0          0   xen-dyn-event     hvc_console
    287:        492         42          0          0          0          0          0     295324   xen-dyn-event     xenbus
    288:          0          0          0          0          0          0          0     222294  xen-percpu-ipi       callfuncsingle7
    289:          0          0          0          0          0          0          0          0  xen-percpu-virq      debug7
    290:          0          0          0          0          0          0          0     151302  xen-percpu-ipi       callfunc7
    291:          0          0          0          0          0          0          0    3236015  xen-percpu-ipi       resched7
    292:          0          0          0          0          0          0          0      60064  xen-percpu-ipi       spinlock7
    293:          0          0          0          0          0          0          0   12355510  xen-percpu-virq      timer7
    294:          0          0          0          0          0          0     803174          0  xen-percpu-ipi       callfuncsingle6
    295:          0          0          0          0          0          0          0          0  xen-percpu-virq      debug6
    296:          0          0          0          0          0          0      60027          0  xen-percpu-ipi       callfunc6
    297:          0          0          0          0          0          0    5374762          0  xen-percpu-ipi       resched6
    298:          0          0          0          0          0          0      64976          0  xen-percpu-ipi       spinlock6
    299:          0          0          0          0          0          0   15294870          0  xen-percpu-virq      timer6
    300:          0          0          0          0          0     264441          0          0  xen-percpu-ipi       callfuncsingle5
    301:          0          0          0          0          0          0          0          0  xen-percpu-virq      debug5
    302:          0          0          0          0          0      79324          0          0  xen-percpu-ipi       callfunc5
    303:          0          0          0          0          0    3468144          0          0  xen-percpu-ipi       resched5
    304:          0          0          0          0          0      66269          0          0  xen-percpu-ipi       spinlock5
    305:          0          0          0          0          0   12778464          0          0  xen-percpu-virq      timer5
    306:          0          0          0          0     844591          0          0          0  xen-percpu-ipi       callfuncsingle4
    307:          0          0          0          0          0          0          0          0  xen-percpu-virq      debug4
    308:          0          0          0          0      75293          0          0          0  xen-percpu-ipi       callfunc4
    309:          0          0          0          0    3482146          0          0          0  xen-percpu-ipi       resched4
    310:          0          0          0          0      79312          0          0          0  xen-percpu-ipi       spinlock4
    311:          0          0          0          0   21642424          0          0          0  xen-percpu-virq      timer4
    312:          0          0          0     449141          0          0          0          0  xen-percpu-ipi       callfuncsingle3
    313:          0          0          0          0          0          0          0          0  xen-percpu-virq      debug3
    314:          0          0          0      95405          0          0          0          0  xen-percpu-ipi       callfunc3
    315:          0          0          0    3802992          0          0          0          0  xen-percpu-ipi       resched3
    316:          0          0          0      76607          0          0          0          0  xen-percpu-ipi       spinlock3
    317:          0          0          0   16439729          0          0          0          0  xen-percpu-virq      timer3
    318:          0          0     876383          0          0          0          0          0  xen-percpu-ipi       callfuncsingle2
    319:          0          0          0          0          0          0          0          0  xen-percpu-virq      debug2
    320:          0          0      76416          0          0          0          0          0  xen-percpu-ipi       callfunc2
    321:          0          0    3422476          0          0          0          0          0  xen-percpu-ipi       resched2
    322:          0          0      69217          0          0          0          0          0  xen-percpu-ipi       spinlock2
    323:          0          0   10247182          0          0          0          0          0  xen-percpu-virq      timer2
    324:          0     393514          0          0          0          0          0          0  xen-percpu-ipi       callfuncsingle1
    325:          0          0          0          0          0          0          0          0  xen-percpu-virq      debug1
    326:          0      95773          0          0          0          0          0          0  xen-percpu-ipi       callfunc1
    327:          0    3551629          0          0          0          0          0          0  xen-percpu-ipi       resched1
    328:          0      77823          0          0          0          0          0          0  xen-percpu-ipi       spinlock1
    329:          0   13784021          0          0          0          0          0          0  xen-percpu-virq      timer1
    330:     730435          0          0          0          0          0          0          0  xen-percpu-ipi       callfuncsingle0
    331:          0          0          0          0          0          0          0          0  xen-percpu-virq      debug0
    332:      39649          0          0          0          0          0          0          0  xen-percpu-ipi       callfunc0
    333:    3607120          0          0          0          0          0          0          0  xen-percpu-ipi       resched0
    334:     348740          0          0          0          0          0          0          0  xen-percpu-ipi       spinlock0
    335:   89912004          0          0          0          0          0          0          0  xen-percpu-virq      timer0
    NMI:          0          0          0          0          0          0          0          0   Non-maskable interrupts
    LOC:          0          0          0          0          0          0          0          0   Local timer interrupts
    SPU:          0          0          0          0          0          0          0          0   Spurious interrupts
    PMI:          0          0          0          0          0          0          0          0   Performance monitoring interrupts
    IWI:          0          0          0          0          0          0          0          0   IRQ work interrupts
    RES:    3607120    3551629    3422476    3802992    3482146    3468144    5374762    3236015   Rescheduling interrupts
    CAL:     770084     489287     952799     544546     919884     343765     863201     373596   Function call interrupts
    TLB:          0          0          0          0          0          0          0          0   TLB shootdowns
    TRM:          0          0          0          0          0          0          0          0   Thermal event interrupts
    THR:          0          0          0          0          0          0          0          0   Threshold APIC interrupts
    MCE:          0          0          0          0          0          0          0          0   Machine check exceptions
    MCP:          0          0          0          0          0          0          0          0   Machine check polls
    ERR:          0
    MIS:          0
    
    • Admin
      Admin over 11 years
      Bonus question: is there a way to lessen the number of interrupts from eth1?
    • Admin
      Admin over 2 years
      Yes. You can reduce how often the interrupt is raised in three ways: increase the MTU size (jumbo frames), if the network allows for it; raise the interrupt only after a larger number of packets has been received instead of on every packet; or change the timeout, so that the interrupt is raised after a certain amount of time. Be careful with the time option, because your buffer might fill up before the time runs out. This can be done using ethtool.
  • Alexander Gladysh
    Alexander Gladysh over 11 years
    irqbalance already is running. Maybe it is not configured correctly? How to check that?
  • chutz
    chutz over 11 years
    Maybe you should just disable irqbalance, reboot, see if that helps. The interrupts are pretty well balanced out by default.
  • Alexander Gladysh
    Alexander Gladysh over 11 years
    FYI: /proc/irq/283/smp_affinity has 01 in it now (nobody changed that stuff on this machine to the best of my knowledge — so this must be system default).
  • chutz
    chutz over 11 years
    Sorry, I updated my answer. irqbalance is probably the culprit. Just get rid of it. I don't know what the default is supposed to be, but from experience I have seen it default to "ALL CPUs".
  • Alexander Gladysh
    Alexander Gladysh over 11 years
    Disabling irqbalance (via ENABLED=0 in /etc/default/irqbalance) does not help. After reboot irqbalance is stop/waiting, but /proc/irq/283/smp_affinity is still 01.
  • Alexander Gladysh
    Alexander Gladysh over 11 years
    echo $((2**8-1)) | sudo tee /proc/irq/283/smp_affinity gives me Input/output error (irqbalance is stopped).
  • Alexander Gladysh
    Alexander Gladysh over 11 years
    BTW, I see /proc/irq/283/affinity_hint, but can't find proper docs for it. Could this be related?
  • Alexander Gladysh
    Alexander Gladysh over 11 years
    Nevermind that — I tried it on wrong machine with 4 cores instead of 8.
  • Alexander Gladysh
    Alexander Gladysh over 11 years
    Erm. No, still can't change that. Can't figure out what is going on. Will ask a separate question for that.
  • Alexander Gladysh
    Alexander Gladysh over 11 years
  • chutz
    chutz over 11 years
    Ah, smp_affinity should be in hex. Or you can just write to smp_affinity_list - it is a more readable format (comma separated list of CPU ranges). E.g. 0,3,5-7
  • Alexander Gladysh
    Alexander Gladysh over 11 years
    I've got an impression that smp_affinity_list is not supported in 2.6. Also, I did try that in hex, actually... Anyway, moving this to a question I linked to — will try tomorrow.