How does the Linux kernel handle shared IRQs?

linux kernel pci interrupt irq

24,873

Solution 1

This is covered in chapter 10 of Linux Device Drivers, 3rd edition, by Corbet et al. It is available for free online, or you may toss some shekels O'Reilly's way for dead tree or ebook forms. The part relevant to your question begins on page 278 in the first link.

For what it's worth, here is my attempt to paraphrase those three pages, plus other bits I've Googled up:

When you register a shared IRQ handler, the kernel checks that either:

a. no other handler exists for that interrupt, or

b. all of those previously registered also requested interrupt sharing

If either case applies, it then checks that your dev_id parameter is unique, so that the kernel can differentiate the multiple handlers, e.g. during handler removal.

When a PCI¹ hardware device raises the IRQ line, the kernel's low-level interrupt handler is called, and it in turn calls all of the registered interrupt handlers, passing each back the dev_id you used to register the handler via request_irq().

The dev_id value needs to be machine-unique. The common way to do that is to pass a pointer to the per-device struct your driver uses to manage that device. Since this pointer must be within your driver's memory space for it to be useful to the driver, it is ipso facto unique to that driver.²
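The registration rules described above can be sketched as a small userspace model. This is not the kernel's code: every `sketch_` name, the `SKETCH_SHARED` flag (standing in for the kernel's sharing flag), and the fixed-size table are assumptions made for illustration. It only mirrors the checks as paraphrased here: a line may be joined only if it is empty or everyone on it agreed to share, and dev_id must be distinct so a handler can be found again at removal time.

```c
#include <stddef.h>

#define SKETCH_MAX_ACTIONS 8
#define SKETCH_SHARED 0x1UL   /* hypothetical stand-in for the sharing flag */

typedef int (*sketch_handler_t)(int irq, void *dev_id);

/* One registered handler on an interrupt line. */
struct sketch_action {
    sketch_handler_t handler;
    unsigned long flags;
    void *dev_id;
};

/* All handlers registered for a single interrupt line. */
struct sketch_irq_line {
    struct sketch_action actions[SKETCH_MAX_ACTIONS];
    size_t count;
};

/* Placeholder handler so the registration functions can be exercised. */
int sketch_noop_handler(int irq, void *dev_id)
{
    (void)irq;
    (void)dev_id;
    return 0;
}

/* Returns 0 on success, -1 if the request has to be refused. */
int sketch_request_irq(struct sketch_irq_line *line, sketch_handler_t handler,
                       unsigned long flags, void *dev_id)
{
    size_t i;

    if (handler == NULL || line->count == SKETCH_MAX_ACTIONS)
        return -1;
    /* A shared request needs a dev_id to identify its handler later. */
    if ((flags & SKETCH_SHARED) && dev_id == NULL)
        return -1;

    for (i = 0; i < line->count; i++) {
        const struct sketch_action *old = &line->actions[i];

        /* Every party on the line, old and new, must have asked to share. */
        if (!(old->flags & SKETCH_SHARED) || !(flags & SKETCH_SHARED))
            return -1;
        /* dev_id must be unique among the handlers on this line. */
        if (old->dev_id == dev_id)
            return -1;
    }

    line->actions[line->count].handler = handler;
    line->actions[line->count].flags = flags;
    line->actions[line->count].dev_id = dev_id;
    line->count++;
    return 0;
}

/* Removes the handler registered with dev_id; returns 0 if it was found. */
int sketch_free_irq(struct sketch_irq_line *line, void *dev_id)
{
    size_t i, j;

    for (i = 0; i < line->count; i++) {
        if (line->actions[i].dev_id == dev_id) {
            for (j = i + 1; j < line->count; j++)
                line->actions[j - 1] = line->actions[j];
            line->count--;
            return 0;
        }
    }
    return -1;
}
```

In this model, passing the address of a per-device struct as dev_id is what lets `sketch_free_irq()` pick out the right entry when several drivers sit on the same line.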

If there are multiple drivers registered for a given interrupt, they will all be called when any of the devices raises that shared interrupt line. Each handler is always passed its own dev_id, so the dev_id value cannot tell you whether it was your device that raised the interrupt. Your handler has to ask its hardware, and if its device did not raise the interrupt, it must return IRQ_NONE immediately.

Another case is that your driver is managing multiple devices. The driver's interrupt handler will get one of the dev_id values known to the driver. Your code is supposed to poll each device to find out which one raised the interrupt.

The example Corbet et al. give is that of a PC parallel port. When it asserts the interrupt line, it also sets the top bit in its first device register. (That is, (inb(0x378) & 0x80) != 0, assuming standard I/O port numbering. The parentheses matter: in C, == and != bind more tightly than &.) When your handler detects this, it is supposed to do its work, then clear the IRQ by writing the value read from the I/O port back to the port with the top bit cleared.
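A minimal userspace sketch of that check-then-clear pattern follows. The port is simulated by a struct field rather than real I/O, and `sketch_port`, `sketch_inb()`, `sketch_outb()`, and the `SKETCH_IRQ_*` values are hypothetical stand-ins for the real port accessors and the kernel's IRQ_NONE / IRQ_HANDLED return values.

```c
#include <stdint.h>

enum sketch_irqreturn { SKETCH_IRQ_NONE = 0, SKETCH_IRQ_HANDLED = 1 };

/* Simulated device: one 8-bit register plus a counter of serviced interrupts. */
struct sketch_port {
    uint8_t data_reg;        /* top bit set means this device raised the IRQ */
    unsigned int serviced;
};

/* Stand-ins for inb()/outb(); they touch the simulated register only. */
static uint8_t sketch_inb(const struct sketch_port *port)
{
    return port->data_reg;
}

static void sketch_outb(uint8_t value, struct sketch_port *port)
{
    port->data_reg = value;
}

enum sketch_irqreturn sketch_port_interrupt(int irq, void *dev_id)
{
    struct sketch_port *port = dev_id;   /* our own per-device struct */
    uint8_t value = sketch_inb(port);

    (void)irq;

    /* Top bit clear: some other device on the line raised the interrupt. */
    if (!(value & 0x80))
        return SKETCH_IRQ_NONE;

    port->serviced++;                    /* the driver's real work goes here */

    /* Clear the interrupt: write the value back with the top bit cleared. */
    sketch_outb(value & 0x7F, port);
    return SKETCH_IRQ_HANDLED;
}
```

Calling the handler a second time without the device setting the bit again returns `SKETCH_IRQ_NONE`, which is the "not mine" answer a well-behaved shared handler gives.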

I don't see any reason that particular mechanism is special. A different hardware device could choose a different mechanism. The only important thing is that for a device to allow shared interrupts, it has to have some way for the driver to read the interrupt status of the device, and some way to clear the interrupt. You'll have to read your device's datasheet or programming manual to find out what mechanism your particular device uses.

When your interrupt handler tells the kernel it handled the interrupt, that doesn't stop the kernel from continuing to call any other handlers registered for that same interrupt. This is unavoidable if you are to share an interrupt line when using level-triggered interrupts.

Imagine two devices assert the same interrupt line at the same time. (Or at least, so close in time that the kernel doesn't have time to call an interrupt handler to clear the line and thereby see the second assertion as separate.) The kernel must call all handlers for that interrupt line, to give each a chance to query its associated hardware to see if it needs attention. It is quite possible for two different drivers to successfully handle an interrupt within the same pass through the handler list for a given interrupt.
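To make the simultaneous-assertion case concrete, here is a small userspace model of one pass through the handler list. It is a sketch under stated assumptions, not the kernel's dispatch code: `sketch_device`, `sketch_action`, `sketch_dispatch()`, and the `pending` flag (standing in for a device's interrupt status bit) are all invented for illustration.

```c
#include <stddef.h>

enum sketch_irqreturn { SKETCH_IRQ_NONE = 0, SKETCH_IRQ_HANDLED = 1 };

/* Simulated device; 'pending' plays the role of its interrupt status bit. */
struct sketch_device {
    int pending;
    unsigned int serviced;
};

typedef enum sketch_irqreturn (*sketch_handler_t)(int irq, void *dev_id);

/* One entry in the handler list of a shared line. */
struct sketch_action {
    sketch_handler_t handler;
    void *dev_id;
};

/* Per-device handler: claim the interrupt only if our device raised it. */
enum sketch_irqreturn sketch_device_interrupt(int irq, void *dev_id)
{
    struct sketch_device *dev = dev_id;

    (void)irq;
    if (!dev->pending)
        return SKETCH_IRQ_NONE;

    dev->pending = 0;    /* tell the device to drop its assertion */
    dev->serviced++;
    return SKETCH_IRQ_HANDLED;
}

/*
 * One pass over the line: every handler is called with its own dev_id,
 * and the loop does not stop at the first one that reports "handled".
 * Returns how many handlers claimed the interrupt.
 */
unsigned int sketch_dispatch(int irq, const struct sketch_action *actions,
                             size_t count)
{
    unsigned int handled = 0;
    size_t i;

    for (i = 0; i < count; i++) {
        if (actions[i].handler(irq, actions[i].dev_id) == SKETCH_IRQ_HANDLED)
            handled++;
    }
    return handled;
}
```

With three devices on the line and two of them pending, a single call to `sketch_dispatch()` services both and returns 2; a second call finds nothing pending and returns 0.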

Because of this, it is imperative that your driver tell the device it is managing to clear its interrupt assertion sometime before the interrupt handler returns. It's not clear to me what happens otherwise. The continuously-asserted interrupt line will either result in the kernel continuously calling the shared interrupt handlers, or it will mask the kernel's ability to see new interrupts so the handlers are never called. Either way, disaster.

Footnotes:

¹ I specified PCI above because all of the above assumes level-triggered interrupts, as used in the original PCI spec. ISA used edge-triggered interrupts, which made sharing tricky at best, and possible even then only when supported by the hardware. PCIe uses message-signalled interrupts; the interrupt message contains a unique value the kernel can use to avoid the round-robin guessing game required with PCI interrupt sharing. PCIe may eliminate the very need for interrupt sharing. (I don't know if it actually does, just that it has the potential to.)

² Linux kernel drivers all share the same memory space, but an unrelated driver isn't supposed to be mucking around in another's memory space. Unless you pass that pointer around, you can be pretty sure another driver isn't going to come up with that same value accidentally on its own.

Solution 2

When a driver requests a shared IRQ, it passes the kernel a pointer to a device-specific structure within the driver's memory space.

According to LDD3:

Whenever two or more drivers are sharing an interrupt line and the hardware interrupts the processor on that line, the kernel invokes every handler registered for that interrupt, passing each its own dev_id.

Upon checking several drivers' IRQ handlers, it appears that each one probes its own hardware to determine whether it should handle the interrupt or return IRQ_NONE.

Examples

UHCI-HCD Driver

  status = inw(uhci->io_addr + USBSTS);
  if (!(status & ~USBSTS_HCH))  /* shared interrupt, not mine */
    return IRQ_NONE;

In the code above, the driver is reading the USBSTS register to determine if there is an interrupt to service.

SDHCI Driver

  intmask = sdhci_readl(host, SDHCI_INT_STATUS);

  if (!intmask || intmask == 0xffffffff) {
    result = IRQ_NONE;
    goto out;
  }

Just as in the previous example, the driver is checking a status register, SDHCI_INT_STATUS, to determine whether it needs to service an interrupt.

Ath5k Driver

  struct ath5k_softc *sc = dev_id;
  struct ath5k_hw *ah = sc->ah;
  enum ath5k_int status;
  unsigned int counter = 1000;

  if (unlikely(test_bit(ATH_STAT_INVALID, sc->status) ||
        !ath5k_hw_is_intr_pending(ah)))
    return IRQ_NONE;

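Just one more example: here the handler returns IRQ_NONE unless ath5k_hw_is_intr_pending() reports that the hardware has an interrupt pending.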


bsirang

Updated on September 18, 2022

Comments

bsirang over 1 year
According to what I've read so far, "when the kernel receives an interrupt, all the registered handlers are invoked."

I understand that the registered handlers for each IRQ can be viewed via /proc/interrupts, and I also understand that the registered handlers come from the drivers that have invoked request_irq passing in a callback roughly of the form:
```
irqreturn_t (*handler)(int, void *)
```
Based on what I know, each of these interrupt handler callbacks associated with the particular IRQ should be invoked, and it is up to the handler to determine whether the interrupt should indeed be handled by it. If the handler should not handle the particular interrupt it must return the kernel macro IRQ_NONE.

What I am having trouble understanding is how each driver is expected to determine whether it should handle the interrupt or not. I suppose each driver could keep track internally of whether it is expecting an interrupt. If so, I don't know how they'd be able to deal with the situation in which multiple drivers behind the same IRQ are expecting an interrupt.

The reason I'm trying to understand these details is because I'm messing with the kexec mechanism to re-execute the kernel in the middle of system operation while playing with the reset pins and various registers on a PCIe bridge as well as a downstream PCI device. And in doing so, after a reboot I'm either getting kernel panics, or other drivers complaining that they're receiving interrupts even though no operation was taking place.

How the handler decides that the interrupt should be handled by it is the mystery.

Edit: In case it's relevant, the CPU architecture in question is x86.
Ciro Santilli Путлер Капут 六四事 over 8 years

stackoverflow.com/questions/14371513/for-a-shared-interrupt-line-how-do-i-find-which-interrupt-handler-to-usec
bsirang over 11 years

Like you mentioned, the interrupt handler may be passed a dev_id that it does not own. To me it seems there is a non-zero chance that a driver that does not own the dev_id structure may still mistake it as its own based on how it interprets the contents. If this is not the case then what mechanism would prevent this?
Warren Young over 11 years

You prevent it by making dev_id a pointer to something within your driver's memory space. Another driver could make up a dev_id value that happened to be confusable with a pointer to memory your driver owns, but that's not going to happen because everyone is playing by the rules. This is kernel-space, remember: self-discipline is assumed as a matter of course, unlike with user-space code, which may blithely assume that anything not forbidden is allowed.
bsirang over 11 years

According to chapter ten of LDD3: "Whenever two or more drivers are sharing an interrupt line and the hardware interrupts the processor on that line, the kernel invokes every handler registered for that interrupt, passing each its own dev_id" It seems like the previous understanding was incorrect regarding whether an interrupt handler may be passed in a dev_id that it does not own.
Warren Young over 11 years

That was a mis-read on my part. When I wrote that, I was conflating two concepts. I've edited my answer. The condition that requires your interrupt handler to return quickly is that it gets called due to an interrupt assertion by a device it is not managing. The value of dev_id doesn't help you determine whether this has happened. You have to ask the hardware, "You rang?"
bsirang over 11 years

Yes, now I need to figure out how what I'm tinkering with is actually causing other drivers to believe their devices "rang" after a restart of the kernel via kexec.
user3405291 almost 7 years

Your link's content is not available