Hi all,
I've seen two different ways of handling interrupts in several implementations:
(i) Using a loop that handles several IRQs until IAR returns the ID of a special/spurious IRQ.
(ii) Handling them one by one, with each IRQ performing a kernel/hypervisor exit.
Q1: I would like to get your insight on what the recommended approach is for a software implementation running on ARMv8 with GICv3.
Today we are using option (ii) and we are getting some spurious IRQs; we don't actually know what the root cause is. We would like to know whether the spurious IRQs are legitimate or whether we are doing something wrong.
Q2: In our setup, we use only SGIs and PPIs. Do you know a case where these two types of IRQs can trigger spurious IRQs?
I see in the spec the following:
Q3: For (1), I understand this is the case where a Group 0 IRQ has higher priority and software tries to acknowledge it using ICC_IAR1_EL1, is that right?
(1) is not our case, since we only deal with Group 1 IRQs. What about (2): if the TZ Secure world is using IRQs on the Secure side, does this have any side effect on the Non-secure side, even when Group 0 IRQs are disabled?
Q4: Can you suggest a good way to debug the root cause of spurious IRQs?
Thanks,
Jorge
Jorge said: Q1: I would like to get your insight on what the recommended approach is for a software implementation running on ARMv8 with GICv3. Today we are using option (ii) and we are getting some spurious IRQs; we don't actually know what the root cause is. We would like to know whether the spurious IRQs are legitimate or whether we are doing something wrong.
I don't think there is a single recommended approach; both of the approaches you listed work. In part it comes down to what your interrupts look like: how often will you have multiple interrupts pending that you can consume by looping? If it's rare, re-reading ICC_IARx_EL1 probably won't win you much, and the extra instructions in the loop would just be overhead.
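To make the trade-off concrete, here is a minimal sketch of the two handler shapes in C. It is not taken from any particular kernel. The `gic_cpu_ops` hooks, the `fake_gic` queue and all function names are hypothetical: on real hardware the hooks would wrap `mrs`/`msr` accesses to ICC_IAR1_EL1 and ICC_EOIR1_EL1, and the sketch assumes EOImode == 0, so a single EOIR write both drops priority and deactivates. The fake queue only exists so the logic can be exercised on a host machine.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define GIC_INTID_MASK 0x00FFFFFFu /* INTID field of ICC_IAR1_EL1 */

/* Hypothetical hooks; on hardware these would wrap system register accesses. */
struct gic_cpu_ops {
    uint32_t (*read_iar1)(void *ctx);             /* acknowledge */
    void (*write_eoir1)(void *ctx, uint32_t iar); /* end of interrupt */
    void (*dispatch)(void *ctx, uint32_t intid);  /* per-interrupt handler */
};

/* INTIDs 1020-1023 are special; 1023 is the spurious ID. */
static int gic_intid_is_special(uint32_t intid)
{
    return intid >= 1020u && intid <= 1023u;
}

/* Option (i): keep acknowledging until a special INTID comes back. */
static unsigned gic_drain_pending(const struct gic_cpu_ops *ops, void *ctx)
{
    unsigned handled = 0;

    for (;;) {
        uint32_t iar = ops->read_iar1(ctx);
        uint32_t intid = iar & GIC_INTID_MASK;

        if (gic_intid_is_special(intid))
            break; /* nothing left for this group/state */
        ops->dispatch(ctx, intid);
        ops->write_eoir1(ctx, iar);
        handled++;
    }
    return handled;
}

/* Option (ii): acknowledge at most one interrupt per exception entry. */
static unsigned gic_handle_one(const struct gic_cpu_ops *ops, void *ctx)
{
    uint32_t iar = ops->read_iar1(ctx);
    uint32_t intid = iar & GIC_INTID_MASK;

    if (gic_intid_is_special(intid))
        return 0; /* spurious: return without writing EOIR */
    ops->dispatch(ctx, intid);
    ops->write_eoir1(ctx, iar);
    return 1;
}

/* Host-side fake: returns queued INTIDs, then 1023 once the queue is empty. */
#define FAKE_GIC_MAX 8
struct fake_gic {
    uint32_t queue[FAKE_GIC_MAX];
    size_t count;
    size_t next;
    unsigned dispatched;
    unsigned eois;
};

static uint32_t fake_read_iar1(void *ctx)
{
    struct fake_gic *g = ctx;

    assert(g->count <= FAKE_GIC_MAX);
    if (g->next >= g->count)
        return 1023u;
    return g->queue[g->next++];
}

static void fake_write_eoir1(void *ctx, uint32_t iar)
{
    struct fake_gic *g = ctx;

    (void)iar;
    g->eois++;
}

static void fake_dispatch(void *ctx, uint32_t intid)
{
    struct fake_gic *g = ctx;

    (void)intid;
    g->dispatched++;
}

static const struct gic_cpu_ops fake_ops = {
    fake_read_iar1, fake_write_eoir1, fake_dispatch
};
```

Note that with option (i) the final read of every pass returns a special INTID by design, so any spurious counter should exclude that terminating read; otherwise the loop itself inflates the numbers you are trying to debug.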
Jorge said: Q2: In our setup, we use only SGIs and PPIs. Do you know a case where these two types of IRQs can trigger spurious IRQs?
There are legitimate reasons why you might see spurious returned, but it's probably worth looking at as I'd expect it to be rare. Also, the reasons aren't really specific to SGIs or PPIs.
Examples:
But again, I'd expect these circumstances to be relatively rare in typical usage.
Jorge said: Q3: For (1), I understand this is the case where a Group 0 IRQ has higher priority and software tries to acknowledge it using ICC_IAR1_EL1, is that right?
A G0 interrupt would generate an FIQ, not an IRQ (assuming you're not in legacy mode). But otherwise, yes.
Another example could be that the highest priority pending interrupt (HPPI) belongs to the "other" world. For example, the HPPI is a S.G1 interrupt. You try to read ICC_IAR1_EL1 from Non-secure state - you'd get spurious.
The way the IRQ/FIQ signals are used in GICv3 (non-legacy) means that you typically
Jorge said: Q4: Can you suggest a good way to debug the root cause of spurious IRQs?
Some things I have done in the past:
On entry to the IRQ handler - before ICC_IARx - read the ISPEND and ISACTIVE registers. If ICC_IARx returns spurious, re-read the ISPEND and ISACTIVE registers, seeing if anything changed. This doesn't solve all the race conditions, but it can highlight some problems. (You'd only need to check the GICR registers, not GICD, given you're using PPIs and SGIs)
In EL1 (or whichever EL you're routing the interrupts to), set the PSTATE.I/F bits and then go into WFI. The core will wake on the IRQ/FIQ arriving, but won't take an exception due to the masks. Immediately after the WFI, read ISR_EL1 and ICC_HPPIRx_EL1, then ack the interrupt. Keep repeating this process until you see spurious.
With both of the approaches above, what I'm interested in is which interrupts trigger an exception but then "go away" again. Is it always the same one? Is it only when I reach a certain rate of interrupts?
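For the first technique (snapshotting the pending/active state around the acknowledge), the comparison step could look roughly like the sketch below. It is only an illustration: the struct and function names are made up, and it assumes you have already read GICR_ISPENDR0 and GICR_ISACTIVER0 (bits 0-15 are SGIs, bits 16-31 are PPIs) into plain 32-bit values before the ICC_IARx read and again after it returned spurious. As noted above, this cannot catch every race.

```c
#include <stdint.h>

/* Snapshot of the SGI/PPI state of one Redistributor (hypothetical type). */
struct sgi_ppi_state {
    uint32_t pending; /* value read from GICR_ISPENDR0 */
    uint32_t active;  /* value read from GICR_ISACTIVER0 */
};

/* What changed between the snapshot before IAR and the one after it. */
struct spurious_report {
    uint32_t vanished;       /* pending before; neither pending nor active after */
    uint32_t newly_pending;  /* became pending between the two snapshots */
    uint32_t active_changed; /* active bits that toggled */
};

static struct spurious_report spurious_diff(struct sgi_ppi_state before,
                                            struct sgi_ppi_state after)
{
    struct spurious_report r;

    r.vanished = before.pending & ~after.pending & ~after.active;
    r.newly_pending = ~before.pending & after.pending;
    r.active_changed = before.active ^ after.active;
    return r;
}

/* Lowest INTID set in a 32-bit SGI/PPI mask, or -1 if the mask is empty. */
static int lowest_intid(uint32_t mask)
{
    for (int i = 0; i < 32; i++) {
        if (mask & (UINT32_C(1) << i))
            return i;
    }
    return -1;
}
```

Logging `vanished` for each spurious acknowledge is one way to answer the "is it always the same one?" question: an INTID that keeps showing up there is an interrupt that was pending on entry but was withdrawn before it could be acknowledged.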