My initial understanding was that with LRENPIE set, any EOI from the guest for an IRQ that is not in any LR results in a maintenance interrupt, and the hypervisor can then check which IRQ was deactivated.
Upon taking a closer look, though, I can't understand how this can be done. Since such an event causes an asynchronous exception and not a data abort (like a regular trap), you cannot use the ESR to get the register number that the guest used for the EOI/DIR. The manual only mentions EOIcount, but that's just a counter of how many EOIs occurred without a relevant LR.
So my question is, how can you tell which IRQ the guest EOIed, by using the LRENPIE maintenance interrupt and without trapping all EOI guest accesses?
Thanks for the answer Martin, so just to clarify, let's take the following example.
Let's assume that the guest manually sets a whole line of IRQs to active (32 in total, via GICD_ISACTIVER), all with the same priority. In the best case we have 16 free LRs and place half of the IRQs there, while the other half are tracked in memory. If the guest EOIs one of the 16 IRQs in the LRs, that's perfectly fine. But what if it EOIs one of the others? To my understanding, the LRENPIE maintenance IRQ will then trigger and EOIcount will be incremented, but EOIcount tells me nothing about which IRQ was EOIed.
Or do we assume that this is not an example where the VM behaved legally?
The relevant part of the spec is:
For each CPU interface, the GIC architecture requires the order of the valid writes to ICC_EOIR0_EL1 and ICC_EOIR1_EL1 to be the exact reverse of the order of the reads from ICC_IAR0_EL1 and ICC_IAR1_EL1
And later in the spec:
A write to this register must correspond to the most recent valid read by this PE from an Interrupt Acknowledge Register, and must correspond to the INTID that was read from ICC_IAR0_EL1, otherwise the system behavior is UNPREDICTABLE.
So in your scenario, the Guest's actions aren't legal. The writes to ICC_EOIRn need to match up with the reads *on that PE* of ICC_IARn.
The case where you might do something like this is where you're saving off GIC state and then later restoring it. But then you'd need to keep the system level GIC and attached CPU IFs in sync. So if I restored the system level GIC's active state regs, I'd also need to restore the CPU IFs' state to match.
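To make the ordering rule concrete, here is a minimal sketch that models the acknowledged interrupts on one PE as a stack. It is only an illustration: the class and method names are made up for this post, and returning False for an out-of-order write is a modelling choice (the spec simply calls the behavior UNPREDICTABLE).

```python
class AckStackModel:
    """Toy model of the IAR/EOIR ordering rule on a single PE.

    Hypothetical helper, not a GIC or hypervisor API.
    """

    def __init__(self):
        self._acked = []  # INTIDs in the order they were read from IAR

    def read_iar(self, intid):
        # A valid IAR read pushes the INTID onto the stack.
        self._acked.append(intid)

    def write_eoir(self, intid):
        # A legal EOIR write names the most recently acknowledged INTID.
        if not self._acked or self._acked[-1] != intid:
            return False  # out of order: UNPREDICTABLE per the spec
        self._acked.pop()
        return True


pe = AckStackModel()
pe.read_iar(40)
pe.read_iar(41)
out_of_order = pe.write_eoir(40)  # 41 was acknowledged last, so this is illegal
in_order = pe.write_eoir(41)
then_first = pe.write_eoir(40)
```

Only the reverse-of-acknowledge order (41, then 40) is accepted by this model.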
Fair enough about the IAR/EOIR ordering, although I think in some targets at least, EOIR without IAR first works (but I guess it doesn't matter, spec says unpredictable so it cannot be relied upon anyway).
I still think though that I am missing something regarding EOIcount, because the same example can happen with a normal IRQ life cycle.
So 32 IRQs of the same priority trigger at the same time, we put 16 into LRs, the rest pending in memory. Guest sees and ACKs all 16 of them (no EOIs yet), now IRQs are in active state and we can unlist them from the LRs (NPIE maintenance IRQ) and store them as active in memory. Next we load the other 16 IRQs (pending) into the LRs where the guest can ACK all of them. According to the rules above the guest now has to start ending the interrupts in the opposite order. With the underflow maintenance IRQ we know when all these IRQs are finalized and can load into LRs the previous batch that remained active in memory.
In the above example, when does EOIcount come into play at all? Do you have a simple example for LRENPIE/EOIcount (as far as I know this maintenance interrupt is not used at all in KVM)? What if the guest EOIs in a different order, do we just ignore the wrongly ordered EOIs? Does that mean that the hypervisor is expected to track not only the state of virtual IRQs but also the order in which they were injected (and I am not talking about priorities here, let's just assume that all IRQs have the same priority)?
Thank you again for the detailed responses, and sorry for my many questions. I am going back and forth through the spec, yet I still feel that I have some crucial misunderstanding about EOIcount and how it can hint to the hypervisor which unlisted IRQ was EOIed.
The thing to remember is that interrupt priority restricts what order the interrupts can be taken in. To be ack’ed an interrupt must be higher priority than the current Running Priority, and by ack’ing the interrupt the Running Priority increases. Therefore, a high priority interrupt can pre-empt a low priority interrupt, but not the other way round.
Note: In GIC lower numeric values are higher priority.
Let’s have an example for a PE with 2 List Registers (2 only because it’ll be quicker to write out).
The hypervisor has two virtual interrupts to present:
It sets up two List Registers, enters the guest, the guest reads the IAR and gets.. B. Because generally the GIC will give you the higher priority interrupt if it has a choice.
Now this means we won’t see your scenario. When we ack’ed B, the Running Priority became 0x7. Interrupt A is lower priority (numerically higher) and therefore cannot pre-empt. It won’t be signalled, or returned by a read of IAR, until software does the priority drop/write to EOIR.
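The pre-emption rule can be written down as a tiny sketch. Assumptions: priority grouping (binary point) and the priority mask are ignored, the idle running priority is taken to be 0xFF, and the priority used for Int A (0x8) is a made-up value that is simply lower priority than B's 0x7.

```python
IDLE_PRIORITY = 0xFF  # assumed running priority when nothing is active


def can_preempt(candidate_priority, running_priority):
    # In the GIC, a lower numeric value means a higher priority.
    return candidate_priority < running_priority


def acknowledge(candidate_priority, running_priority):
    # Acking an interrupt raises the running priority to that interrupt's priority.
    if not can_preempt(candidate_priority, running_priority):
        raise ValueError("interrupt cannot be signalled at this running priority")
    return candidate_priority


PRIO_B = 0x7  # from the example above
PRIO_A = 0x8  # hypothetical value, lower priority than B

running = acknowledge(PRIO_B, IDLE_PRIORITY)
a_can_be_taken = can_preempt(PRIO_A, running)
```

After B is acknowledged, `a_can_be_taken` is False, which is why A is not returned by IAR until the priority drop for B.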
To get the interesting scenario the interrupts have to arrive one at a time.
So, at this point there are four active virtual interrupts, two in the List Registers (C & D) and two spilled (A & B). There are also four active virtual priorities, giving an overall virtual running priority of 0x5.
Now the Guest will start writing to EOIR. We’ll assume the guest follows the correct order and EOICount is initially 0.
The hypervisor wants to get a picture of the overall status of the virtual GIC. It knows which interrupts it presented (A, B, C and D) and what the priorities of those interrupts were. Because it knows the priorities, it knows the possible orders the interrupts could have nested. It also knows if these were ack’ed by the guest (either from when it spilled the List Registers or by looking at the List Registers now).
Question: Which, if any, interrupts are Active? And how does the Hypervisor know this?
For C and D, this is easy. The Hypervisor just needs to look at the List Registers, both of which have been set to Inactive by the guest’s writes to EOIR.
The hypervisor also sees that the EOICount is 1 – so one of the spilled interrupts has also been EOIR’ed. But which one?
That’s where priority comes in. The answer has to be that Int B was EOIR’ed, leaving Int A as the only still Active interrupt. It has to be this way because of the priorities of Int A and B. Int B could have pre-empted Int A, but not the other way round. So if the Guest is sticking to the rules, it must have Ack’ed A before B, and must therefore have EOIR’d B before A.
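The same reasoning can be sketched in code. This assumes a well-behaved guest and spilled interrupts with distinct priorities; the names (`SpilledIrq`, `resolve_spilled_eois`) and the priority given to Int A are made up for illustration and are not taken from KVM or any other hypervisor.

```python
from typing import NamedTuple


class SpilledIrq(NamedTuple):
    name: str
    priority: int  # lower numeric value = higher priority


def resolve_spilled_eois(spilled_active, eoi_count):
    """Split spilled active interrupts into (retired, still_active).

    A higher priority interrupt must have been acked later, so it must
    be EOIed earlier. EOIcount therefore retires the spilled interrupts
    from the highest priority downwards.
    """
    if eoi_count > len(spilled_active):
        raise ValueError("EOIcount exceeds the number of spilled active interrupts")
    ordered = sorted(spilled_active, key=lambda irq: irq.priority)
    return ordered[:eoi_count], ordered[eoi_count:]


int_a = SpilledIrq("A", 0x8)  # hypothetical priority, lower than B
int_b = SpilledIrq("B", 0x7)

retired, still_active = resolve_spilled_eois([int_a, int_b], eoi_count=1)
```

With EOICount at 1, `retired` holds Int B and `still_active` holds Int A, matching the conclusion above. If two spilled interrupts shared a priority, the order between them would not be recoverable this way.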
Thanks Martin, that was very detailed and clear.