Hello ARM experts,
I am working on a hypervisor running at EL2 on the Rockchip RK3588 platform (Cortex-A55 + Cortex-A76). In this setup, I use the PMU from EL2 to count how many times the guest OS (running at EL0/EL1) enters IRQ exceptions. The event being monitored is the architectural IRQ exception taken event.
During testing, I observed a significant difference in PMU behavior between the A55 and A76 clusters.
Observed behavior
• Cortex-A55: The PMU IRQ-exception count matches the actual number of IRQ exceptions triggered in the guest.
• Cortex-A76: The PMU IRQ-exception count is consistently higher than the actual number of guest IRQs. In my test environment, the hypervisor at EL2 only receives periodic EL2 timer interrupts, and there are no other EL2-level interrupts.
Interestingly, on the A76 the following two quantities are both approximately equal to the number of EL2 timer interrupts:
• The initial excess count (before enabling EL2 exception counting)
• The extra count added after enabling EL2 exception counting
To investigate further, I enabled PMU counting for EL2-level exceptions. After doing so, the total PMU count increased by an amount roughly equal to the number of EL2 timer IRQs, which is expected.
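For concreteness, the two filter configurations (guest-only versus guest plus EL2) can be sketched as an event-type encoding. This is a minimal sketch under stated assumptions: that the monitored event is EXC_IRQ (event number 0x86 in the PMUv3 common event list) and that filtering is done with the P, U, and NSH bits of PMEVTYPER<n>_EL0. The helper name `pmevtyper_encode` is hypothetical and only computes the register value; it does not touch hardware.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: the "IRQ exception taken" event is EXC_IRQ (0x86). */
#define PMU_EVT_EXC_IRQ     0x86u

/* PMEVTYPER<n>_EL0 filter bits as defined by PMUv3. */
#define PMEVTYPER_P         (1u << 31) /* 1 = do not count at EL1 */
#define PMEVTYPER_U         (1u << 30) /* 1 = do not count at EL0 */
#define PMEVTYPER_NSH       (1u << 27) /* 1 = count at Non-secure EL2 */
#define PMEVTYPER_EVT_MASK  0xFFFFu    /* evtCount field, bits [15:0] */

/* Hypothetical helper: build a PMEVTYPER<n>_EL0 value for one event. */
static uint32_t pmevtyper_encode(uint32_t event, bool el0, bool el1, bool el2)
{
    assert(event <= PMEVTYPER_EVT_MASK);

    uint32_t value = event & PMEVTYPER_EVT_MASK;
    if (!el0)
        value |= PMEVTYPER_U;   /* filter out EL0 */
    if (!el1)
        value |= PMEVTYPER_P;   /* filter out EL1 */
    if (el2)
        value |= PMEVTYPER_NSH; /* opt in to EL2 counting */
    return value;
}
```

Under these assumptions, `pmevtyper_encode(PMU_EVT_EXC_IRQ, true, true, false)` would be the guest-only configuration and `pmevtyper_encode(PMU_EVT_EXC_IRQ, true, true, true)` the one that also counts at EL2. The only difference between the two is the NSH bit.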
However, the original “excess” counts (observed before enabling EL2 exception counting) remain unexplained, despite closely matching the EL2 timer interrupt count.
This behavior is only observed on Cortex-A76, while Cortex-A55 does not show this issue.
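The accounting above can be written down explicitly. The following is a sketch only: the struct and function names are hypothetical, and it assumes both PMU readings cover the same measurement window (for example, two counters programmed with the same event but different filters), which may not match how the numbers were actually collected.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical record of one measurement window on a single core. */
typedef struct {
    uint64_t pmu_guest_only; /* PMU count with EL2 counting disabled */
    uint64_t pmu_with_el2;   /* PMU count with EL2 counting enabled */
    uint64_t guest_irqs;     /* IRQ exceptions actually taken by the guest */
    uint64_t el2_timer_irqs; /* EL2 timer interrupts in the same window */
} irq_window;

/* Excess seen before enabling EL2 counting (zero on A55 in the report). */
static int64_t initial_excess(const irq_window *w)
{
    assert(w != 0);
    return (int64_t)(w->pmu_guest_only - w->guest_irqs);
}

/* Increase caused by enabling EL2 counting (expected: about el2_timer_irqs). */
static int64_t el2_increment(const irq_window *w)
{
    assert(w != 0);
    return (int64_t)(w->pmu_with_el2 - w->pmu_guest_only);
}

/* What remains after subtracting guest IRQs and one set of EL2 timer IRQs. */
static int64_t unexplained_count(const irq_window *w)
{
    assert(w != 0);
    return (int64_t)(w->pmu_with_el2 - w->guest_irqs - w->el2_timer_irqs);
}
```

In these terms, the A55 result corresponds to `initial_excess` being zero, while on the A76 both `initial_excess` and `unexplained_count` come out roughly equal to `el2_timer_irqs`, as if each EL2 timer interrupt were counted twice once EL2 counting is enabled.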
Questions
1. Are there known differences in PMU exception event behavior between Cortex-A55 and Cortex-A76, particularly regarding IRQ exception counting at different exception levels?
2. On Cortex-A76, can the IRQ-exception PMU event be triggered by conditions related to EL2 timer interrupts, even when monitoring only EL0/EL1 exceptions from EL2?
3. Is it expected that A76 might count internal EL2-related exception transitions differently from A55, causing both the initial excess counts and the additional EL2-exception counts to align with the EL2 timer IRQ count?
4. Are there relevant errata or PMU modeling differences for Cortex-A76 related to exception-entry events?
Any clarification or guidance from ARM engineers would be greatly appreciated. Thank you!