Hi,
I tested SGI interrupt latency, and it seems that GICv3's SGI interrupt latency is much higher than GICv2's.
How to test:
GICv3:
1. read timestamp (t01)
2. core 0 writes ICC_SGI0R_EL1 to trigger core 1, then reads timestamp (t02)
3. the ISR on core 1 reads timestamp (t03)
GICv2:
1. read timestamp (t11)
2. core 0 writes GICD_SGIR to trigger core 1, then reads timestamp (t12)
3. the ISR on core 1 reads timestamp (t13)
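For reference, a minimal bare-metal C sketch of what the two trigger writes might look like (the system-register encoding is architectural; the GICD base address is a placeholder for whatever your platform uses):

```c
#include <stdint.h>

/* GICv3: trigger SGI 0 on core 1 through the ICC_SGI0R_EL1 system
 * register (encoding S3_0_C12_C11_7). The affinity fields assume
 * core 1 is Aff0 = 1 in cluster 0; adjust for your topology. */
static inline void gicv3_sgi0_to_core1(void)
{
    uint64_t v = (1ULL << 1)    /* TargetList: one bit per Aff0, bit 1 = core 1 */
               | (0ULL << 24);  /* INTID[27:24] = SGI 0 */
    __asm__ volatile("msr S3_0_C12_C11_7, %0" : : "r"(v));
    __asm__ volatile("isb");
}

/* GICv2: trigger SGI 0 on core 1 through the memory-mapped GICD_SGIR
 * register at distributor offset 0xF00. Base address is a placeholder;
 * check your SoC manual. */
#define GICD_BASE  0x08000000UL
#define GICD_SGIR  (*(volatile uint32_t *)(GICD_BASE + 0xF00))

static inline void gicv2_sgi0_to_core1(void)
{
    GICD_SGIR = (0u << 24)   /* TargetListFilter: use CPUTargetList */
              | (1u << 17)   /* CPUTargetList bit 1 = core 1 */
              | 0u;          /* SGIINTID = 0 */
}
```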
The results (after accounting for the time spent reading the timestamp) are:
1. (t02 - t01) is nearly 3 times (t12 - t11), which suggests the system-register write costs more time than the memory-mapped access.
2. (t03 - t02) is nearly 2 times (t13 - t12), which suggests the interrupt latency of GICv3 is much higher than that of GICv2.
Is this normal? Any help will be appreciated! Thanks!
BR,
Peng
What are you using to measure these times?
You can think of it as a hardware timer counter.
I changed the timer to the Arm counter CNTVCT_EL0, which is more common.
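For reference, a timestamp read of this kind might look like the sketch below (the ISB keeps the counter read from being reordered against the surrounding code):

```c
#include <stdint.h>

/* Read the virtual counter (CNTVCT_EL0). Ticks convert to time via
 * the counter frequency reported in CNTFRQ_EL0. */
static inline uint64_t read_cntvct(void)
{
    uint64_t t;
    __asm__ volatile("isb; mrs %0, CNTVCT_EL0" : "=r"(t));
    return t;
}
```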
These are the new test results:
GICv3:
from write trigger to interrupt entry (assembly entry) = 0.96 us
from interrupt entry to interrupt service = 0.48 us
GICv2:
from write trigger to interrupt entry (assembly entry) = 0.15 us
from interrupt entry to interrupt service = 0.58 us
So we can see that the time from interrupt entry to interrupt service is nearly the same, but the time from write trigger to interrupt entry (assembly entry) is much higher on GICv3 than on GICv2.
Has Arm measured these parameters? I suspect it may be related to the hardware architecture.
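On the interrupt-entry side, the acknowledge/EOI path differs in the same way as the trigger path: system registers in GICv3 versus memory-mapped CPU-interface registers in GICv2. A minimal sketch, assuming Group 0 SGIs (as triggered via ICC_SGI0R_EL1) and a placeholder GICC base address:

```c
#include <stdint.h>

/* GICv3: acknowledge and complete a Group 0 interrupt via system
 * registers ICC_IAR0_EL1 (S3_0_C12_C8_0) and ICC_EOIR0_EL1
 * (S3_0_C12_C8_1). */
static inline uint32_t gicv3_ack(void)
{
    uint64_t intid;
    __asm__ volatile("mrs %0, S3_0_C12_C8_0" : "=r"(intid));
    return (uint32_t)intid;
}

static inline void gicv3_eoi(uint32_t intid)
{
    __asm__ volatile("msr S3_0_C12_C8_1, %0" : : "r"((uint64_t)intid));
}

/* GICv2: the same sequence through the memory-mapped CPU interface
 * (GICC_IAR at 0x00C, GICC_EOIR at 0x010; base is a placeholder). */
#define GICC_BASE  0x08010000UL
#define GICC_IAR   (*(volatile uint32_t *)(GICC_BASE + 0x00C))
#define GICC_EOIR  (*(volatile uint32_t *)(GICC_BASE + 0x010))

static inline uint32_t gicv2_ack(void)    { return GICC_IAR; }
static inline void gicv2_eoi(uint32_t id) { GICC_EOIR = id; }
```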
thomas_cp said:
Wow, yes, this is dramatic. And the software is identical? Cache, MMU and bus settings? Just asking, because I cannot believe that there could be such a difference in the GIC.
I tried it on a Zynq US+, but as I now see, that is a GICv2. I got 0.29 us from trigger to the interrupt-service process.
I have an i.MX8 on my desk, but NXP is not clear about which GIC it uses: they write about both v4 and v3?!
You can check the GICD_PIDR2 register.
GICD_PIDR2 bits [7:4] (ArchRev) will tell you whether it is v2, v3 or v4.
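A minimal sketch of that check, assuming the GICv3/v4 register map, where GICD_PIDR2 sits at offset 0xFFE8 from the distributor base (the base address itself is a placeholder; some GICv2 implementations place the ID registers at 0xFE8 instead, so check the TRM):

```c
#include <stdint.h>

#define GICD_BASE   0x08000000UL   /* placeholder; check the SoC manual */
#define GICD_PIDR2  (*(volatile uint32_t *)(GICD_BASE + 0xFFE8))

/* Returns the GIC architecture revision encoded in ArchRev, bits [7:4]:
 * 0x1 = GICv1, 0x2 = GICv2, 0x3 = GICv3, 0x4 = GICv4. */
static inline unsigned gic_arch_rev(void)
{
    return (GICD_PIDR2 >> 4) & 0xFu;
}
```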
By the way, what timer did you use for counting?
1. Software is identical.
2. I confirmed that both caches are the same: same structure (L1 separate, L2 unified), same attributes (WB/RA/WA), same geometry (L1: 4-way, 128 sets, 64 B lines).
3. The MMU and TLB are the same across the Cortex-A53 series, and the MMU mappings are the same for instructions and data.
4. Bus settings
This may be related to many clocks and configurations, and I am not familiar with these, so I tested the memory access time instead: allocate non-cacheable memory, read and write it, then compare. I found that the board with GICv3 is about 10% slower than the board with GICv2. Then I tested the write-time cost: two write-and-read operations on the board with GICv3 cost 0.36 us, while two on the board with GICv2 cost 0.30 us. These are worst-case values with no cache hits.
Why two write operations? Because between the trigger on core 0 and the IRQ timestamp record on core 1, there is the IRQ response time plus two write-and-read operations (see the sketch below).
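A sketch of the kind of uncached write-and-read timing I mean, assuming `nc_buf` points at a non-cacheable mapping set up elsewhere by the BSP (hypothetical name):

```c
#include <stdint.h>

extern volatile uint32_t *nc_buf;   /* hypothetical non-cacheable mapping */

static inline uint64_t ticks(void)
{
    uint64_t t;
    __asm__ volatile("isb; mrs %0, CNTVCT_EL0" : "=r"(t));
    return t;
}

/* Time one write-and-read pair to uncached memory, in counter ticks. */
static uint64_t time_write_read(void)
{
    uint64_t t0 = ticks();
    nc_buf[0] = 0xA5A5A5A5u;        /* write goes out to the bus */
    (void)nc_buf[0];                /* read comes back from the bus */
    __asm__ volatile("dsb sy");     /* ensure the accesses have completed */
    return ticks() - t0;
}
```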
So for GICv3 in the worst case, from write trigger to interrupt entry (assembly entry) = 0.96 us - 0.36 us = 0.6 us.
For GICv2 in the best case, from write trigger to interrupt entry (assembly entry) = 0.15 us - 0 = 0.15 us.
That is still a 4x difference; allowing for measurement accuracy, perhaps 2-4x.
I used the generic timer (CNTPCT_EL0) running at 100 MHz.
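(At 100 MHz each tick is 10 ns, so for example 96 ticks correspond to 0.96 us; a trivial conversion helper:)

```c
#include <stdint.h>

/* Convert counter ticks to nanoseconds; with cntfrq_hz = 100000000
 * (100 MHz), each tick is 10 ns. */
static inline uint64_t ticks_to_ns(uint64_t ticks, uint64_t cntfrq_hz)
{
    return ticks * 1000000000ULL / cntfrq_hz;
}
```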
Did you have any new test results?
Sorry, no. I have to make a BSP first for the i.MX8.