Hi,
I tested SGI interrupt latency, and it seems that GICv3's SGI interrupt latency is much higher than GICv2's.
How to test:
GICv3:
1. read timestamp (t01)
2. core 0 writes ICC_SGI0R_EL1 to trigger core 1, then reads timestamp (t02)
3. the ISR on core 1 reads timestamp (t03)
GICv2:
1. read timestamp (t11)
2. core 0 writes GICD_SGIR to trigger core 1, then reads timestamp (t12)
3. the ISR on core 1 reads timestamp (t13)
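For reference, a minimal bare-metal C sketch of what the two trigger writes might look like (the system-register encoding is architectural; the GICD base address is a placeholder for whatever your platform uses):

```c
#include <stdint.h>

/* GICv3: trigger SGI 0 on core 1 through the ICC_SGI0R_EL1 system
 * register (encoding S3_0_C12_C11_7). The affinity fields assume
 * core 1 is Aff0 = 1 in cluster 0; adjust for your topology. */
static inline void gicv3_sgi0_to_core1(void)
{
    uint64_t v = (1ULL << 1)    /* TargetList: one bit per Aff0, bit 1 = core 1 */
               | (0ULL << 24);  /* INTID[27:24] = SGI 0 */
    __asm__ volatile("msr S3_0_C12_C11_7, %0" : : "r"(v));
    __asm__ volatile("isb");
}

/* GICv2: trigger SGI 0 on core 1 through the memory-mapped GICD_SGIR
 * register at distributor offset 0xF00. Base address is a placeholder;
 * check your SoC manual. */
#define GICD_BASE  0x08000000UL
#define GICD_SGIR  (*(volatile uint32_t *)(GICD_BASE + 0xF00))

static inline void gicv2_sgi0_to_core1(void)
{
    GICD_SGIR = (0u << 24)   /* TargetListFilter: use CPUTargetList */
              | (1u << 17)   /* CPUTargetList bit 1 = core 1 */
              | 0u;          /* SGIINTID = 0 */
}
```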
The results (after accounting for the time spent reading the timestamp) are:
1. (t02 - t01) is nearly 3 times (t12 - t11), which suggests the system-register write costs more time than the memory-mapped access.
2. (t03 - t02) is nearly 2 times (t13 - t12), which suggests the interrupt latency of GICv3 is much higher than that of GICv2.
Is this normal? Any help will be appreciated! Thanks!
BR,
Peng
What are you using to measure these times?
You can think of it as a hardware timer counter.
I changed the timer to the Arm counter CNTVCT_EL0, which is more common.
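For reference, a timestamp read of this kind might look like the sketch below (the ISB keeps the counter read from being reordered against the surrounding code):

```c
#include <stdint.h>

/* Read the virtual counter (CNTVCT_EL0). Ticks convert to time via
 * the counter frequency reported in CNTFRQ_EL0. */
static inline uint64_t read_cntvct(void)
{
    uint64_t t;
    __asm__ volatile("isb; mrs %0, CNTVCT_EL0" : "=r"(t));
    return t;
}
```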
These are the new test results:
GICv3:
from write trigger to interrupt entry (assembly entry) = 0.96 us
from interrupt entry to interrupt service = 0.48 us
GICv2:
from write trigger to interrupt entry (assembly entry) = 0.15 us
from interrupt entry to interrupt service = 0.58 us
So we can see that the time from interrupt entry to interrupt service is nearly the same, but the time from write trigger to interrupt entry (assembly entry) is much higher on GICv3 than on GICv2.
Has Arm measured these parameters? I suspect it may be related to the hardware architecture.
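On the interrupt-entry side, the acknowledge/EOI path differs in the same way as the trigger path: system registers in GICv3 versus memory-mapped CPU-interface registers in GICv2. A minimal sketch, assuming Group 0 SGIs (as triggered via ICC_SGI0R_EL1) and a placeholder GICC base address:

```c
#include <stdint.h>

/* GICv3: acknowledge and complete a Group 0 interrupt via system
 * registers ICC_IAR0_EL1 (S3_0_C12_C8_0) and ICC_EOIR0_EL1
 * (S3_0_C12_C8_1). */
static inline uint32_t gicv3_ack(void)
{
    uint64_t intid;
    __asm__ volatile("mrs %0, S3_0_C12_C8_0" : "=r"(intid));
    return (uint32_t)intid;
}

static inline void gicv3_eoi(uint32_t intid)
{
    __asm__ volatile("msr S3_0_C12_C8_1, %0" : : "r"((uint64_t)intid));
}

/* GICv2: the same sequence through the memory-mapped CPU interface
 * (GICC_IAR at 0x00C, GICC_EOIR at 0x010; base is a placeholder). */
#define GICC_BASE  0x08010000UL
#define GICC_IAR   (*(volatile uint32_t *)(GICC_BASE + 0x00C))
#define GICC_EOIR  (*(volatile uint32_t *)(GICC_BASE + 0x010))

static inline uint32_t gicv2_ack(void)    { return GICC_IAR; }
static inline void gicv2_eoi(uint32_t id) { GICC_EOIR = id; }
```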
thomas_cp said:
Wow, yes, this is dramatic. And the software is identical? Cache, MMU and bus settings? Just asking, because I cannot believe that there could be such a difference in the GIC.
I tried it on a Zynq US+, but as I now see, that is a GICv2. I got 0.29 us from trigger to the interrupt-service process.
I have an i.MX8 on my desk, but NXP is not clear about which GIC it uses: they write about both v4 and v3?!
You can check the GICD_PIDR2 register.
GICD_PIDR2 bits [7:4] (ArchRev) will tell you whether it is v2, v3 or v4.
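A minimal sketch of that check, assuming the GICv3/v4 register map, where GICD_PIDR2 sits at offset 0xFFE8 from the distributor base (the base address itself is a placeholder; some GICv2 implementations place the ID registers at 0xFE8 instead, so check the TRM):

```c
#include <stdint.h>

#define GICD_BASE   0x08000000UL   /* placeholder; check the SoC manual */
#define GICD_PIDR2  (*(volatile uint32_t *)(GICD_BASE + 0xFFE8))

/* Returns the GIC architecture revision encoded in ArchRev, bits [7:4]:
 * 0x1 = GICv1, 0x2 = GICv2, 0x3 = GICv3, 0x4 = GICv4. */
static inline unsigned gic_arch_rev(void)
{
    return (GICD_PIDR2 >> 4) & 0xFu;
}
```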
By the way, what timer did you use for counting?
1. Software is identical.
2. I confirmed that both caches are the same: same structure (L1 separate, L2 unified), same attributes (WB/RA/WA), same geometry (L1: 4-way, 128 sets, 64 B lines).
3. The MMU and TLB are the same across the Cortex-A53 series, and the MMU mappings are the same for instructions and data.
4. Bus settings
This may be related to many clocks and configurations, and I am not familiar with these, so I tested the memory access time instead: allocate non-cacheable memory, read and write it, then compare. I found that the board with GICv3 is about 10% slower than the board with GICv2. Then I tested the write-time cost: two write-and-read operations on the board with GICv3 cost 0.36 us, while two on the board with GICv2 cost 0.30 us. These are worst-case values with no cache hits.
Why two write operations? Because between the trigger on core 0 and the IRQ timestamp record on core 1, there is the IRQ response time plus two write-and-read operations (see the sketch below).
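A sketch of the kind of uncached write-and-read timing I mean, assuming `nc_buf` points at a non-cacheable mapping set up elsewhere by the BSP (hypothetical name):

```c
#include <stdint.h>

extern volatile uint32_t *nc_buf;   /* hypothetical non-cacheable mapping */

static inline uint64_t ticks(void)
{
    uint64_t t;
    __asm__ volatile("isb; mrs %0, CNTVCT_EL0" : "=r"(t));
    return t;
}

/* Time one write-and-read pair to uncached memory, in counter ticks. */
static uint64_t time_write_read(void)
{
    uint64_t t0 = ticks();
    nc_buf[0] = 0xA5A5A5A5u;        /* write goes out to the bus */
    (void)nc_buf[0];                /* read comes back from the bus */
    __asm__ volatile("dsb sy");     /* ensure the accesses have completed */
    return ticks() - t0;
}
```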
So for GICv3 in the worst case, from write trigger to interrupt entry (assembly entry) = 0.96 us - 0.36 us = 0.6 us.
For GICv2 in the best case, from write trigger to interrupt entry (assembly entry) = 0.15 us - 0 = 0.15 us.
That is still a 4x difference; allowing for measurement accuracy, perhaps 2-4x.
I used the generic timer (CNTPCT_EL0) running at 100 MHz.
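(At 100 MHz each tick is 10 ns, so for example 96 ticks correspond to 0.96 us; a trivial conversion helper:)

```c
#include <stdint.h>

/* Convert counter ticks to nanoseconds; with cntfrq_hz = 100000000
 * (100 MHz), each tick is 10 ns. */
static inline uint64_t ticks_to_ns(uint64_t ticks, uint64_t cntfrq_hz)
{
    return ticks * 1000000000ULL / cntfrq_hz;
}
```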
Did you have any new test results?
Sorry, no. I have to make a BSP first for the i.MX8.