L2 cache error injection for instructions and Prefetch Abort

Hello,

I am using an ARM dual core Cortex A9 CPU as part of our satellite computer SoC, and I am trying to inject errors in the different cache levels of the CPU.

In particular I am trying to trigger a Prefetch Abort by corrupting instructions stored in the L2 cache, which is shared between the instructions and the data.

For this purpose I implemented the following sequence:

1) Configured all MMU pages as non-cacheable by default in the MMU translation table

2) Disabled caching of page table within inner and outer caches

3) Mapped a function to a known memory region (w/ a dedicated section in the linker script, with fixed address and size)

4) Configured the MMU page corresponding to this region as inner non-cacheable, outer non-cacheable

5) Then I perform the following error injection sequence:

a) Enable the L2 cache parity checking

b) Execute the function, so that it loads the instructions into the L2 cache

c) Disable the L2 cache parity checking

d) Write all the cachelines of the function's memory region with a fixed pattern, e.g. 0x5a5a5a5a

e) Enable the L2 cache parity checking

f) Execute the function again

--> In my understanding, this should normally trigger a parity error and thus a Prefetch Abort... but that is not the case in my test
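
For reference, the injection sequence from step 5 can be sketched as below. This is only a minimal sketch: the `inject_ops` hooks, the function names and the host-side mocks are hypothetical and are not taken from any vendor library. On the real SoC each hook would have to be implemented against the actual L2 cache controller.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical platform hooks: these names are NOT from any vendor library.
 * Each one has to be implemented for the actual SoC. */
struct inject_ops {
    void (*set_l2_parity)(bool enable);     /* turn L2 parity checking on/off */
    void (*run_target)(void);               /* call the function under test   */
    void (*fill_region)(volatile uint32_t *base, size_t words,
                        uint32_t pattern);  /* overwrite the code region      */
};

/* Runs the injection sequence in the order described in the post. */
static void inject_l2_instruction_error(const struct inject_ops *ops,
                                        volatile uint32_t *region,
                                        size_t region_bytes, uint32_t pattern)
{
    assert(ops != NULL && region != NULL);
    ops->set_l2_parity(true);    /* enable parity checking                  */
    ops->run_target();           /* first run: instructions get fetched     */
    ops->set_l2_parity(false);   /* disable parity checking                 */
    ops->fill_region(region, region_bytes / sizeof(uint32_t), pattern);
    ops->set_l2_parity(true);    /* re-enable parity checking               */
    ops->run_target();           /* second run: the abort is expected here  */
}

/* Host-side stand-ins that only record the call order, for a dry run. */
static char trace[16];

static void log_step(char c)
{
    size_t n = strlen(trace);
    if (n + 1 < sizeof trace)
        trace[n] = c;
}

static void mock_parity(bool on) { log_step(on ? 'P' : 'p'); }
static void mock_run(void)       { log_step('R'); }

static void mock_fill(volatile uint32_t *base, size_t words, uint32_t pattern)
{
    for (size_t i = 0; i < words; i++)
        base[i] = pattern;
    log_step('W');
}
```

Keeping the hardware accesses behind hooks makes it possible to check the ordering of the steps on a host before running on the target. It does not model the cache itself, so it says nothing about whether the store actually lands in the L2 line holding the instructions.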

I suspected that the Point of Unification between instructions and data was incorrectly set. As a reminder, "the Point of Unification stands for the point at which the instruction and data caches and translation table walks of the core are guaranteed to see the same copy of a memory location. For example, a unified level 2 cache would be the point of unification in a system with Harvard level 1 caches and a TLB for caching translation table entries." Source: https://developer.arm.com/docs/den0024/latest/caches/point-of-coherency-and-unification

There is no register to define the cache level associated with the Point of Unification Uniprocessor between instructions and data.

However there are registers to define the Level of Unification Uniprocessor, which defines "the last level of cache that must be cleaned or invalidated when cleaning or invalidating to the point of unification for the Inner Shareable shareability domain."

Source: http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.100236_0002_00_en/zsu1423532383435.html

So I checked the Level of Unification for the ARM Cortex A9 in our SoC by reading the Cache Level ID Register. It appears that:

- Level of Unification Uniprocessor (LoUU): 0b001

- Level of Coherence (LoC): 0b001

- Level of Unification Inner Shareable (LoUIS): 0b000
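
As an illustration, the three fields above can be extracted from a raw Cache Level ID Register (CLIDR) value as sketched below. The field positions follow the ARMv7-A CLIDR layout. The helper names and the example value 0x09000003 are assumptions made for this sketch: the value is built from the fields listed above plus a separate I/D L1 cache, and is not a register dump from the SoC. On the target, the raw value would come from `MRC p15, 1, <Rt>, c0, c0, 1`.

```c
#include <assert.h>
#include <stdint.h>

/* Decoded "level of" fields of the ARMv7-A CLIDR (names are illustrative). */
struct clidr_levels {
    unsigned louis;  /* Level of Unification Inner Shareable, bits [23:21] */
    unsigned loc;    /* Level of Coherence,                   bits [26:24] */
    unsigned louu;   /* Level of Unification Uniprocessor,    bits [29:27] */
};

static struct clidr_levels clidr_decode(uint32_t clidr)
{
    struct clidr_levels f;
    f.louis = (clidr >> 21) & 0x7u;
    f.loc   = (clidr >> 24) & 0x7u;
    f.louu  = (clidr >> 27) & 0x7u;
    return f;
}

/* Ctype<n> for cache level n (1..7), three bits per level starting at bit 0:
 * 0 = no cache, 1 = instruction only, 2 = data only,
 * 3 = separate instruction and data, 4 = unified. */
static unsigned clidr_ctype(uint32_t clidr, unsigned level)
{
    assert(level >= 1 && level <= 7);
    return (clidr >> (3u * (level - 1u))) & 0x7u;
}
```

With the hypothetical value 0x09000003, `clidr_decode` gives LoUU = 1, LoC = 1 and LoUIS = 0, and `clidr_ctype(value, 1)` gives 3 (separate instruction and data caches at level 1).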

The Maintenance broadcast field of the ID_MMFR3 register reads 0b0010, which means that cache maintenance operations affect structures according to shareability and the defined behavior of each instruction. For example, invalidating the L1 instruction cache to the Point of Unification does invalidate instructions up to the PoU.

In other words, since the LoUU is the L1 cache, the PoU is the L2 cache. As a result, I should be able to corrupt the instructions in the L2 cache.

So my question is: why does modifying the instructions stored in the L2 cache with a store issued in the data pipeline (without flushing it to DDR for synchronization with the instruction pipeline) not corrupt them and trigger an abort?

Regards,

Florian

flongnos, over 3 years ago
Hello everyone,

I found out what the issue was: basically the function I want to corrupt was not where I expected.

To be specific, I tried to allocate a range in the linker script for an injection zone, and called this section "injection_section":

.injection_section 0x100000 : { . += 0x100000; }

Then in the bare metal test I declared the function to be placed in this section:

void __attribute__((section("injection_section"))) test_function(void);

I modified the cacheline at the address 0x100000. But then I found out that the test_function was not actually there, but after this section (> 0x200000).

So could someone help to force the test function to be in this section?
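
One possible direction, as a sketch only and assuming a GCC-compatible compiler and GNU ld: an output section collects only the input sections named inside its braces, so a section body that just advances the location counter reserves space but pulls in no code, and the function's own section then gets placed elsewhere by the linker. The linker script line in the comment below is an untested suggestion. The `INJECTION_BASE` and `INJECTION_SIZE` macros and the helper functions are names introduced here for illustration, with values taken from the linker script line above.

```c
#include <assert.h>
#include <stdint.h>

#define INJECTION_BASE 0x100000u  /* start address used in the linker script */
#define INJECTION_SIZE 0x100000u  /* size reserved in the linker script      */

/* The attribute name must match an input-section pattern in the linker
 * script, for example (GNU ld, untested suggestion):
 *
 *   .injection_section 0x100000 : { KEEP(*(injection_section)) . = 0x100000; }
 *
 * Inside an output section, "." is an offset from the section start, so the
 * assignment pads the section to 0x100000 bytes after the function. */
__attribute__((section("injection_section"), used, noinline))
void test_function(void)
{
    __asm__ volatile("" ::: "memory");  /* placeholder body */
}

/* Returns 1 when addr lies inside the injection zone. */
static int in_injection_zone(uintptr_t addr)
{
    addr &= ~(uintptr_t)1;  /* ignore the Thumb bit of a function address */
    return addr >= INJECTION_BASE && addr < INJECTION_BASE + INJECTION_SIZE;
}

/* Start-up self-check for the target: fails if the placement did not work. */
static void check_test_function_placement(void)
{
    assert(in_injection_zone((uintptr_t)&test_function));
}
```

Calling `check_test_function_placement()` on the target before the injection sequence would catch the situation described above, where the function ends up beyond 0x200000. The map file produced by the linker shows the same information.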

Thank you.

Florian