I want to improve the performance of a game on our platform. I have heard that the cache lockdown feature can be used to improve performance. Please suggest how to use this feature on the ARM Cortex-A7.
It depends a little on the hardware you have, so I would check the TRM.
For example – here are the instructions for the L220 cache controller:
http://infocenter.arm.com/help/topic/com.arm.doc.ddi0329l/Beieiiab.html
Regards,
Pete
Hi Peter,
Due to the TrustZone security extension of Cortex-A processors, cache lines tagged as secure can't be replaced by a non-secure line fill or
accessed by non-secure reads/writes. So do we need to clean and invalidate the cache at the end of a secure world software operation, so
that normal world software is not left using only part of the cache?
Best Regards.
Hi Wangyong,
the cache lines tagged as secure can't be replaced by non-secure line fill and accessed by non-secure read/write.
Half right =)
TrustZone guarantees that the non-secure environment cannot read or write secure lines. However, it does not guarantee non-eviction of secure lines by non-secure accesses. The line fill policy is fully dynamic: it's not a statically partitioned cache, and a non-secure line fill can evict secure entries (and vice versa).
HTH, Pete
Hi Peter/Wangyong,
I have one more question regarding cache lockdown. Can the lockdown mechanism be used only for L1 (I-cache/D-cache), only for L2, or for both L1 and L2? In the Cortex-A7 TRM I found the following line in the L1 instruction cache controller section:
no lockdown support
So does it mean that only L1 doesn't support lockdown? Can we use lockdown for the L2 cache? Please help me clarify this. Thanks a lot.
Anshul
Hi Peter,
Thanks a lot. I had misunderstood cache eviction, and I have also found a description in the TrustZone whitepaper that matches your explanation:
Caches
It is a desirable feature of any high performance design to support data of both security states in the caches. This removes the need for a cache flush when switching between worlds, and enables high performance software to communicate over the world boundary. To enable this the L1, and where applicable level two and beyond, processor caches have been extended with an additional tag bit which records the security state of the transaction that accessed the memory.
The content of the caches, with regard to the security state, is dynamic. Any non-locked down cache line can be evicted to make space for new data, regardless of its security state. It is possible for a Secure line load to evict a Non-secure line, and for a Non-secure line load to evict a Secure line.
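The behaviour described in the whitepaper text above can be illustrated with a toy model of a single cache set. This is only a sketch under stated assumptions, not how any real ARM cache is implemented: the names `line_t`, `cache_set_t`, `lookup` and `fill` are hypothetical, the set has two ways, and replacement is simple round-robin (real cores use their own policy, so check the TRM). The point it shows is that the security tag takes part in hit detection but not in victim selection.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define WAYS 2  /* assumed associativity for this toy model only */

typedef struct {
    bool     valid;
    bool     ns;    /* security tag: true = Non-secure, false = Secure */
    uint32_t addr;  /* line address */
} line_t;

typedef struct {
    line_t   way[WAYS];
    unsigned next_victim;  /* round-robin replacement pointer (assumption) */
} cache_set_t;

/* A lookup hits only when both the address and the security tag match,
 * so a Non-secure access never hits a Secure line. */
bool lookup(const cache_set_t *s, uint32_t addr, bool ns)
{
    for (unsigned w = 0; w < WAYS; w++) {
        if (s->way[w].valid && s->way[w].addr == addr && s->way[w].ns == ns)
            return true;
    }
    return false;
}

/* A line fill chooses its victim without looking at the security tag,
 * so a Non-secure fill can evict a Secure line and vice versa. */
void fill(cache_set_t *s, uint32_t addr, bool ns)
{
    line_t *victim = &s->way[s->next_victim];
    victim->valid = true;
    victim->ns    = ns;
    victim->addr  = addr;
    s->next_victim = (s->next_victim + 1) % WAYS;
}
```

For example, after two Secure fills of `0x1000` and `0x2000`, a Non-secure `lookup` of `0x1000` misses, and a Non-secure `fill` of `0x3000` then evicts the Secure line for `0x1000` while the Secure line for `0x2000` stays resident.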
Neither the L1 (I-cache/D-cache) nor the L2 cache of the Cortex-A7 supports lockdown. The TRM just doesn't describe this in detail.
Hi Peter/Wangyong
Is it possible to check the contents of the cache at run time? That is, if there is a problem in the system, I want to check whether the cache contents are in sync with main memory. How can I store the cache contents in RAM? If it is possible, please let me know the procedure. Thanks a lot!
Not easily - you're trying to make visible something CPUs try very hard to make invisible.
If you hit what you think are coherency problems, then you can try inserting cache cleans of the entire cache; if the problems go away, then coherency is likely the cause. Another approach I've used in the past, if you know the address range which is causing problems, is to use a master outside of the CPU (such as a DMA engine) to create a copy of the main memory contents, which the CPU can then read back and compare against its view of the original data.
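The compare step of that second approach could look roughly like the sketch below. It assumes the DMA copy has already been made by platform-specific code that is not shown, that the CPU reads the snapshot through an uncached mapping (or after invalidating it), and that the buffer starts on a cache line boundary. The function name `count_stale_lines` and the 64-byte `CACHE_LINE_BYTES` value are assumptions for illustration; check the line size in your TRM.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CACHE_LINE_BYTES 64u  /* assumed line size; confirm against the TRM */

/* Compare the CPU's (possibly cached) view of a buffer with a snapshot of
 * main memory taken by an external master such as a DMA engine.
 * Returns the number of line-sized chunks that differ. If at least one
 * chunk differs and first_bad is not NULL, *first_bad receives the byte
 * offset of the first differing chunk; otherwise it is left untouched. */
size_t count_stale_lines(const volatile uint8_t *cpu_view,
                         const uint8_t *dma_snapshot,
                         size_t len, size_t *first_bad)
{
    size_t bad = 0;

    for (size_t off = 0; off < len; off += CACHE_LINE_BYTES) {
        size_t end = off + CACHE_LINE_BYTES;
        if (end > len)
            end = len;  /* last chunk may be shorter than a full line */

        for (size_t i = off; i < end; i++) {
            if (cpu_view[i] != dma_snapshot[i]) {
                if (bad == 0 && first_bad != NULL)
                    *first_bad = off;
                bad++;
                break;  /* one mismatch is enough to flag this chunk */
            }
        }
    }
    return bad;
}
```

A non-zero result only says that the CPU's view and main memory disagreed at the time of the snapshot. With a write-back cache that can be perfectly legitimate (dirty lines not yet written out), so the result needs to be interpreted against what the software expected to have cleaned by that point.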
Thanks a lot for your reply.
As you know, it's very hard to reproduce cache coherency problems. I am working on Android, and so far I suspect that two problems occurred due to cache coherency, but I don't have any proof to confirm this. What I want is to copy the cache data into RAM when an exception occurs. By doing this I can compare the cache data with RAM and confirm whether the problem is caused by cache coherency or something else. I am also not sure about the address range. So can you suggest how to analyze such problems if we can't dump the cache contents to RAM?
While I haven't examined it in depth, it seems that the TLB retains Secure entries for a long time. I copied the normal world boot code via the secure world and did not flush the secure TLB. After one day of use, the initial secure entries are still present in the TLB. Either the normal world OS is avoiding section entries, or secure TLB entries on the Cortex-A5 persist for some time. I also have a 'NULL' TLB entry from the secure OS, so it has been interesting to dump the TLBs. I definitely don't think the eviction of secure world entries is 'standard'. Most definitely, the normal world OS accesses the same boot sections and no normal world entries are allocated.
Also, the eviction of the secure world lines makes them susceptible to the same attacks discussed in this hyper-threading paper.