We are running a survey to help us improve the experience for all of our members. If you see the survey appear, please take the time to tell us about your experience if you can.
Hi,
I am reading the A53 MP Core doc.
My question is related to instruction preloading in aarch64.
In case of a very large block of code with no function calls, I want to make sure the L1 cache is always filled.
Question 1: Will the PLI instruction first check L2 before trying to fetch from main memory?
Q2: is it a sustainable strategy to call the instruction multiple times at specified offsets? I should load far instructions in L2 first from main memory and then redo it later from L2 to L1?
Thanks
Q1: For CA-53, PLI is implemented as a NOP.
Q2: It may be possible to be "sustainable". But the instructions you preload may be evicted, too.
CA-53 T.R.M. says
The PRFMs also enable targeting of a prefetch to the L2 cache. When this is the case, a request is sent to L2 to start a linefill, and then the instruction can retire, without any data being returned to L1
Ah!
Thanks Zhifei,
By PLI I meant PRFM yes. My mistake (used to 32 bit arm). So I take that PRFM is not a NOP and instruction cache can be filled.
I will use that instruction to fetch many data/instructions as possible (especially instructions following branches), but not too many. Ideally, far ahead, caching to L2 may be of benefit if no other apps are executed (<-> cache eviction may occur if it is too far ahead)
Should be OK.
If you see something wrong in my approach, please shoot ;)