Hi,
I am reading the A53 MP Core doc.
My question is related to instruction preloading in aarch64.
In case of a very large block of code with no function calls, I want to make sure the L1 cache is always filled.
Question 1: Will the PLI instruction first check L2 before trying to fetch from main memory?
Q2: is it a sustainable strategy to call the instruction multiple times at specified offsets? I should load far instructions in L2 first from main memory and then redo it later from L2 to L1?
Thanks
Q1: For CA-53, PLI is implemented as a NOP.
Q2: It may be possible to be "sustainable". But the instructions you preload may be evicted, too.
CA-53 T.R.M. says
The PRFMs also enable targeting of a prefetch to the L2 cache. When this is the case, a request is sent to L2 to start a linefill, and then the instruction can retire, without any data being returned to L1
Ah!
Thanks Zhifei,
By PLI I meant PRFM yes. My mistake (used to 32 bit arm). So I take that PRFM is not a NOP and instruction cache can be filled.
I will use that instruction to fetch many data/instructions as possible (especially instructions following branches), but not too many. Ideally, far ahead, caching to L2 may be of benefit if no other apps are executed (<-> cache eviction may occur if it is too far ahead)
Should be OK.
If you see something wrong in my approach, please shoot ;)