
Efficient usage of the PLD instruction in combination with load instructions?

Hi all, after a long time I'm back on the forum with a question.

I'm posting this question with some pseudocode:

for (i = 0; i < 100; i++)
{
    instruction1
    instruction2
    instruction3
    .................
    instructionA : pld     [r0]
    ..................
    instructionB : vld1.16 {d0-d3}, [r0]!
    ..................
    instructionN
}

Let me describe my understanding of the PLD instruction; correct me if I'm wrong.

The PLD instruction gives the processor a hint that we will need the data at address r0 in the near future, so it may fill a cache line with that data ahead of time to avoid a cache-miss penalty later. It is only a hint, though, and the processor is free to ignore it. (Cache line size = 8 words = 32 bytes, with a 32 KB cache on a Cortex-A9; I know the cache sizes are configurable.)
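To make that concrete, here is a minimal C sketch of the pattern in the pseudocode above, assuming GCC and arm_neon.h: __builtin_prefetch normally compiles to a PLD on ARMv7, and vld1q_u16 to a vld1.16 of two d-register pairs. The function name, the 16-elements-per-iteration shape and the PLD_AHEAD_BYTES distance are all illustrative assumptions, to be tuned by measurement rather than taken as the right values.

#include <arm_neon.h>
#include <stdint.h>

/* Illustrative tuning knob: how far ahead (in bytes) to prefetch.
   One 32-byte line ahead here; try larger multiples and measure. */
#define PLD_AHEAD_BYTES 32

void process_block(const uint16_t *src, uint16_t *dst, int n)
{
    /* Assumes n is a multiple of 16 (two 128-bit loads of u16 per pass). */
    for (int i = 0; i < n; i += 16) {
        /* Hint that we will read this address soon (rw = 0 means read,
           locality = 3 means keep it in cache).  On ARMv7 this is
           normally emitted as pld [rX].  Prefetching past the end of
           the buffer is harmless, because PLD never faults. */
        __builtin_prefetch((const char *)&src[i] + PLD_AHEAD_BYTES, 0, 3);

        uint16x8_t a = vld1q_u16(&src[i]);      /* ~ vld1.16 {d0,d1}, [r0]! */
        uint16x8_t b = vld1q_u16(&src[i + 8]);  /* ~ vld1.16 {d2,d3}, [r0]! */

        /* ... whatever instruction1..instructionN do with the data ... */
        vst1q_u16(&dst[i], a);
        vst1q_u16(&dst[i + 8], b);
    }
}

Whether one line ahead is enough depends on the memory latency versus how much work the loop does per iteration, which is exactly question 1 below.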

I want to know the following details:

1. How many instructions ahead of vld1.16 {d0-d3}, [r0]! should pld [r0] be placed to get better performance (i.e. to avoid cache-miss penalties) on hardware like the PandaBoard? For example, 3 or 4 instructions ahead?

2. When the processor executes pld [r0], how many cache lines are filled with data: only one cache line, or more?

Will it be the same for PLDW in combination with VST1.16? For example:

PLDW [r1]
.................
VST1.16 {d0-d3}, [r1]!
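For the store case, the same GCC builtin with its second argument set to 1 requests a prefetch with intent to write; on cores that implement the multiprocessing extensions the compiler can emit PLDW for it. A minimal sketch, with the function name, loop shape and distance again just assumptions for illustration:

#include <arm_neon.h>
#include <stdint.h>

void fill_block(uint16_t *dst, uint16x8_t v0, uint16x8_t v1, int n)
{
    for (int i = 0; i += 16, i < n;) {
        /* rw = 1: prefetch for write (PLDW where the core supports it,
           otherwise an ordinary PLD or nothing at all). */
        __builtin_prefetch(&dst[i + 16], 1, 3);

        vst1q_u16(&dst[i],     v0);   /* ~ vst1.16 {d0,d1}, [r1]! */
        vst1q_u16(&dst[i + 8], v1);   /* ~ vst1.16 {d2,d3}, [r1]! */
    }
}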

What about PLI? How do I specify the address register for the PLI instruction, i.e. a register that contains the address of instructions?
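PLI takes a register (or PC-relative) operand pointing at code rather than data, so from C you would put the address of the code you expect to execute, for example a function pointer, into that register. A hedged sketch using GCC inline assembly; preload_code and the call-soon scenario are invented for illustration:

/* Hint the instruction-side prefetcher about code we expect to run soon.
   'code' is any instruction address, e.g. a function pointer cast to
   const void *, placed in a general-purpose register by the "r" constraint. */
static inline void preload_code(const void *code)
{
    __asm__ volatile("pli [%0]" : : "r"(code));   /* PLI [Rn] */
}

/* Example use: preload a handler before spending time deciding to call it. */
extern void handler(void);

static inline void example(void)
{
    preload_code((const void *)handler);
    /* ... other work ... */
    handler();
}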

Reply
  • You may not get all that much benefit if you are going through a large area sequentially on an A9 or bigger, because those cores implement automatic prefetchers that try to recognize such sequential accesses and preload for you (I'm not sure exactly which processors do this or how good they are at it). Even so, the string routines like memcpy use PLD to try to get that little bit extra, so you could study them.

    The real place I believe you can gain is with linked lists: just preload the next node as soon as you get to the current node, and then do whatever it is you want to do at the current node (see the sketch below).

    I have a vague recollection of seeing someone check for null first rather than letting the preload instruction simply ignore a null pointer. I had a quick Google but couldn't find anything.
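    As a concrete illustration of that linked-list idea, here is a minimal sketch; the node layout and the summing "work" are invented for the example:

    #include <stddef.h>

    struct node {
        struct node *next;
        int value;
    };

    int sum_list(const struct node *head)
    {
        int total = 0;
        for (const struct node *p = head; p != NULL; p = p->next) {
            /* Start pulling the next node into the cache while we work on
               the current one.  The prefetch is only a hint, so a NULL
               p->next on the last node is harmless -- it cannot fault. */
            __builtin_prefetch(p->next, 0, 3);
            total += p->value;        /* the "work" on the current node */
        }
        return total;
    }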

