
Using the PLD instruction on the A15 degrades performance

siman over 9 years ago

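On the A15, my code performs worse with PLD instructions than without them. Why would that happen? In principle, PLD should raise the probability of a cache hit, so CPU performance should improve, but my tests show no improvement.

Below is the memcpy assembly code I wrote: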

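loop:
    vldm r1!, {d0-d7}       @ load 64 bytes from the source, advance r1
    vldm r1!, {d16-d23}     @ load the next 64 bytes, advance r1
    pld [r1, #0x0]          @ prefetch the first 64 bytes of the next iteration
    pld [r1, #0x40]         @ prefetch the second 64 bytes of the next iteration
    vstm r0!, {d0-d7}       @ store 64 bytes to the destination, advance r0
    vstm r0!, {d16-d23}     @ store the next 64 bytes, advance r0
    subs r2, #0x80          @ 128 bytes copied; update the remaining count
    bgt loop                @ repeat while bytes remain
    bx lr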

daith over 9 years ago in reply to Song Bin 宋斌

I've seen code preload 0xc0 bytes ahead, but not 0x100. One thing that worries me is that extra, unnecessary fetches get issued; I'd add a check that the data will actually be required.
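
The check suggested above could look roughly like the following. This is only a minimal C sketch, not the original poster's assembly: it assumes GCC or Clang (for `__builtin_prefetch`), the helper name `copy_with_guarded_prefetch` and the constants are hypothetical, and the 0xc0 distance is just a starting point that would need to be measured on the actual A15 system.

```c
#include <stddef.h>
#include <string.h>

#define BLOCK_BYTES   128u  /* bytes copied per iteration, as in the loop above */
#define LINE_BYTES    64u   /* assumed cache line size */
#define PREFETCH_DIST 192u  /* 0xc0 bytes ahead; an assumption to be tuned */

/* Hypothetical helper: copies n bytes, where n is assumed to be a
 * multiple of BLOCK_BYTES. A prefetch is issued only when its target
 * address still lies inside the source buffer, so no line is fetched
 * that the copy will never read. */
static void copy_with_guarded_prefetch(unsigned char *dst,
                                       const unsigned char *src,
                                       size_t n)
{
    size_t off = 0;
    while (off < n) {
        size_t remaining = n - off;

        /* Guard: skip the prefetch once the target passes the end. */
        if (remaining > PREFETCH_DIST) {
            /* rw = 0 (read), locality = 0 (streaming data) */
            __builtin_prefetch(src + off + PREFETCH_DIST, 0, 0);
        }
        if (remaining > PREFETCH_DIST + LINE_BYTES) {
            __builtin_prefetch(src + off + PREFETCH_DIST + LINE_BYTES, 0, 0);
        }

        /* Stand-in for the vldm/vstm pairs in the assembly version. */
        memcpy(dst + off, src + off, BLOCK_BYTES);
        off += BLOCK_BYTES;
    }
}
```

The guard costs a compare and branch per prefetch, so whether it helps or hurts relative to unconditional prefetching would have to be confirmed by measurement.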