ldr and fmla instruction time consumption issue.

姑苏风河 over 6 years ago

Hi awesome guys,

I have a question about the ARM Cortex-A53 platform, and I need your help!

I have one loop with 8 ldr operations that use independent Qn registers, and another loop with 8 fmla operations that also use independent Qn registers. The code is as follows,

and

The addresses in X1 and X2 point to the stack. Why does the ldr loop take twice as long as the fmla loop?

I have referred to the document "Cortex_A57_Software_Optimization_Guide_external.pdf", which lists a latency of 5 for ldr and 10 for fmla.
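
The two loops were not preserved in this thread, so the following is only a hypothetical sketch of what such a test might look like, not the original code. It assumes GCC or Clang inline assembly on AArch64, 128-bit loads from a stack buffer, and `.4s` fmla operands. The names `ldr_loop`, `fmla_loop` and `time_loop` and the register choices are made up for illustration. On non-AArch64 targets the loop bodies compile to nothing.

```c
#include <time.h>

/* Hypothetical loop 1: 8 loads per iteration, each into a different Q register. */
static void ldr_loop(long iters)
{
#if defined(__aarch64__)
    float buf[32] __attribute__((aligned(16))) = {0};  /* 128 bytes on the stack */
    if (iters <= 0) return;
    __asm__ volatile(
        "1:\n\t"
        "ldr q0, [%1, #0]\n\t"
        "ldr q1, [%1, #16]\n\t"
        "ldr q2, [%1, #32]\n\t"
        "ldr q3, [%1, #48]\n\t"
        "ldr q4, [%1, #64]\n\t"
        "ldr q5, [%1, #80]\n\t"
        "ldr q6, [%1, #96]\n\t"
        "ldr q7, [%1, #112]\n\t"
        "subs %0, %0, #1\n\t"
        "b.ne 1b\n\t"
        : "+r"(iters)
        : "r"(buf)
        : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "cc", "memory");
#else
    (void)iters;  /* no-op on other architectures */
#endif
}

/* Hypothetical loop 2: 8 fmla per iteration, each with its own accumulator,
   so no fmla depends on the result of another fmla in the same iteration. */
static void fmla_loop(long iters)
{
#if defined(__aarch64__)
    if (iters <= 0) return;
    __asm__ volatile(
        "movi v16.4s, #0\n\t"   /* zero the two shared source operands */
        "movi v17.4s, #0\n\t"
        "1:\n\t"
        "fmla v0.4s, v16.4s, v17.4s\n\t"
        "fmla v1.4s, v16.4s, v17.4s\n\t"
        "fmla v2.4s, v16.4s, v17.4s\n\t"
        "fmla v3.4s, v16.4s, v17.4s\n\t"
        "fmla v4.4s, v16.4s, v17.4s\n\t"
        "fmla v5.4s, v16.4s, v17.4s\n\t"
        "fmla v6.4s, v16.4s, v17.4s\n\t"
        "fmla v7.4s, v16.4s, v17.4s\n\t"
        "subs %0, %0, #1\n\t"
        "b.ne 1b\n\t"
        : "+r"(iters)
        :
        : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7",
          "v16", "v17", "cc");
#else
    (void)iters;  /* no-op on other architectures */
#endif
}

/* Returns the CPU time in seconds spent running fn(iters). */
static double time_loop(void (*fn)(long), long iters)
{
    clock_t start = clock();
    fn(iters);
    return (double)(clock() - start) / CLOCKS_PER_SEC;
}
```

Calling `time_loop(ldr_loop, n)` and `time_loop(fmla_loop, n)` with the same large `n` on the target board gives the two times being compared here.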