How to force the branch target to be 4 byte aligned? Any compiler options to do that? I am using arm A15.

YP.Lu over 4 years ago

I have a tight loop, listed below, in Thumb-2.

PC=0x700abdfe DISASS="MRRC p15,#0,r0,r1,c14"
PC=0x700abe02 DISASS="SUBS r0,r0,r2"
PC=0x700abe04 DISASS="SBC r1,r1,r3"
PC=0x700abe08 DISASS="SUBS r0,r0,r4"
PC=0x700abe0a DISASS="SBCS r1,r1,r5"
PC=0x700abe0c DISASS="BCC {pc}-0xe ; 0x700abdfe"

The target address, 0x700abdfe, is not 4-byte aligned, and the performance hit is huge.

How can I force the branch target to be 4-byte aligned? Are there any compiler options to do that? I am using an Arm Cortex-A15.
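
As a quick check on the addresses in the listing, here is a minimal C sketch of the alignment arithmetic. The helper names `is_aligned` and `align_up` are illustrative only and do not come from any toolchain. The sketch shows why 0x700abdfe counts as 2-byte aligned but not 4-byte aligned; it does not change where a compiler places the loop. With a GNU-style toolchain, an option such as `-falign-loops=4` or a `.balign 4` directive before the loop label is one way to request that, but the thread does not say which compiler is in use, so treat that as an assumption.

```c
#include <stdbool.h>
#include <stdint.h>

/* Returns true when addr is a multiple of alignment.
 * alignment must be a power of two (2, 4, 8, ...). */
static bool is_aligned(uint32_t addr, uint32_t alignment)
{
    return (addr & (alignment - 1u)) == 0u;
}

/* Rounds addr up to the next multiple of alignment (a power of two).
 * An address that is already aligned is returned unchanged. */
static uint32_t align_up(uint32_t addr, uint32_t alignment)
{
    return (addr + alignment - 1u) & ~(alignment - 1u);
}
```

For the branch target above, `is_aligned(0x700abdfeu, 4u)` is false and `align_up(0x700abdfeu, 4u)` gives 0x700abe00, so two bytes of padding before the loop would put the target on a 4-byte boundary.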


YP.Lu over 4 years ago in reply to Ronan Synnott

Thank you very much Ronan for getting back to me on this!

Also, we are using a pre-silicon dual-core A15 model. The question is: why, when this happens (the tight loop running slowly), does the timer register slow down too? Ultimately, this function is waiting for a certain number of timer increments to pass. If a few CPU cycles of efficiency are lost, it seems like it should be a don’t care? We’re looping around a *lot* of times just waiting for the timer to expire.
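
For readers following along, the wait described here can be sketched in C roughly as below. This is an interpretation of the disassembly in the original post, not the poster's source: it assumes r3:r2 hold a 64-bit start value and r5:r4 hold the number of ticks to wait, so that the two subtract-with-borrow pairs followed by BCC amount to "loop while (counter - start) < ticks". The `MRRC p15,#0,r0,r1,c14` instruction reads a 64-bit counter (CNTPCT on ARMv7-A); here it is replaced by a hypothetical fake counter so the sketch can run on any host.

```c
#include <stdint.h>

/* Hypothetical stand-in for the hardware counter read (the MRRC in the
 * listing). Each read advances the fake counter by a fixed step. */
static uint64_t fake_counter_value = 0;
static uint64_t fake_counter_step = 1;

static uint64_t read_counter(void)
{
    fake_counter_value += fake_counter_step;
    return fake_counter_value;
}

/* Spins until at least `ticks` counter increments have passed since
 * `start`. Unsigned subtraction keeps the comparison correct even if
 * the counter wraps around. Returns how many times the loop body ran. */
static uint64_t wait_ticks(uint64_t start, uint64_t ticks)
{
    uint64_t iterations = 0;
    while ((read_counter() - start) < ticks) {
        iterations++;
    }
    return iterations;
}
```

In this form the exit condition depends only on the counter value, so a slower loop body changes how many iterations run, not how many ticks have to elapse.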

YP.Lu over 4 years ago in reply to YP.Lu

What we’d really be asking is: “Does this alignment inefficiency cause the Arm Fast Model to take an inordinate amount of time to simulate?”

Ronan Synnott over 4 years ago in reply to YP.Lu

Hello again, from a simple test I cannot replicate such behavior.

It is probably best to raise an official support case from the support menu above. You may need to share the code as well as the system you are running to help investigate thoroughly.

YP.Lu over 4 years ago in reply to Ronan Synnott

Okay, I will open an official case. Thank you very much Ronan!

YP.Lu over 4 years ago in reply to YP.Lu

Isn't this one official? I opened this case through the menu -> ...