According to the ARM Architecture Procedure Call Standard (AAPCS), on the ARMv6-M and ARMv7-M architectures:
"Although the processor hardware allows SP to be at any word aligned address at function boundaries, standard programming practice requires C program code to ensure that the SP is at a 64-bit (doubleword) aligned address."
What does it mean that the stack pointer has to be at a 64-bit aligned address?
Hello,
as far as I remember, 8-byte alignment of the exception stack frame is required only for the Cortex-M7.
For the other Cortex-M cores, 4-byte alignment is enough. My guess is that the Cortex-M7 needs 8-byte alignment because its internal AXI bus is 64 bits wide.
In the "ARM®v7-M Architecture Reference Manual [ARM DDI 0403E.b (ID120114)]", section B1.5.7 says the following.
B1.5.7 Stack alignment on exception entry
The ARMv7-M architecture guarantees that stack pointer values are at least 4-byte aligned.
However, some software standards require the stack pointer to be 8-byte aligned, and the architecture can enforce this alignment. The CCR.STKALIGN bit indicates whether, as part of an exception entry, the processor aligns the SP to 4 bytes, or to 8 bytes. It is IMPLEMENTATION DEFINED whether this bit is:
- RW, in which case its reset value is IMPLEMENTATION DEFINED.
- RO, in which case it is RAO, indicating 8-byte SP alignment.
Also, according to the Cortex-M4 TRM, CCR.STKALIGN is R/W although the initial value is 1 (i.e. 8-byte align). According to the Cortex-M7 TRM, CCR.STKALIGN is R/O and the initial value is 1 (i.e. 8-byte align). As for the Cortex-M0+ and Cortex-M3 TRMs, it is unclear whether CCR.STKALIGN can be modified, but the initial value is 1. I think it can be modified.
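To make the effect of CCR.STKALIGN on exception entry concrete, here is a minimal host-side sketch in C. It only models the SP arithmetic for a basic eight-word frame (no floating-point context). The names `simulate_exception_entry` and `exc_entry_t` are made up for illustration, and the code is a simplified model, not a reproduction of the manual's pseudocode.

```c
#include <assert.h>
#include <stdint.h>

#define CCR_STKALIGN_BIT 9u    /* STKALIGN is bit 9 of the CCR */
#define BASIC_FRAME_SIZE 0x20u /* eight words: R0-R3, R12, LR, PC, xPSR */

/* Hypothetical result type for this sketch. */
typedef struct {
    uint32_t frame_ptr; /* SP value after the frame has been pushed */
    uint32_t realigned; /* 1 if a 4-byte pad was inserted (recorded in bit 9 of the stacked xPSR) */
} exc_entry_t;

/* Model of the SP adjustment at exception entry.
 * sp  : SP before entry (the architecture guarantees 4-byte alignment)
 * ccr : value of the Configuration and Control Register */
static exc_entry_t simulate_exception_entry(uint32_t sp, uint32_t ccr)
{
    assert((sp & 3u) == 0u);

    uint32_t forcealign = (ccr >> CCR_STKALIGN_BIT) & 1u;
    uint32_t spmask = ~(forcealign << 2); /* clears SP bit 2 only when STKALIGN = 1 */

    exc_entry_t r;
    r.realigned = ((sp >> 2) & 1u) & forcealign;
    r.frame_ptr = (sp - BASIC_FRAME_SIZE) & spmask;
    return r;
}
```

For example, with SP = 0x20001004 and STKALIGN = 1 the frame starts at 0x20000FE0 and the realign flag is set. With STKALIGN = 0 the frame starts at 0x20000FE4, which is only 4-byte aligned.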
Best regards,
Yasuhiko Koumoto.
Hi Yasuhiko,
In Cortex-M0 and Cortex-M0+ processors, CCR.STKALIGN is fixed to 1, so they always enforce 8-byte stack alignment for exception stack frames.
When setting up the initial SP values, the values should be 8-byte aligned.
At function calls, the SP value at the call boundary should be 8-byte aligned. It is okay to have 4-byte alignment in the middle of a function, as long as the SP value is adjusted back to 8-byte alignment before the function ends or makes a function call.
If a function is coded in assembly language and it calls another assembly function that you know does not require 8-byte stack alignment, you can have 4-byte stack alignment at that function call. But if you are calling a C function, then 8-byte stack alignment is required, as the C compiler might make assumptions about the SP value in pointer handling.
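As a small illustration of the initial-SP rule, the sketch below rounds a candidate stack top down to an 8-byte boundary and checks alignment. The helper names `align_down` and `is_aligned` are invented for this example. In a real project the initial SP usually comes from the linker script or vector table, so this is just the arithmetic.

```c
#include <assert.h>
#include <stdint.h>

/* Round addr down to a multiple of align (align must be a power of two). */
static uintptr_t align_down(uintptr_t addr, uintptr_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return addr & ~(align - 1);
}

/* Return 1 if addr is a multiple of align, otherwise 0. */
static int is_aligned(uintptr_t addr, uintptr_t align)
{
    return align_down(addr, align) == addr;
}
```

For instance, `align_down(0x20001FFC, 8)` gives 0x20001FF8, which is a valid AAPCS stack top, whereas 0x20001FFC itself is only 4-byte aligned.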
Hope this helps.
regards,
Joseph
I'm pretty certain LDRD only requires 4-byte alignment wherever it is implemented, but yes, I think it is probably a good idea to try to get doubleword alignment in case the bus supports 8-byte transfers. It also just looks wrong to have 64-bit items split over a doubleword boundary.
For some legacy processors, LDRD in ARM state has an 8-byte alignment requirement. But there is no such requirement for LDRD in ARMv7-M.
Thanks for that. As they say it's the things you know but ain't so that catch you.
Yes, indeed, this is helpful and good information.
So there will be no problems with alignment faults regarding LDRD if SP is not aligned on an 8-byte boundary.
I would expect that keeping the 8-byte boundary might help when it comes to cache and speed.
I read about this recently, as I saw some redundant re-alignment code in a Reset_Handler.
For Cortex-M3, Cortex-M4 and Cortex-M7, it'll be beneficial to align the stack on an 8-byte boundary, since the LDRD instruction would require this.
In other words: If you pass a 64-bit integer on the stack to a subroutine, for instance as the third or fourth parameter, then it'll be stored on the stack.
The compiler would most likely use LDRD / STRD for reading/writing the 64-bit integer.
There is probably no benefit in aligning the stack on an 8-byte boundary on the Cortex-M0 and Cortex-M0+, except if the compiler is making assumptions like jyiu says. I would expect this to happen if the compiler generates forward-compatible code (that is sometimes necessary if two different cores, e.g. Cortex-M0 and Cortex-M4, are implemented on the same chip and share code).
You can find more information about this in the ARM Information Center.
As for my own use, my own subroutines and code do not require an 8-byte aligned stack pointer (not even my C code).
Unfortunately, it would not be 100% safe to turn it off, unless your compiler has a switch to ignore 8-byte stack alignment and you rebuild your C libraries so you can link against compatible libraries.
(The calling convention must match across all linked code; otherwise your code may suddenly fail "for no apparent reason".)
Yes. For processors with a 64-bit interface (e.g. Cortex-M7), having such data alignment would be more efficient. It could also make debugging easier (e.g. when setting up a data watchpoint).
It also avoids corner cases where a 64-bit access goes across boundaries of different memory types, which can result in unpredictable behavior according to the architecture specification.
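The boundary corner case can be sketched with a small check. The function name `access_crosses_boundary` and the 4 KB boundary used in the example are assumptions for illustration only. Real memory-type boundaries depend on the memory map and MPU configuration.

```c
#include <assert.h>
#include <stdint.h>

/* Return 1 if an access of `size` bytes starting at `addr` spans two
 * different `boundary`-sized blocks (boundary must be a power of two). */
static int access_crosses_boundary(uint32_t addr, uint32_t size, uint32_t boundary)
{
    assert(size > 0u);
    assert(boundary != 0u && (boundary & (boundary - 1u)) == 0u);
    return (addr / boundary) != ((addr + size - 1u) / boundary);
}
```

An 8-byte access at 0x20000FFC (4-byte aligned only) straddles the 0x20001000 boundary, while an 8-byte access at the 8-byte aligned address 0x20000FF8 stays on one side. An 8-byte aligned doubleword can never straddle a boundary that is itself a multiple of 8.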
Hello Joseph Yiu,
thank you for your detailed explanation.
I had misunderstood.
My understanding had been that CCR.STKALIGN=1 makes the stack pointer 8-byte aligned at interrupt or exception entry.
That's correct. CCR.STKALIGN only affects exception stack frames. In Cortex-M0, Cortex-M0+ and Cortex-M7 processors, this bit is fixed to 1, which always ensures that the exception stack frames are doubleword aligned.
However, software still has to be implemented correctly to be AAPCS compliant.