According to the ARM Architecture Procedure Call Standard (AAPCS) for the ARMv6-M and ARMv7-M architectures:
"Although the processor hardware allows SP to be at any word aligned address at function boundaries, standard programming practice requires C program code to ensure that the SP is at a 64-bit (doubleword) aligned address."
What does it mean that the stack pointer has to be at a 64-bit aligned address?
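As a quick illustration of the terms in the question: an address is doubleword (64-bit) aligned when it is a multiple of 8, which means its three lowest bits are zero. Below is a minimal sketch in C; the function names are made up for this example and are not part of any ARM header.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True when addr is a multiple of 8, i.e. its three lowest bits are zero. */
bool is_doubleword_aligned(uint32_t addr)
{
    return (addr & 0x7u) == 0u;
}

/* True when addr is a multiple of 4 (word aligned). */
bool is_word_aligned(uint32_t addr)
{
    return (addr & 0x3u) == 0u;
}

/* Hypothetical debug helper: trap if sp breaks the 8-byte rule. */
void require_doubleword_aligned(uint32_t sp)
{
    assert(is_doubleword_aligned(sp));
}
```

So an SP of 0x20001000 is doubleword aligned, while 0x20001004 is only word aligned.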
Hello,
If I remember correctly, 8-byte alignment of the exception stack frame is required only on the Cortex-M7.
For the other Cortex-M cores, 4-byte alignment is enough. My guess is that the Cortex-M7 needs 8-byte alignment because its internal AXI bus is 64 bits wide.
Section B1.5.7 of the "ARM®v7-M Architecture Reference Manual [ARM DDI 0403E.b (ID120114)]" says the following.
B1.5.7 Stack alignment on exception entry
The ARMv7-M architecture guarantees that stack pointer values are at least 4-byte aligned.
However, some software standards require the stack pointer to be 8-byte aligned, and the architecture can enforce this alignment. The CCR.STKALIGN bit indicates whether, as part of an exception entry, the processor aligns the SP to 4 bytes, or to 8 bytes. It is IMPLEMENTATION DEFINED whether this bit is:
- RW, in which case its reset value is IMPLEMENTATION DEFINED.
- RO, in which case it is RAO, indicating 8-byte SP alignment.
Also, according to the Cortex-M4 TRM, CCR.STKALIGN is R/W, although its initial value is 1 (i.e. 8-byte alignment). According to the Cortex-M7 TRM, CCR.STKALIGN is R/O and its value is 1 (i.e. 8-byte alignment). For Cortex-M0+ and Cortex-M3, the TRMs do not make clear whether CCR.STKALIGN can be modified, but the initial value is 1. I think it can be modified.
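To show what the STKALIGN bit changes, here is a small host-side model in C of how the stack frame address is chosen on exception entry. It follows my reading of the exception entry behaviour in the ARMv7-M manual and assumes a basic 8-word frame with no floating-point context. The type and function names are made up for this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Basic frame: r0-r3, r12, lr, return address, xPSR (8 words). */
#define BASIC_FRAME_SIZE 0x20u

typedef struct {
    uint32_t frame_ptr; /* SP value after the frame has been pushed */
    bool     padded;    /* true if 4 bytes of padding were inserted;
                           the hardware records this in bit 9 of the
                           stacked xPSR so it can be undone on return */
} exception_frame_t;

/* Model of the frame placement for a given SP and CCR.STKALIGN setting. */
exception_frame_t exception_entry_frame(uint32_t sp, bool stkalign)
{
    exception_frame_t f;
    uint32_t mask = stkalign ? ~0x4u : ~0x0u;

    assert((sp & 0x3u) == 0u); /* SP is always at least word aligned */

    f.padded    = stkalign && ((sp & 0x4u) != 0u);
    f.frame_ptr = (sp - BASIC_FRAME_SIZE) & mask;
    return f;
}
```

With STKALIGN=1 and SP=0x20001004 the frame starts at 0x20000FE0 with padding; with STKALIGN=0 it starts at 0x20000FE4, which is only 4-byte aligned.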
Best regards,
Yasuhiko Koumoto.
Hi Yasuhiko,
In Cortex-M0 and Cortex-M0+ processors, CCR.STKALIGN is fixed to 1, so these processors always enforce 8-byte stack alignment for exception stack frames.
When setting up the initial SP values, the values should be 8-byte aligned.
At function calls, the SP value at the function call boundary should be 8-byte aligned. It is okay to have 4-byte alignment in the middle of a function, as long as the SP value is adjusted back to 8-byte alignment before the function ends or another function call is made.
If a function is coded in assembly language, and it calls another assembly function that you know does not require 8-byte stack alignment, you can have 4-byte stack alignment at that function call. But if you are calling a C function, then 8-byte stack alignment is required, as the C compiler might make assumptions about the SP value in pointer handling.
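As a sketch of the initial SP point above: if the top of the stack region is computed from a base and a size, rounding it down to a multiple of 8 gives a safe initial SP (the stack grows downwards, so rounding down stays inside the region). The helper names are hypothetical; in a real project the linker script or startup file usually does this.

```c
#include <assert.h>
#include <stdint.h>

/* Round addr down to a multiple of alignment (must be a power of two). */
uint32_t align_down(uint32_t addr, uint32_t alignment)
{
    assert(alignment != 0u && (alignment & (alignment - 1u)) == 0u);
    return addr & ~(alignment - 1u);
}

/* Initial SP for a full-descending stack occupying
   [stack_base, stack_base + stack_size). */
uint32_t initial_sp(uint32_t stack_base, uint32_t stack_size)
{
    return align_down(stack_base + stack_size, 8u);
}
```

For example, a region that ends at 0x20001004 would get an initial SP of 0x20001000.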
Hope this helps.
regards,
Joseph
I'm pretty certain LDRD only requires 4-byte alignment wherever it is implemented, but yes, I think it is probably a good idea to try to get doubleword alignment in case the bus supports 8-byte transfers. It also just looks wrong to have 64-bit items split over a doubleword boundary.
For some legacy processors, LDRD in ARM state has an 8-byte alignment requirement. But there is no such requirement for LDRD in ARMv7-M.
Thanks for that. As they say it's the things you know but ain't so that catch you.
Yes, indeed, this is helpful and good information.
So there will be no problems with alignment faults from LDRD if SP is not aligned on an 8-byte boundary.
I would expect that keeping the 8-byte boundary might help when it comes to cache and speed.
I read about this recently, as I saw some redundant re-alignment code in a Reset_Handler.
For Cortex-M3, Cortex-M4 and Cortex-M7, it'll be beneficial to align the stack on an 8-byte boundary, since the LDRD instruction would require this.
In other words: If you pass a 64-bit integer on the stack to a subroutine, for instance as the third or fourth parameter, then it'll be stored on the stack.
The compiler would most likely use LDRD / STRD for reading/writing the 64-bit integer.
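To make the stack-passing case concrete: as far as I understand the AAPCS, an argument that goes on the stack is placed at the next free stack offset rounded up to the argument's own alignment, so a 64-bit integer lands at an offset from SP that is a multiple of 8. That only gives an 8-byte aligned address if SP itself is 8-byte aligned at the call. A small sketch; `stack_arg_offset` is a made-up helper, not something from a toolchain.

```c
#include <assert.h>
#include <stdint.h>

/* Offset (from SP at the call) at which a stacked argument is placed,
   given the next free offset and the argument's alignment (4 or 8). */
uint32_t stack_arg_offset(uint32_t next_free, uint32_t alignment)
{
    assert(alignment == 4u || alignment == 8u);
    return (next_free + alignment - 1u) & ~(alignment - 1u);
}
```

For example, with one 32-bit argument already on the stack (next free offset 4), a following 64-bit argument goes to offset 8, leaving 4 bytes of padding.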
There is probably no benefit in aligning the stack on an 8-byte boundary on the Cortex-M0 and Cortex-M0+, except if the compiler is making assumptions, as jyiu says. I would expect this to happen if the compiler generates forward-compatible code (which is sometimes necessary if two different cores, e.g. Cortex-M0 and Cortex-M4, are implemented on the same chip and share code).
You can find more information about this in the ARM Information Center.
As for my own use, my own subroutines and code do not require an 8-byte aligned stack pointer (not even my C code).
Unfortunately, it would not be 100% safe to turn it off, unless your compiler has a switch to ignore 8-byte stack alignment and you rebuild your C libraries so you can link against compatible libraries.
(The calling convention must match across all linked code, otherwise your code may suddenly fail "for no apparent reason".)
Yes. For processors with a 64-bit interface (e.g. Cortex-M7), having such data alignment would be more efficient. It could also make debugging easier (e.g. when setting up a data watchpoint).
It also avoids corner cases where a 64-bit access goes across the boundary between different memory types, which can result in unpredictable behavior according to the architecture specification.
Hello Joseph Yiu,
thank you for your detailed explanation.
I had misunderstood.
My understanding had been that CCR.STKALIGN=1 made the stack pointer 8-byte aligned on interrupt or exception entry.
That's correct. CCR.STKALIGN only affects exception stack frames. In Cortex-M0, Cortex-M0+ and Cortex-M7 processors, this bit is fixed to 1, which always ensures that the exception stack frames are doubleword aligned.
However, software still has to be implemented correctly to be AAPCS compliant.