I'm currently developing software for an ARM Cortex-M4 MCU. We rely on a third-party SDK that is not compatible with ARM compiler 6, so we still use ARM compiler 5.
I stumbled upon a weird bug that occurs with both v5.06 and v5.07 on Windows when optimisation level -O2 is enabled.
What I observed happens in the context of a large project, and it seems rather difficult to construct a minimal example, so I will just describe what seems to happen using "imaginary" code...
Suppose I have the following function:
typedef struct { ... } some_struct_t;

static void do_stuff(void)
{
    some_struct_t data;
    [...]
    memset(&data, 0, sizeof(data));
    [...]
}
Upon function entry, PUSH {R0-R8, LR} is executed. I would expect the compiler to then adjust the stack pointer to reserve space for the local data structure, but this does not happen. Instead, the address passed to memset is [Stack Pointer + 4], so the memset overwrites registers that were just pushed onto the stack.
The calling function had stored a memory address in R4 before calling do_stuff(). Upon leaving do_stuff(), POP {R0-R8, PC} is executed, and R4 now contains whatever was written to the data structure at the offset that corresponds to the stacked R4.
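To make the overlap concrete, here is a small host-side model of the stack frame described above. It is only a sketch and not the code from my project: the 16-byte struct size is an assumption inferred from which slots get overwritten, the tenth slot is labelled LR as an assumption, and `simulate_clobber` and the `SLOT_*` names are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* One slot per register stored by the ten-register PUSH.
   Index 0 is the lowest address, i.e. where SP points after the PUSH. */
enum {
    SLOT_R0, SLOT_R1, SLOT_R2, SLOT_R3, SLOT_R4,
    SLOT_R5, SLOT_R6, SLOT_R7, SLOT_R8,
    SLOT_LR,            /* assumption: the tenth pushed register is LR */
    SLOT_COUNT
};

#define ASSUMED_STRUCT_SIZE 16u /* assumption: sizeof(some_struct_t) */
#define OBSERVED_OFFSET      4u /* memset receives SP + 4 */

/* Fills the model frame with recognisable values, repeats the memset that
   the compiled code performs, and returns a bitmask of the slots whose
   contents changed (bit n set means slot n was overwritten). */
unsigned simulate_clobber(uint32_t frame[SLOT_COUNT])
{
    uint32_t before[SLOT_COUNT];
    unsigned changed = 0;
    unsigned i;

    for (i = 0; i < SLOT_COUNT; i++) {
        frame[i] = 0xA0000000u + i;
    }
    memcpy(before, frame, sizeof before);

    /* The write must stay inside the modelled frame. */
    assert(OBSERVED_OFFSET + ASSUMED_STRUCT_SIZE <= sizeof before);
    memset((unsigned char *)frame + OBSERVED_OFFSET, 0, ASSUMED_STRUCT_SIZE);

    for (i = 0; i < SLOT_COUNT; i++) {
        if (frame[i] != before[i]) {
            changed |= 1u << i;
        }
    }
    return changed;
}
```

Under these assumptions the returned mask has bits 1 to 4 set, meaning the stacked R1-R4 are overwritten while R0 and R5 upwards survive, which matches the corrupted R4 the caller sees after the POP.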
Has anyone seen this behaviour, or does anyone have a reasonable explanation for it? To me it looks like a code size optimisation gone wrong: a PUSH of 10 registers is used to also reserve some space on the stack. Without more detailed knowledge of how the compiler/optimizer works, I would expect the compiler not to use PUSH to reserve space for local variables, but instead to emit SUB sp,sp,[number of required bytes] after the PUSH instruction (which is what the compiler normally seems to do).
I did manage to work around the issue by using an initializer for the struct instead of memset(). With this change the stacked R0-R3 are overwritten (instead of R1-R4), which *seems* not to cause any issues, as the value in the stacked R4 *seems* to be the only stacked data actually used after leaving the function. This still leaves me with quite an uneasy feeling, though, as I cannot explain why, when or where this behaviour occurs, or whether it can cause further issues...
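For reference, the two ways of zeroing the struct look roughly like this. Again a sketch only: the four `uint32_t` fields are hypothetical stand-ins for the elided struct contents, and `both_forms_zero` is a made-up helper that just checks on a host build that both forms produce the same bytes.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical layout: the real fields are elided in the post. */
typedef struct {
    uint32_t a;
    uint32_t b;
    uint32_t c;
    uint32_t d;
} some_struct_t;

/* Returns 1 if the memset() form and the initializer form leave the
   struct with identical contents, 0 otherwise. */
int both_forms_zero(void)
{
    some_struct_t via_memset;       /* original code: zeroed by memset() */
    some_struct_t via_init = {0};   /* workaround: zeroed by initializer */

    /* With four 32-bit members there is no padding to worry about. */
    assert(sizeof(some_struct_t) == 4 * sizeof(uint32_t));

    memset(&via_memset, 0, sizeof(via_memset));
    return memcmp(&via_memset, &via_init, sizeof(via_init)) == 0;
}
```

Both forms are equivalent at the C level, so the difference I see can only come from the code the compiler generates for each of them.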