Hello,
I am working on an IoT project, mixing C and C++, and I am having stack issues with lambdas.
The following code was compiled with gcc-arm-none-eabi-8-2018-q4-major-win32 at -Os and runs on a NUCLEO-L476RG. I monitored stack usage with Ozone.
    typedef struct structTest {
        uint32_t var1;
        uint32_t var2;
    } structTest;

    // Test 1
    int main() {
        dostuff( [&]() -> structTest {
            structTest $;
            $.var1 = 0;
            $.var2 = 0;
            $.var2 = 24;
            $.var1 = 48;
            return $;
        }() );
    }

    // Test 2
    int main() {
        dostuff( [&]() -> structTest {
            structTest $;
            $.var1 = 0;
            $.var2 = 0;
            $.var1 = 48;
            return $;
        }() );
        dostuff( [&]() -> structTest {
            structTest $;
            $.var1 = 0;
            $.var2 = 0;
            $.var2 = 13;
            $.var1 = 42;
            return $;
        }() );
    }
We have some complex macros that let us make sure structures are always initialized before use, and those macros generate code similar to the above. "structTest $; $.var1 = 0; $.var2 = 0;" is always generated; the macros then assign the user's values to the corresponding fields.
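For readers following along, here is a minimal sketch of what such a macro could look like. The macro name MAKE_STRUCT_TEST, the helper makeStructTestExample, and the temporary's name tmp are all hypothetical: the project's real macros are not shown in this thread, and the generated code above uses `$` (accepted by GCC as an extension) rather than tmp. Only the zero-then-assign pattern is taken from the post.

```cpp
#include <cassert>
#include <cstdint>

typedef struct structTest {
    uint32_t var1;
    uint32_t var2;
} structTest;

// Hypothetical macro (not the project's real one): builds a structTest in an
// immediately invoked lambda, zeroing every field before applying the
// caller's assignments, so no field is ever left uninitialized.
#define MAKE_STRUCT_TEST(...)   \
    [&]() -> structTest {       \
        structTest tmp;         \
        tmp.var1 = 0;           \
        tmp.var2 = 0;           \
        __VA_ARGS__;            \
        return tmp;             \
    }()

// Usage: fields the caller does not mention keep their zero default.
inline void makeStructTestExample() {
    structTest a = MAKE_STRUCT_TEST(tmp.var2 = 24, tmp.var1 = 48);
    assert(a.var1 == 48 && a.var2 == 24);

    structTest b = MAKE_STRUCT_TEST(tmp.var1 = 48);
    assert(b.var1 == 48 && b.var2 == 0);
}
```

Because the lambda uses a [&] capture default, a macro written this way can only be expanded inside a function body.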
The expected behavior in both tests was that only 8 bytes of stack would be used for data. This is the case in Test 1, but Test 2 uses 16 bytes.
Is there any way to keep this kind of structure but force the compiler to reuse the stack? -fconserve-stack and -fstack-reuse=all both had no effect.
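One possible workaround, offered only as an untested sketch: move each call into its own non-inlined helper, so that each temporary belongs to a frame that is released when the helper returns. The helper names firstCall, secondCall and test2, the lastSeen variable, and the stand-in body of dostuff are assumptions made for illustration, and the temporary is named tmp instead of `$`. Whether this actually lowers peak stack usage on this toolchain would have to be checked, for example with Ozone or with GCC's -fstack-usage option.

```cpp
#include <cassert>
#include <cstdint>

typedef struct structTest {
    uint32_t var1;
    uint32_t var2;
} structTest;

// Stand-in for the real dostuff(), assumed here to take the struct by value.
// It only records what it received so the sketch can be checked.
static structTest lastSeen;
void dostuff(structTest s) { lastSeen = s; }

// Each call site lives in its own non-inlined function (GCC/Clang attribute),
// so its temporary belongs to a frame that is popped on return.
__attribute__((noinline)) static void firstCall() {
    dostuff([&]() -> structTest {
        structTest tmp;
        tmp.var1 = 0;
        tmp.var2 = 0;
        tmp.var1 = 48;
        return tmp;
    }());
}

__attribute__((noinline)) static void secondCall() {
    dostuff([&]() -> structTest {
        structTest tmp;
        tmp.var1 = 0;
        tmp.var2 = 0;
        tmp.var2 = 13;
        tmp.var1 = 42;
        return tmp;
    }());
}

// Would replace the body of main() in Test 2.
void test2() {
    firstCall();
    assert(lastSeen.var1 == 48 && lastSeen.var2 == 0);
    secondCall();
    assert(lastSeen.var1 == 42 && lastSeen.var2 == 13);
}
```

The trade-off is one extra call and return per call site, which may or may not be acceptable at -Os.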
I also can't find documentation on the optimization behavior expected for lambda functions. If anyone has a link, I'll be grateful.
Hi Tamar Christina,
Thanks a lot for the in-depth explanation.
If I understood correctly, the stack slots are reused, and should be reused no matter how many times dostuff( [&]() -> structTest{...}() ) is called, but because of the copy of an unnamed variable, GCC does not realize the stack is being reused.
Strangely enough, I cannot reproduce the reuse of the stack slots.
Here is the assembly I get:
    _Z14wrapper2LAMBDAv
    $Thumb
    {
    08001404  PUSH   {R4-R6, LR}
    08001406  LDR    R4, =_etext
    08001408  MOV    R6, R4
    0800140A  LDM    R6!, {R0-R3}
    {
        0800140C  SUB    SP, SP, #0x50
        0800140E  ADD    R5, SP, #0x10
        08001410  STM    R5!, {R0-R3}
        08001412  LDM.W  R6, {R0-R3}
        08001416  STM.W  R5, {R0-R3}
        0800141A  ADD    R3, SP, #0x20
        0800141C  LDM    R3, {R0-R3}
        0800141E  STM.W  SP, {R0-R3}
        08001422  ADD    R5, SP, #0x10
        08001424  LDM.W  R5, {R0-R3}
        08001428  ADDS   R4, #0x20
        0800142A  BL     _Z7doStuff10TestStruct
        0800142E  LDM    R4!, {R0-R3}
        08001430  ADD    R5, SP, #0x30
        08001432  STM    R5!, {R0-R3}
        08001434  LDM.W  R4, {R0-R3}
        08001438  STM.W  R5, {R0-R3}
        0800143C  ADD    R3, SP, #0x50
        0800143E  LDMDB  R3, {R0-R3}
        08001442  STM.W  SP, {R0-R3}
        08001446  ADD    R4, SP, #0x30
        08001448  LDM.W  R4, {R0-R3}
        0800144C  BL     _Z7doStuff10TestStruct
    }
    08001450  ADD    SP, SP, #0x50
    08001452  POP    {R4-R6, PC}
Do you know if a fix is in the making, and if I should post a bug report directly to GCC (or are they already aware of this problem)?
Hi B_Cartier,
Hmm, you're right: on Arm it doesn't reuse the stack slots. I'm not sure why that is, but an upstream ticket to GCC would be the best course of action here.
There are two bugs here: the stack slot is not reused, and the dead store is not removed. The latter is a known issue; I am not sure about the former.
The failure to remove the dead store is a fairly old issue that affects all architectures.
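To make the term concrete, here is a small sketch of what "dead store" means for the generated pattern. The function names are made up for illustration, and this is only an illustration of the term, not a reproducer of the missed optimization discussed in this thread.

```cpp
#include <cassert>
#include <cstdint>

typedef struct structTest {
    uint32_t var1;
    uint32_t var2;
} structTest;

// The pattern the macros emit: zero first, then overwrite.
structTest makeWithDeadStores() {
    structTest tmp;
    tmp.var1 = 0;   // dead store: overwritten below before anything reads it
    tmp.var2 = 0;   // dead store: overwritten below before anything reads it
    tmp.var2 = 24;
    tmp.var1 = 48;
    return tmp;
}

// What the code should ideally reduce to once the dead stores are eliminated.
structTest makeWithoutDeadStores() {
    structTest tmp;
    tmp.var2 = 24;
    tmp.var1 = 48;
    return tmp;
}

// Both versions are observably identical, which is why the zeroing stores
// are safe to remove.
inline void checkEquivalent() {
    structTest a = makeWithDeadStores();
    structTest b = makeWithoutDeadStores();
    assert(a.var1 == b.var1 && a.var2 == b.var2);
}
```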
Cheers,
Tamar
I will post a ticket to GCC then.
I guess that, because the stack slots are not reused, failing to remove the dead store is not really a bug in this particular case, since the store is not really dead anymore.
If it is an old issue, we can only hope a fix is in the making; that would greatly help our project.
Thanks a lot for your time. If you want, I'll keep you posted on any answer I get from GCC.
B_Cartier