Arm GCC lambda optimization

Hello,

I am working on an IoT project, mixing C and C++, and I am having stack issues with lambdas.

The following code was compiled with gcc-arm-none-eabi-8-2018-q4-major-win32 at -Os and runs on a NUCLEO-L476RG. I monitored stack usage with Ozone.

typedef struct structTest
{
    uint32_t var1;
    uint32_t var2;
} structTest;

// Test 1
int main()
{
    dostuff( [&]() -> structTest{ structTest $; $.var1 = 0; $.var2 = 0; $.var2 = 24; $.var1 = 48; return $; }() );
}

// Test 2
int main()
{
    dostuff( [&]() -> structTest{ structTest $; $.var1 = 0; $.var1 = 0; $.var1 = 48; return $; }() );

    dostuff( [&]() -> structTest{ structTest $; $.var1 = 0; $.var1 = 0; $.var2 = 13; $.var1 = 42; return $; }() );
}

We have some complex macros that enable us to make sure structures are always initialized before use, and those macros generate code similar to the above. "structTest $; $.var1 = 0; $.var2 = 0;" is always generated, and the macros then append the user's values for the corresponding fields.
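The real macros are not shown in this post, so the following is only a minimal sketch of the pattern described above. The macro name MAKE_STRUCT_TEST, the local name s (used instead of $, which is a compiler extension in identifiers), and the dostuff stand-in are all assumptions made for illustration.

```cpp
#include <cassert>
#include <cstdint>

struct structTest {
    std::uint32_t var1;
    std::uint32_t var2;
};

// Hypothetical macro (not the project's real one): zero every field first,
// then apply the caller's assignments, all inside an immediately-invoked lambda.
#define MAKE_STRUCT_TEST(assignments) \
    [&]() -> structTest {             \
        structTest s;                 \
        s.var1 = 0;                   \
        s.var2 = 0;                   \
        assignments                   \
        return s;                     \
    }()

// Stand-in for the real dostuff(), which is not shown in the post.
std::uint32_t dostuff(structTest t) { return t.var1 + t.var2; }

// A field the caller does not mention still ends up zeroed.
void check_defaults() {
    structTest t = MAKE_STRUCT_TEST(s.var1 = 48;);
    assert(t.var1 == 48);
    assert(t.var2 == 0);
}
```

A call such as dostuff(MAKE_STRUCT_TEST(s.var2 = 24; s.var1 = 48;)) then expands to roughly the same shape as Test 1.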

The expected behavior in Tests 1 and 2 was that only 8 bytes of stack would be used for data. This is the case in Test 1, but Test 2 uses 16 bytes.

Is there any way to keep this kind of structure but force the compiler to reuse the stack? Neither -fconserve-stack nor -fstack-reuse=all had any effect.
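One possible workaround, sketched here under assumptions and not measured on the target, is to move each call into its own non-inlined helper so that each temporary lives in a frame that is released on return. The helper names, the g_sum accumulator, and the dostuff stand-in are hypothetical. __attribute__((noinline)) is a GCC function attribute. The trade-off is an extra call per use, and it changes the shape of the generated code, so it may not fit the macros as they stand.

```cpp
#include <cassert>
#include <cstdint>

struct structTest {
    std::uint32_t var1;
    std::uint32_t var2;
};

// Stand-in for the real dostuff(): accumulates the fields so the calls are observable.
static std::uint32_t g_sum = 0;
void dostuff(structTest t) { g_sum += t.var1 + t.var2; }

// Each helper owns one temporary. noinline stops GCC from merging the helper
// bodies into the caller, where the two temporaries could get separate slots.
__attribute__((noinline)) static void call_first() {
    dostuff([&]() -> structTest {
        structTest s; s.var1 = 0; s.var2 = 0; s.var1 = 48; return s;
    }());
}

__attribute__((noinline)) static void call_second() {
    dostuff([&]() -> structTest {
        structTest s; s.var1 = 0; s.var2 = 0; s.var2 = 13; s.var1 = 42; return s;
    }());
}

// Equivalent of Test 2, with the two calls split across helpers.
void test2_split() {
    const std::uint32_t before = g_sum;
    call_first();
    call_second();
    assert(g_sum == before + 103u);  // sanity check: 48 + (13 + 42)
}
```

Whether this actually lowers the peak stack usage would need to be confirmed in Ozone, or from the per-function figures GCC writes out when compiling with -fstack-usage.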

I also can't find documentation on the optimization behavior expected for lambda functions. If anyone has a link, I'll be grateful.

  • Hi Tamar Christina,

    Thanks a lot for the in-depth explanation.

    If I understood correctly, the stack slots are reused, and should be reused no matter how many calls to dostuff( [&]() -> structTest{...}() ) there are, but because of the copy of an unnamed variable, GCC does not realize the stack is being reused.

    Strangely enough, I cannot reproduce the reuse of the stack slots.

    Here is the assembly I get:

    _Z14wrapper2LAMBDAv
    $Thumb
    {
     08001404   PUSH         {R4-R6, LR}
     08001406   LDR          R4, =_etext            
     08001408   MOV          R6, R4
     0800140A   LDM          R6!, {R0-R3}
    {
     0800140C   SUB          SP, SP, #0x50
     0800140E   ADD          R5, SP, #0x10
     08001410   STM          R5!, {R0-R3}
     08001412   LDM.W        R6, {R0-R3}
     08001416   STM.W        R5, {R0-R3}
     0800141A   ADD          R3, SP, #0x20
     0800141C   LDM          R3, {R0-R3}
     0800141E   STM.W        SP, {R0-R3}
     08001422   ADD          R5, SP, #0x10
     08001424   LDM.W        R5, {R0-R3}
     08001428   ADDS         R4, #0x20
     0800142A   BL           _Z7doStuff10TestStruct
     0800142E   LDM          R4!, {R0-R3}
     08001430   ADD          R5, SP, #0x30
     08001432   STM          R5!, {R0-R3}
     08001434   LDM.W        R4, {R0-R3}
     08001438   STM.W        R5, {R0-R3}
     0800143C   ADD          R3, SP, #0x50
     0800143E   LDMDB        R3, {R0-R3}
     08001442   STM.W        SP, {R0-R3}
     08001446   ADD          R4, SP, #0x30
     08001448   LDM.W        R4, {R0-R3}
     0800144C   BL           _Z7doStuff10TestStruct 
    }
     08001450   ADD          SP, SP, #0x50
     08001452   POP          {R4-R6, PC}

    Do you know if a fix is in the making, and if I should post a bug report directly to GCC (or are they already aware of this problem)?
