Optimizer removes ssub16 used to set GE flags

Note: This was originally posted on 27th April 2009 at http://forums.arm.com

I'm using the RVCT 3.0 compiler with the optimizer enabled (e.g., -O3). My code contains inline assembly.

Here's my problem: If the inline assembly uses a parallel subtract instruction (e.g., SSUB16) to set the GE flags for use by the SEL instruction, the optimizer removes the SSUB16.

It seems that the optimizer removes the SSUB16 because it doesn't see the register result being used, even though the GE results are indeed used.

Here's an example:
__inline int MAX16(int a1_a0, int b1_b0)
{
   register int maxVal16;
   __asm
   {
      ssub16  maxVal16, a1_a0, b1_b0
      sel     maxVal16, a1_a0, b1_b0
   };
   return(maxVal16);
}
int findMax(int a, int b )
{
   return (MAX16(a, b ));
}


The dump file has
    findMax
    $a
    .text
        0x00000000:    e6800fb1    ....    SEL      r0,r0,r1
        0x00000004:    e12fff1e    ../.    BX       lr
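
For clarity, here is a rough plain-C model of the result I expect from the SSUB16/SEL pair, assuming the usual ARMv6 semantics (SSUB16 sets each halfword's GE bits when that halfword's signed difference is >= 0, and SEL then picks bytes from the first or second operand according to those bits). The name max16_model is made up for this sketch and is not part of RVCT. With the SSUB16 removed, as in the dump above, the SEL reads whatever the GE flags happened to hold, so it no longer computes this.

```c
#include <stdint.h>

/* Illustrative model only (hypothetical name): per-halfword signed max,
 * which is what SSUB16 followed by SEL is meant to produce. */
static uint32_t max16_model(uint32_t a1_a0, uint32_t b1_b0)
{
    uint32_t result = 0;
    int lane;

    for (lane = 0; lane < 2; ++lane) {
        int shift = 16 * lane;
        /* Extract one signed 16-bit lane from each operand
         * (assumes the usual two's-complement conversion). */
        int16_t a = (int16_t)(uint16_t)(a1_a0 >> shift);
        int16_t b = (int16_t)(uint16_t)(b1_b0 >> shift);
        /* SSUB16: the GE bits for this lane are set when a - b >= 0. */
        int ge = ((int32_t)a - (int32_t)b) >= 0;
        /* SEL: take the lane from the first operand if GE is set,
         * otherwise from the second. */
        uint16_t chosen = (uint16_t)(ge ? a : b);
        result |= (uint32_t)chosen << shift;
    }
    return result;
}
```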



Is there a way to prevent this problem without adding cycles? I can work around it by storing the SSUB16 result through a volatile pointer, but that adds extra cycles.

Thanks in advance for any help.
  • Note: This was originally posted on 28th April 2009 at http://forums.arm.com

    Interesting! Looks like a compiler bug to me. Are you able to try a newer version of RVCT?

    You could work around the problem by using an embedded assembler function rather than inline assembler, although I don't think embedded assembler functions can be inlined. For example:
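    __asm int MAX16(int a1_a0, int b1_b0)
    {
    mov  r2, r0          ; keep a1_a0 in r2, since r0 is overwritten next
    ssub16  r0, r2, r1   ; halfword-wise a - b, sets the GE flags
    sel  r0, r2, r1      ; per the GE flags, pick bytes from a (r2) or b (r1)
    bx lr                ; return with the result in r0
    }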


    Also, GCC allows you to use the volatile keyword to indicate that an asm block should not be optimized away. I'm not sure whether RVCT supports that, but if it does, it would solve your problem. Try using "__asm volatile" in place of just "__asm" in your code.
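
    For comparison, this is roughly how the GCC form could look. It is only a sketch: it uses GCC extended asm syntax, not RVCT syntax, and the function name max16 is made up for illustration. It assumes the ACLE macro __ARM_FEATURE_SIMD32 is defined when SSUB16/SEL are available, and falls back to plain C elsewhere. Both instructions are kept in one asm statement, because as far as I know GCC does not track the GE flags as an operand, so nothing should be scheduled between setting and reading them.

```c
#include <stdint.h>

/* Hypothetical helper for illustration: per-halfword signed max. */
static inline uint32_t max16(uint32_t a1_a0, uint32_t b1_b0)
{
#if defined(__GNUC__) && defined(__ARM_FEATURE_SIMD32)
    uint32_t res;
    /* volatile: ask GCC not to delete this asm statement.
     * "=&r" (early clobber): res is written by SSUB16 before SEL
     * reads the inputs, so it must not share a register with them. */
    __asm__ volatile (
        "ssub16 %0, %1, %2\n\t"   /* sets the GE flags per halfword */
        "sel    %0, %1, %2"       /* picks bytes according to GE    */
        : "=&r" (res)
        : "r" (a1_a0), "r" (b1_b0));
    return res;
#else
    /* Portable fallback so the sketch also builds on non-ARM hosts. */
    uint32_t res = 0;
    int lane;
    for (lane = 0; lane < 2; ++lane) {
        int shift = 16 * lane;
        int16_t a = (int16_t)(uint16_t)(a1_a0 >> shift);
        int16_t b = (int16_t)(uint16_t)(b1_b0 >> shift);
        uint16_t chosen = (uint16_t)((a >= b) ? a : b);
        res |= (uint32_t)chosen << shift;
    }
    return res;
#endif
}
```

    Unlike the embedded assembler function above, a static inline function like this can still be inlined into its caller.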

    Thanks,
    Jacob