How to generate a VCVT instruction with the #fbits param on CM4?

yaniv.sapir 8 months ago

I have a C function that takes a signed 1.31 fixed-point number (passed as int32_t) and converts it to float:

#include <stdint.h>
float Fixed_to_FP(int32_t x) {
    return ((float) (x) / 0x1p31f);
}

When compiling the code using armclang with the following build options:

-march=armv7+fp -mfloat-abi=hard -mfpu=fpv4-sp-d16 -std=c11 --target=arm-arm-none-eabi -O3 -ffast-math -mcpu=cortex-m4 -mthumb

the tool emits two instructions for the conversion operation:

Fixed_to_FP:
        vmov    s0, r0
        vldr    s2, .LCPI0_0
        vcvt.f32.s32    s0, s0      ; (1) Convert from int32_t to float
        vmul.f32        s0, s0, s2  ; (2) Scale down by 2^31
        bx      lr
.LCPI0_0:
        .long   0x30000000          ; = 2^-31 as a single-precision float

However, the VCVT instruction has an optional #fbits operand, which in this case would make both the constant load (vldr) and the vmul unnecessary:

        vcvt.f32.s32    s0, s0, #31 ; Convert from S1.31 to float
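
For reference, here is a rough host-side model of what the #fbits form is expected to compute: the integer is treated as a fixed-point value with fbits fraction bits, so the result is x * 2^-fbits. The function name fixed_to_float_ref and the assumed range of 1 to 32 for fbits are illustrative only and are not part of any Arm API.

```c
#include <stdint.h>

/* Hypothetical reference model (not an Arm API): approximates the result of
 * "vcvt.f32.s32 Sd, Sd, #fbits", i.e. x * 2^-fbits.
 * Assumes fbits is in the range 1..32. */
float fixed_to_float_ref(int32_t x, unsigned fbits) {
    float scale = 1.0f;
    for (unsigned i = 0; i < fbits; ++i) {
        scale *= 0.5f;          /* halving a power of two is exact */
    }
    /* The int-to-float cast rounds once; scaling by a power of two
     * does not round again for these magnitudes. */
    return (float)x * scale;
}
```

With fbits = 31 this should give the same value as the vcvt + vmul pair above, for example 0.5f for an input of 1 << 30.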

How can I get the compiler to emit this efficient instruction?

Thanks!

Stephen Theobald 8 months ago

My name is Stephen and I work for Arm.

Sorry, armclang is unable to optimize that function in pure C code, but a workaround is to use inline assembly like this:

#include <stdint.h>
float Fixed_to_FP(int32_t x) {
  float f;
  __asm volatile ("vmov %[flt], %[integer]\n"
                  "vcvt.f32.s32 %[flt],%[flt],#31\n"
                  : [flt] "=w" (f) : [integer] "r" (x));
  return f;
}

when compiled with:

armclang -mfloat-abi=hard -mfpu=fpv4-sp-d16 -std=c11 --target=arm-arm-none-eabi -O3 -ffast-math -mcpu=cortex-m4 -mthumb

generates:

Fixed_to_FP:
    vmov    s0, r0
    vcvt.f32.s32    s0, s0, #31
    bx    lr

Some notes on the inline assembler:
You can't cast x to float in C and pass the result into the block, because the compiler would then emit its own vcvt for the cast.
=w declares f as an output held in a floating-point register. The block writes it only after the input has been read, so no early-clobber (&) modifier is needed.
r declares x as an input held in a core register.
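
If the same source also has to build for targets without an FPU, or on a host for unit tests, one possible arrangement is to guard the inline assembly and fall back to plain C. This is only a sketch: the function name Fixed_to_FP_guarded is made up for illustration, and the choice of __arm__ and __ARM_FP as guard macros is an assumption that should be checked against your toolchain.

```c
#include <stdint.h>

/* Hypothetical wrapper: uses the inline-assembly path on 32-bit Arm targets
 * that report hardware floating point, and plain C elsewhere.
 * The guard macros are an assumption, not taken from the thread. */
float Fixed_to_FP_guarded(int32_t x) {
#if defined(__arm__) && defined(__ARM_FP)
    float f;
    __asm volatile ("vmov %[flt], %[integer]\n\t"
                    "vcvt.f32.s32 %[flt], %[flt], #31\n\t"
                    : [flt] "=w" (f)
                    : [integer] "r" (x));
    return f;
#else
    /* Portable fallback: convert, then scale by 2^-31. */
    return (float)x * 0x1p-31f;
#endif
}
```

Both paths are intended to return the same value, so tests written against the fallback on a host should also describe the behaviour of the assembly path.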

Hope this helps

Stephen