How to generate a VCVT instruction with the #fbits param on CM4?

yaniv.sapir 1 month ago

I have a C function that takes a signed 1.31 fixed-point number (passed as int32_t) and converts it to float:

#include <stdint.h>
float Fixed_to_FP(int32_t x) {
    return ((float) (x) / 0x1p31f);
}

When compiling the code using armclang with the following build options:

-march=armv7+fp -mfloat-abi=hard -mfpu=fpv4-sp-d16 -std=c11 --target=arm-arm-none-eabi -O3 -ffast-math -mcpu=cortex-m4 -mthumb

the tool emits two instructions for the conversion operation:

Fixed_to_FP:
        vmov    s0, r0
        vldr    s2, .LCPI0_0
        vcvt.f32.s32    s0, s0      ; (1) Convert from int32_t to float
        vmul.f32        s0, s0, s2  ; (2) Scale down by 2^31
        bx      lr
.LCPI0_0:
        .long   0x30000000          ; = 2^-31
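As a sanity check on the literal-pool constant, here is a small host-side sketch that decodes 0x30000000 as an IEEE 754 single and compares it with 2^-31. The helper name bits_to_float is made up for illustration and is not part of the original code.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: reinterpret a 32-bit pattern as an IEEE 754 single. */
float bits_to_float(uint32_t bits) {
    float f;
    memcpy(&f, &bits, sizeof f);   /* well-defined type pun */
    return f;
}

/* The two-instruction sequence computes (float)x * 2^-31.
   Scaling by a power of two is exact here, so this matches
   the original division by 0x1p31f. */
float fixed_to_fp_mul(int32_t x) {
    return (float)x * bits_to_float(0x30000000u);
}
```

The biased exponent field of 0x30000000 is 0x60 = 96, and 96 - 127 = -31, so the vmul multiplies by 2^-31, which is the same as dividing by 2^31.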

However, the VCVT instruction has an optional #fbits operand for fixed-point conversions, which in this case could be used to eliminate the vmul instruction (and the constant load):

        vcvt.f32.s32    s0, s0, #31 ; Convert from S1.31 to float

How can I get the compiler to emit this efficient instruction?
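For reference, one possible workaround is to force the instruction with inline assembly. The following is only a minimal sketch, not a confirmed solution: it assumes the toolchain accepts GCC-style extended asm with the "t" constraint (single-precision VFP register), the function name fixed_to_fp_fbits is made up for illustration, and non-Arm builds fall back to plain C.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper name; not from the original code. */
static inline float fixed_to_fp_fbits(int32_t x) {
#if defined(__arm__) && defined(__ARM_FP)
    float f;
    memcpy(&f, &x, sizeof f);           /* raw S1.31 bits into an FP register */
    __asm__("vcvt.f32.s32 %0, %0, #31"  /* in-place fixed-point to float */
            : "+t"(f));                 /* "t" constraint is an assumption */
    return f;
#else
    return (float)x * 0x1p-31f;         /* portable fallback, same result */
#endif
}
```

The instruction converts in place, so source and destination must be the same register, which is why a single read-write operand is used. The drawback is that the compiler cannot constant-fold or schedule around the asm as freely as it could with plain C, so a way to get this from the compiler itself would still be preferable.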

Thanks!