I have a C function that takes a signed 1.31 fixed-point number (passed as int32_t) and converts it to float:
    #include <stdint.h>

    float Fixed_to_FP(int32_t x)
    {
        return ((float) (x) / 0x1p31f);
    }
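To make the intended arithmetic concrete, here is a small host-side sketch of the same conversion with a few sample inputs. The sample constants and their names are hypothetical and only for illustration; the function body is the one from the question.

```c
#include <stdint.h>

/* Same conversion as in the question: divide the raw S1.31 value by 2^31. */
float Fixed_to_FP(int32_t x)
{
    return (float)x / 0x1p31f;
}

/* Hypothetical sample inputs (not part of the original code). */
static const int32_t q31_half      = 0x40000000; /* raw 2^30  -> 0.5  */
static const int32_t q31_quarter   = 0x20000000; /* raw 2^29  -> 0.25 */
static const int32_t q31_minus_one = INT32_MIN;  /* raw -2^31 -> -1.0 */
```

Each of these raw values is a power of two, so both the int-to-float conversion and the scaling are exact, and the results are exactly 0.5f, 0.25f and -1.0f.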
When compiling the code using armclang with the following build options:
-march=armv7+fp -mfloat-abi=hard -mfpu=fpv4-sp-d16 -std=c11 --target=arm-arm-none-eabi -O3 -ffast-math -mcpu=cortex-m4 -mthumb
the tool emits two instructions for the conversion operation:
    Fixed_to_FP:
            vmov            s0, r0
            vldr            s2, .LCPI0_0
            vcvt.f32.s32    s0, s0          ; (1) Convert from int32_t to float
            vmul.f32        s0, s0, s2      ; (2) Scale down by 2^31
            bx              lr
    .LCPI0_0:
            .long           0x30000000      ; = 2^-31 as a float
However, the VCVT instruction has an optional #fbits parameter, which in this case could be used to spare the vmul instruction:
    vcvt.f32.s32 s0, s0, #31 ; Convert from S1.31 to float
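As a rough model of why the single instruction should be equivalent, the fixed-point form can be thought of as "convert the integer, then scale by 2^-fbits". The sketch below is only my host-side model of that arithmetic, assuming round-to-nearest; the function names are hypothetical and it says nothing about how the compiler selects instructions.

```c
#include <stdint.h>

/* Hypothetical model of a fixed-point-to-float conversion with `fbits`
 * fraction bits: convert the integer, then scale by 2^-fbits.
 * Scaling by a power of two is exact here, so only the integer-to-float
 * conversion rounds. */
float fixed_to_float_model(int32_t x, int fbits)
{
    float scale = 1.0f;
    for (int i = 0; i < fbits; ++i) {
        scale *= 0.5f;              /* builds 2^-fbits exactly */
    }
    return (float)x * scale;
}

/* The form used in the question, for comparison. */
float fixed_to_float_div(int32_t x)
{
    return (float)x / 0x1p31f;
}
```

With fbits = 31 the two functions should agree for every int32_t input, which is why the separate vmul looks redundant.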
How can I get the compiler to emit this efficient instruction?
Thanks!
Thanks, Stephen!
Yes, I am trying to avoid inline assembly, as I am not sure this will be acceptable in our project.
One interesting thing to note: the problem is not that armclang cannot generate this code in general, but that it fails to do so specifically when building for the ARMv7 architecture. Apparently, when building for ARMv8, the optimizer emits the right instruction.
For an example, you can follow the link in the answer to my question in Stack Overflow:
stackoverflow.com/.../274579