I'm not seeing any flush-to-zero (FTZ) effects with NEON intrinsics on an ARM A9. Any advice?

Hi everyone,

As the title states, I've been unable to reproduce flush-to-zero (FTZ) behaviour using the NEON intrinsics provided in the 'arm_neon.h' header. For testing I'm using an iPhone 6 with an ARMv8-A dual-core ('Twister') CPU.

In the ARM Information Center, under Home > NEON Programming > Flush-to-zero mode in NEO, I see the statement that 'NEON always uses flush-to-zero mode'.

Which calculation is flushed to zero on NEON but is not on an IEEE 754-compliant floating-point processor?

So far I don't see any difference between IEEE 754 results and the results returned by ARM NEON intrinsic operations.

Here is an attempt I made:

    // Allocate some float buffers
    Float32 *inFloatsA = (Float32 *)malloc(sizeof(Float32) * 4);
    Float32 *inFloatsB = (Float32 *)malloc(sizeof(Float32) * 4);
    Float32 *outFloats = (Float32 *)malloc(sizeof(Float32) * 4);

    // Initialise input values
    for (int i = 0; i < 4; i++)
    {
        inFloatsA[i] = (Float32)-2e-125;
        inFloatsB[i] = (Float32)1e-100;
    }

    // Subtract inFloatsB from inFloatsA and store in outFloats
    float32x4_t neonFloatsBufferA = vld1q_f32(&inFloatsA[0]);
    float32x4_t neonFloatsBufferB = vld1q_f32(&inFloatsB[0]);
    float32x4_t result = vsubq_f32(neonFloatsBufferA, neonFloatsBufferB);
    vst1q_f32(&outFloats[0], result);

    // Calculate the expected IEEE 754 value
    Float32 expected = inFloatsA[0] - inFloatsB[0];

    // Test if the IEEE 754 value matches the NEON output
    if (expected != outFloats[0])
    {
        printf("Got a different value than IEEE 754!\n");
    }

Essentially, I never see the log message 'Got a different value than IEEE 754!'. Is there a set of input values that would trigger FTZ on NEON?

Am I wrong to assume that the scalar expression 'inFloatsA[0] - inFloatsB[0]' is evaluated according to the IEEE 754 standard?

Kind Regards,

David L