Hi everyone,
As the title states - I've had issues reproducing flush-to-zero (FTZ) using the NEON intrinsics provided in the 'arm_neon.h' header. For test purposes I'm using an iPhone 6 with an ARMv8-A dual-core ('Twister') CPU.
In the ARM information center under Home>Neon Programming>Flush-to-zero mode in NEO (ARM Information Center) I see that 'NEON always uses flush-to-zero mode'.
Which calculation is flushed to zero (FTZ) on NEON but not on an IEEE 754-compliant floating-point unit?
As yet I don't see any difference between IEEE 754 and the results returned by ARM NEON intrinsic operations.
Here is an attempt I made:
// Allocate some float buffers
Float32 *inFloatsA = (Float32*)malloc(sizeof(Float32)*4);
Float32 *inFloatsB = (Float32*)malloc(sizeof(Float32)*4);
Float32 *outFloats = (Float32*)malloc(sizeof(Float32)*4);

// Initialise input values
for (int i = 0; i < 4; i++) {
    inFloatsA[i] = (Float32)-2e-125;
    inFloatsB[i] = (Float32)1e-100;
}

// Subtract inFloatsB from inFloatsA and store in outFloats
float32x4_t neonFloatsBufferA = vld1q_f32(&inFloatsA[0]);
float32x4_t neonFloatsBufferB = vld1q_f32(&inFloatsB[0]);
float32x4_t result = vsubq_f32(neonFloatsBufferA, neonFloatsBufferB);
vst1q_f32(&outFloats[0], result);

// Calculate the expected IEEE 754 value
Float32 expected = inFloatsA[0] - inFloatsB[0];

// Test if the IEEE 754 value matches the NEON output
if (expected != outFloats[0]) {
    printf("Got a different value than IEEE 754!\n");
}
Essentially I never see the log 'Got a different value than IEEE 754!'. Is there a set of initial input values that would create a FTZ effect on NEON?
Am I incorrect in thinking 'inFloatsA[0] - inFloatsB[0]' will use the IEEE 754 standard?
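One thing worth checking about the inputs above: both -2e-125 and 1e-100 lie far below the smallest single-precision subnormal (about 1.4e-45), so the casts to Float32 should already yield zero before any NEON instruction runs, and the subtraction never sees a denormal. Below is a minimal sketch in portable C for inspecting candidate inputs, assuming the standard fpclassify macro from <math.h>. The helper names are hypothetical and not part of the test above.

```c
#include <float.h>
#include <math.h>
#include <stdbool.h>

/* Hypothetical helper: true when x is a subnormal (denormal) float. */
static bool is_subnormal_f32(float x) {
    return fpclassify(x) == FP_SUBNORMAL;
}

/* Hypothetical helper: true when a non-zero double becomes (+/-)0
 * once it is converted to single precision, i.e. it is too small
 * to be represented even as a float subnormal. */
static bool underflows_to_zero_f32(double d) {
    return d != 0.0 && (float)d == 0.0f;
}
```

With these helpers, underflows_to_zero_f32(-2e-125) and underflows_to_zero_f32(1e-100) should both hold, whereas a value such as 1e-40f or FLT_MIN / 2.0f is a genuine subnormal and looks like a more promising input for provoking flush-to-zero.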
Kind Regards,
David L
David,
In A64 (unlike A32), Advanced SIMD (NEON) supports both flush-to-zero (FTZ) and full denormal operation, controlled by the same bit that determines the regular FP operation mode (the FZ bit in the FPCR).
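To see which mode a given thread is in, one option is to read the floating-point control register and test the FZ bit (bit 24 of the FPCR on AArch64). The following is only a sketch under that assumption: the helper names are hypothetical, and the register accessors use GCC/Clang inline assembly that is compiled only when targeting AArch64.

```c
#include <stdbool.h>
#include <stdint.h>

/* FZ (flush-to-zero) is bit 24 of the AArch64 FPCR. */
#define FPCR_FZ_BIT (UINT64_C(1) << 24)

/* Hypothetical helper: true when the FZ bit is set in an FPCR value. */
static bool fz_bit_set(uint64_t fpcr) {
    return (fpcr & FPCR_FZ_BIT) != 0;
}

/* Hypothetical helper: return the FPCR value with FZ set or cleared. */
static uint64_t fpcr_with_fz(uint64_t fpcr, bool enable) {
    return enable ? (fpcr | FPCR_FZ_BIT) : (fpcr & ~FPCR_FZ_BIT);
}

#if defined(__aarch64__)
/* Read the current FPCR (AArch64 only, GCC/Clang inline assembly). */
static uint64_t read_fpcr(void) {
    uint64_t value;
    __asm__ volatile("mrs %0, fpcr" : "=r"(value));
    return value;
}

/* Write a new FPCR value (AArch64 only, GCC/Clang inline assembly). */
static void write_fpcr(uint64_t value) {
    __asm__ volatile("msr fpcr, %0" : : "r"(value));
}
#endif
```

On an AArch64 target, calling write_fpcr(fpcr_with_fz(read_fpcr(), true)) before the NEON subtraction, and again with false afterwards, would be one way to compare the two modes on subnormal inputs. Because scalar and NEON arithmetic share this bit in A64, both should change behaviour together.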
Simon.