
Armv8.7 extension FEAT_AFP: FPCR.NEP use case

Hi everyone,

I am working on implementing the Armv8.7 feature FEAT_AFP. While doing so I came across one of the bits it adds to the FPCR (Floating-point Control Register): the NEP bit, FPCR bit[2]. The documentation describes it as follows (link to a screenshot attached).

/resized-image/__size/640x480/__key/communityserver-discussions-components-files/468/fpcr.nep.png

For instance, take FMADD:

FMADD (three input scalar version) : Floating-point fused Multiply-Add (scalar)

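  • FMADD <Sd>, <Sn>, <Sm>, <Sa>
  • Sd = Sn * Sm + Sa
  • If FEAT_AFP is implemented:
    • FPCR.NEP=0: no effect (the elements of the destination register other than the lowest are set to zero)
    • FPCR.NEP=1: the elements of the destination register other than the lowest are taken from the register holding Sa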

Here, as per the documentation, the upper bits of the destination register are populated from the upper bits of the addend register. For example, with single precision (32 bits), the upper 128 - 32 = 96 bits of Vd are populated from the upper 96 bits of Va. I have verified this using the Trace32 debugger.
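
To make the behaviour described above concrete, here is a minimal Python sketch, not taken from the Arm documentation. It models a 128-bit V register as four 32-bit lanes (lane 0 is the lowest). The names `f32_bits`, `bits_f32` and `fmadd_s` are hypothetical helpers made up for this illustration, and lane 0 is computed with ordinary Python arithmetic, which only approximates a true fused multiply-add.

```python
import struct

def f32_bits(x):
    """Round a Python float to binary32 and return its 32-bit pattern."""
    return struct.unpack("<I", struct.pack("<f", x))[0]

def bits_f32(b):
    """Interpret a 32-bit pattern as a binary32 value."""
    return struct.unpack("<f", struct.pack("<I", b))[0]

def fmadd_s(vn, vm, va, nep):
    """Hypothetical model of FMADD <Sd>, <Sn>, <Sm>, <Sa>.

    Each register is a list of four 32-bit lane patterns, lowest lane first.
    Only lane 0 takes part in the arithmetic.
    """
    # Lane 0: the scalar result (approximation of the fused operation).
    low = f32_bits(bits_f32(vn[0]) * bits_f32(vm[0]) + bits_f32(va[0]))
    # Upper 96 bits: copied from Va when NEP=1, zeroed when NEP=0.
    upper = list(va[1:]) if nep else [0, 0, 0]
    return [low] + upper

vn = [f32_bits(2.0), 0xAAAAAAAA, 0xAAAAAAAA, 0xAAAAAAAA]
vm = [f32_bits(3.0), 0xBBBBBBBB, 0xBBBBBBBB, 0xBBBBBBBB]
va = [f32_bits(1.0), 0x11111111, 0x22222222, 0x33333333]

vd_nep0 = fmadd_s(vn, vm, va, nep=False)
vd_nep1 = fmadd_s(vn, vm, va, nep=True)
print([hex(lane) for lane in vd_nep0])
print([hex(lane) for lane in vd_nep1])
```

In this model the lowest lane is the same in both cases; only the upper three lanes differ, and with NEP=1 they are a bit-for-bit copy of the upper lanes of Va.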

But I cannot find any use case for this. Since we are copying the upper bits of the addend directly into the destination register, this does not increase precision in any way. Can anyone please explain the use case for this bit?

Thanks.

  • Since, we are directly populating the addend to destination register we cannot say we are increasing the precision in some way.

    That only means that the output elements other than the lowest behave as if they were calculated as Vd[e] = Vm[e] * Vn[e] + Va[e], with Vm[e] = Vn[e] = 0.0 for every element e above the lowest.

    I suppose the equivalent vector/SIMD operation is: output = <0,0,0,m0> * <0,0,0,n0> + <a3,a2,a1,a0>
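
A rough sketch of that vector view in Python, assuming ordinary finite values only (special cases such as signed zeros and NaNs are not modelled). `vector_fma` is a hypothetical helper written for this illustration, and the lists are ordered lowest element first, so `<a3,a2,a1,a0>` is written `[a0, a1, a2, a3]`.

```python
def vector_fma(n, m, a):
    """Element-wise n[e] * m[e] + a[e]; lists are ordered lowest element first."""
    return [x * y + z for x, y, z in zip(n, m, a)]

# Only the lowest element of the multiplicands carries data; the rest are 0.0.
n = [2.0, 0.0, 0.0, 0.0]      # <0, 0, 0, n0>
m = [3.0, 0.0, 0.0, 0.0]      # <0, 0, 0, m0>
a = [1.0, 10.0, 20.0, 30.0]   # <a3, a2, a1, a0>

out = vector_fma(n, m, a)
print(out)
```

The upper elements of `out` equal the upper elements of `a`, which matches the scalar FMADD result with FPCR.NEP=1 for these inputs.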
