
What is the default FPU in ARMv8?

I have a couple of questions about the default FPU in ARMv8. In some places it is implied that the FPU and the SIMD engine are the same unit, meaning that the FPU and NEON share the same pipeline execution stage(s). In other places it is implied that NEON and VFP are different units, with NEON being the AdvSIMD engine and VFP the FPU. For example, the ARM website's page for the Cortex-A53 suggests (judging by the image) that they are separate units, and hints that VFPv4 is the floating-point unit.

Some people say that the AdvSIMD is the FPU for AArch64 and VFPv3 is the FPU for AArch32, but I have my doubts. 

For instance, if I run a FP instruction, like FADD:

FADD S0, S1, S2

In which unit will it be executed? VFP or NEON/AdvSIMD? 

I searched for several days without much luck, so I would be honored and grateful if someone could help me.

Parents
  • In some places it is implied that the FPU and the SIMD engine are the same unit, meaning that the FPU and NEON share the same pipeline execution stage(s). In other places it is implied that NEON and VFP are different units, with NEON being the AdvSIMD engine and VFP the FPU.

    Armv8-A is an architecture. An architecture describes the function of a processor (e.g. what instructions exist, what they do, how they are represented in memory). It does not describe the design of a processor (e.g. the pipeline); Arm refers to that as the micro-architecture.

    This matters because your question is about whether FPU and SIMD instructions are handled in one unit or in several. That's a micro-architecture question, not an architecture question. Any two Armv8-A compliant designs might make different design trade-offs. You could answer the question specifically for (say) the Cortex-A53, but the answer might be different for the Cortex-A57.

    For instance, if I run a FP instruction, like FADD:

    FADD S0, S1, S2

    In which unit will it be executed? VFP or NEON/AdvSIMD? 

    Your example instruction is a scalar 32-bit floating-point addition. To be a SIMD instruction it would need to use "v" registers instead of "s" registers.

    But that doesn't tell you exactly which pipe a given processor design will choose to execute it in.
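
    As an illustrative sketch (not part of the original reply), the scalar form from the question can be set next to an Advanced SIMD vector form of the same operation. The register numbers are arbitrary. In AArch64 both forms name the same register file: S0 is the low 32 bits of V0. Neither encoding says which execution pipe a particular core will use.

    ```
    // Scalar: one single-precision add. S0 = S1 + S2.
    FADD S0, S1, S2

    // Scalar: one double-precision add. D0 = D1 + D2.
    FADD D0, D1, D2

    // Vector (Advanced SIMD): four single-precision adds, one per 32-bit lane.
    FADD V0.4S, V1.4S, V2.4S
    ```

    The arrangement specifier (.4S here) is what marks the last instruction as a vector operation. For how a specific core issues these instructions, the place to look would be that core's own documentation rather than the architecture manual.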

    Can I ask, what problem are you trying to solve?
