Hi, why can't the VFP vector mode be used in Cortex-A series processors?

  • I know about NEON, but the VFP vector mode can process eight single-precision float operations in one instruction, more than NEON, so why was it removed?

    Also, today I configured the FPSCR register on a Cortex-A9 processor. The VFP vector mode works, but it is slower than issuing the operations one by one. Why does this happen?

  • Why do VFP instructions with the arity set above 1 perform much slower on ARMv7/NEON processors?

  • > so why removed it?

    In general the ARM instruction set has been getting simpler and simpler over time. This is for two reasons:

    1. Weird instructions and complex modes of operation are generally difficult to use when generating code in compilers, which forces developers to hand-code assembler to get best use out of them.
    2. In general it is more efficient in modern high-frequency cores to run multiple single-cycle instructions than to cope with the hardware complexities introduced by complex multi-cycle instructions.

    It's worth noting that the VFP vector mode was still scalar processing - i.e. the maths unit could just execute one vector entry every clock cycle, and multiple lanes took multiple cycles. NEON is therefore faster; i.e. a vec8 fp32 operation would take two NEON instructions, rather than one VFP vector strided instruction, but on most implementations would take two cycles rather than the eight the VFP operation would take. NEON also has vector data load support, which provides another efficiency improvement.

    HTH,
    Pete
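
The two alternatives to a vec8 VFP vector operation mentioned above can be sketched in C as follows. This is a minimal illustration, not code from the thread: the function names `add8_scalar` and `add8_neon` are hypothetical, and the NEON path assumes a compiler that provides `arm_neon.h` (it falls back to the scalar loop elsewhere so the sketch still builds).

```c
#include <assert.h>
#include <stddef.h>

#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>
#endif

/* Scalar alternative (hypothetical name): one fp32 add per element.
   On an ARMv7 VFP target a compiler would typically emit one scalar
   add instruction per iteration, with the vector length left at 1. */
static void add8_scalar(const float *a, const float *b, float *out)
{
    assert(a && b && out);
    for (size_t i = 0; i < 8; ++i)
        out[i] = a[i] + b[i];
}

/* NEON alternative (hypothetical name): two 4-lane adds cover the
   eight floats, matching the "two NEON instructions" in the post. */
static void add8_neon(const float *a, const float *b, float *out)
{
    assert(a && b && out);
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
    for (size_t i = 0; i < 8; i += 4) {
        float32x4_t va = vld1q_f32(a + i);      /* vector load, 4 lanes */
        float32x4_t vb = vld1q_f32(b + i);
        vst1q_f32(out + i, vaddq_f32(va, vb));  /* 4 adds in one op */
    }
#else
    /* No NEON available: fall back so the sketch still compiles. */
    add8_scalar(a, b, out);
#endif
}
```

Both functions produce the same results; which one is faster depends on the core, so it is worth measuring on the actual target.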

  • Support for arity > 1 is deprecated, and provided only for backwards compatibility with legacy code bases. It is known to be relatively slow. Either use the new NEON instructions, or write scalar VFP sequences with arity=1.
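
When auditing legacy code, it can help to see what a given FPSCR value asks for. The sketch below assumes that "arity" here means the vector length held in FPSCR.LEN (bits [18:16], encoded as length minus one), with the stride in FPSCR.STRIDE (bits [21:20]). The type and function names are hypothetical, and the code only decodes a value passed in; it does not read or write the real register.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper type: short-vector settings decoded from FPSCR. */
struct vfp_vector_cfg {
    unsigned length;  /* elements per instruction: LEN + 1, so 1..8 */
    unsigned stride;  /* 1 or 2; 0 marks a reserved STRIDE encoding */
};

/* Decode the LEN and STRIDE fields of an FPSCR value. */
static struct vfp_vector_cfg decode_fpscr_vector(uint32_t fpscr)
{
    struct vfp_vector_cfg cfg;
    uint32_t len = (fpscr >> 16) & 0x7u;     /* FPSCR.LEN, bits [18:16] */
    uint32_t stride = (fpscr >> 20) & 0x3u;  /* FPSCR.STRIDE, bits [21:20] */

    cfg.length = len + 1u;
    /* 0b00 means stride 1, 0b11 means stride 2, others are reserved. */
    cfg.stride = (stride == 0u) ? 1u : (stride == 3u) ? 2u : 0u;
    assert(cfg.length >= 1u && cfg.length <= 8u);
    return cfg;
}

/* Return the value with LEN and STRIDE cleared, i.e. scalar operation. */
static uint32_t fpscr_force_scalar(uint32_t fpscr)
{
    return fpscr & ~((0x7u << 16) | (0x3u << 20));
}
```

A decoded length of 1 corresponds to the arity=1 scalar sequences recommended above.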

  • The short vector mode for VFP was deprecated in the ARMv7 architecture, so Cortex-A (and other ARMv7) processors are not designed to run it quickly, and some don't implement it in hardware at all. However, note that you gain the NEON SIMD instruction set, which is a true vector processing ISA.

    Cheers,
    Pete

  • I still don't understand why VFP vector mode became slower on Cortex-A processors. Can you explain it in more detail?

  • There isn't really any more detail that will be useful; it's deprecated, slower on ARMv7, and totally removed in ARMv8.  There are at least two fast non-deprecated alternatives; write non-vector VFP code with multiple instructions or use NEON.

    HTH,
    Pete