
VFP vector example

Note: This was originally posted on 5th January 2011 at http://forums.arm.com

Hi.

The Cortex documentation talks about register banks for vector usage.
That's great, but I don't really understand what a vector instruction (one that uses those banks) is.

Can anybody give me an example that uses the register banks and a VFP vector instruction?

Thanks
  • Note: This was originally posted on 7th January 2011 at http://forums.arm.com


    That is the older, deprecated "VFP vector" or "short vector" mode.  It's not supported (in hardware) on either the Cortex-A or the Cortex-R processors (as far as I know).



    I agree with you.
    But this is the only use I have found for the register banks.

    NEON does not use the register banks.
    Most NEON instructions can use all the NEON registers without any restriction.

  • Note: This was originally posted on 19th January 2011 at http://forums.arm.com


    I agree with you.
    But this is the only use I have found for the register banks.

    NEON does not use the register banks.
    Most NEON instructions can use all the NEON registers without any restriction.




    With the VFP architecture, the VFP registers were divided into 4 banks. So you had:

    Bank #0: S0-S7 and D0-D3
    Bank #1: S8-S15 and D4-D7
    Bank #2: S16-S23 and D8-D11
    Bank #3: S24-S31 and D12-D15

    When you set the VFP vector length (the "arity", held in the LEN field of the FPSCR) to greater than 1, Banks #1 through #3 were used for vector operations, while Bank #0 was reserved for scalar operations. That way, even if you had set the vector arity to, say, 4 and were performing operations on 4-vectors of 32-bit floating-point values, you could still use registers S0-S7 for scalar 32-bit floating-point operations without having to switch the VFP unit's vector arity back to 1.
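The bank rule above can be sketched as a small Python model. This is only an illustration, not ARM code: the names `bank_of` and `operation_kind` are made up for this example, and the "mixed" case (second source operand in Bank #0 while the destination is not) is not described in this post but is how I read the short-vector rules in the ARM ARM, so treat it as an assumption.

```python
BANK_SIZE = 8  # single-precision banks: s0-s7, s8-s15, s16-s23, s24-s31


def bank_of(sreg):
    """Return the bank number (0-3) that register s<sreg> belongs to."""
    if not 0 <= sreg <= 31:
        raise ValueError("single-precision registers are s0-s31")
    return sreg // BANK_SIZE


def operation_kind(sd, sm, vector_len):
    """Classify an instruction such as 'fadds sd, sn, sm'.

    sd is the destination register number, sm the second source operand,
    vector_len the current vector length (the "arity").
    """
    # Length 1, or a destination in Bank #0, means a plain scalar operation.
    if vector_len == 1 or bank_of(sd) == 0:
        return "scalar"
    # Assumption (from the ARM ARM, not from this post): a second source
    # operand in Bank #0 is a scalar applied to every element of the vector.
    if bank_of(sm) == 0:
        return "mixed"
    return "vector"
```

For example, with the arity set to 4, `operation_kind(16, 20, 4)` classifies `fadds s16, s14, s20` as a vector operation, while `operation_kind(0, 1, 4)` stays scalar because the destination is in Bank #0.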

    Of course, now that VFP short-vector mode is deprecated on ARMv7-based processors such as the Cortex-A series, NEON is the way to go. VFP instructions with the arity set above 1 will run much more slowly on ARMv7/NEON processors, so you should avoid them on those platforms and use the NEON pipeline instead.

    Edit: There was another interesting property of register addressing when used in vector operations that I had forgotten to mention. The subsequent registers comprising a vector would wrap around on the register bank boundaries. So if you issued the following instruction when the vector arity was set to 4:

    fadds s16, s14, s20

    The first vector operand starting at register s14 would wrap around so that it would be {s14, s15, s8, s9}. It would be the equivalent of:

    s16 = s14 + s20
    s17 = s15 + s21
    s18 = s8 + s22
    s19 = s9 + s23

    You could exploit this trick to perform shuffling of vector components without additional instructions.
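The wrap-around described above can be modelled with a short Python sketch. It is a simplified illustration under stated assumptions: stride 1, single precision, all three operands treated as vectors, and elements processed one after another. The function names and the `regs` list standing in for s0-s31 are hypothetical, not part of any ARM tool.

```python
BANK_SIZE = 8  # single-precision banks: s0-s7, s8-s15, s16-s23, s24-s31


def vector_registers(start, length):
    """Register numbers of a short vector starting at s<start>.

    Successive elements wrap around inside the bank of the first register.
    """
    bank_base = (start // BANK_SIZE) * BANK_SIZE
    offset = start % BANK_SIZE
    return [bank_base + (offset + i) % BANK_SIZE for i in range(length)]


def fadds(regs, sd, sn, sm, length):
    """Model 'fadds sd, sn, sm' as an all-vector operation on regs."""
    dst = vector_registers(sd, length)
    src_n = vector_registers(sn, length)
    src_m = vector_registers(sm, length)
    for d, n, m in zip(dst, src_n, src_m):
        regs[d] = regs[n] + regs[m]


if __name__ == "__main__":
    regs = [float(i) for i in range(32)]  # s<i> initially holds the value i
    print(vector_registers(14, 4))        # first operand of the example
    fadds(regs, 16, 14, 20, 4)
    print(regs[16:20])
```

With the vector length set to 4, `vector_registers(14, 4)` gives the register numbers 14, 15, 8 and 9, matching the {s14, s15, s8, s9} operand in the example above.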
  • Note: This was originally posted on 5th January 2011 at http://forums.arm.com

    In architecture v7-A processors that have the feature (for example Cortex-A5, -A8, -A9, -A15), you want to use the Advanced SIMD (also known as NEON) instructions, e.g. VLD1.16/VADD.I16/VST1.16.  They use a register bank where the registers are named D0-D31 (overlapped with Q0-Q15) that is separate from the integer registers R0-R14.

    You can find some examples in various threads in this forum, for example: http://forums.arm.co...th-arm-or-neon/.

    It's a bit confusing because the VFP instructions (which are older than NEON) use the same D0-D31* register bank and could originally do short vector operations.  But the short vectors were somewhat difficult to use and the feature was not used much, if at all.  In fact, the most recent implementations of the VFP instructions no longer perform short vector operations in hardware.

    * In ARM11 processors with VFP there are only D0-D15.
  • Note: This was originally posted on 6th January 2011 at http://forums.arm.com

    That is the older, deprecated "VFP vector" or "short vector" mode.  It's not supported (in hardware) on either the Cortex-A or the Cortex-R processors (as far as I know).

    Details are in the ARM ARM for v5 http://infocenter.arm.com/help/topic/com.arm.doc.ddi0100i/index.html or v7-A & -R http://infocenter.arm.com/help/topic/com.arm.doc.ddi0406b/index.html