NEON-Advanced SIMD vs. SIMD

Hello,

I’m new to the ARM architecture and am trying to get a better understanding of how it works, most notably the Cortex-A series and its DSP functionality.

ARM’s website often refers to “NEON-Advanced SIMD”, “NEON”, and “SIMD” capabilities. Looking at the ARM processor architectures, it appears that SIMD applies to the ARMv6 architecture and NEON-Advanced SIMD applies to the ARMv7 architecture. From what I understand, NEON is an improved version of the SIMD instructions used in ARMv6. If that is true, does “NEON-Advanced SIMD” include all the features of the earlier ARMv6 SIMD, such as the SIMD and DSP extensions? Or is it a separate entity in its own right?

Regards,

Kenrick

  • If you search Google for 'ARM A7 pipeline' and click the Images tab, for instance, you'll see that NEON and floating point are handled by the same pipeline. The NEON registers are the same as the floating-point registers.

    The ARMv7 architecture only specifies how things should appear to the outside world. It doesn't say how they will actually be implemented. It makes sense to separate NEON and floating point from the arithmetic on the general registers. However, if a designer made a good case for handling, say, division in a single shared unit in a particular processor design, then that is very possibly what would be done.
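
To make the point about shared registers a bit more concrete, here is a rough toy model in Python, not a description of any real core. It assumes an ARMv7 part with the full 32 doubleword registers (D0-D31), where S0-S31 are single-precision views onto D0-D15 and each quadword register Qn overlays D(2n) and D(2n+1). The class and method names (`VfpNeonRegisterFile`, `write_s`, `read_q_f32` and so on) are made up for illustration only.

```python
import struct


class VfpNeonRegisterFile:
    """Toy model of the register bank shared by floating point and NEON.

    Assumption: 32 x 64-bit D registers, little-endian lane order.
    This models only the programmer's view, not how hardware stores it.
    """

    def __init__(self):
        # D0-D31 as one flat 256-byte buffer
        self.bank = bytearray(32 * 8)

    def write_s(self, n, value):
        # S0-S31 are 32-bit views onto the first 128 bytes (D0-D15)
        if not 0 <= n < 32:
            raise IndexError("S register index out of range")
        struct.pack_into("<f", self.bank, 4 * n, value)

    def read_s(self, n):
        if not 0 <= n < 32:
            raise IndexError("S register index out of range")
        return struct.unpack_from("<f", self.bank, 4 * n)[0]

    def read_d_bytes(self, n):
        # Raw 8 bytes of Dn; S(2n) is the low half, S(2n+1) the high half
        if not 0 <= n < 32:
            raise IndexError("D register index out of range")
        return bytes(self.bank[8 * n:8 * n + 8])

    def read_q_f32(self, n):
        # Qn overlays D(2n) and D(2n+1); viewed here as four float32 lanes
        if not 0 <= n < 16:
            raise IndexError("Q register index out of range")
        return struct.unpack_from("<4f", self.bank, 16 * n)


if __name__ == "__main__":
    rf = VfpNeonRegisterFile()
    # Four scalar floating-point writes...
    for i, v in enumerate([1.0, 2.0, 3.0, 4.0]):
        rf.write_s(i, v)
    # ...are visible as the four lanes of one NEON quadword register.
    print(rf.read_q_f32(0))
```

Writing S0 to S3 and then reading Q0 returns the same four values, which is all "the registers are the same" means at the architectural level. How a given core arranges the storage and pipelines behind that view is up to the designer.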
