Unlock the power of SVE and SME with SIMD Loops

September 19, 2025


Writing high-performance software for Arm often means diving deep into its SIMD technologies. Many developers know NEON, the fixed-width 128-bit vector extension, but Arm’s more recent SVE (Scalable Vector Extension) and SME (Scalable Matrix Extension) take things further.

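They are not just wider vectors. They introduce new concepts such as predication, scalable vectors, streaming modes, and matrix tiles. Predication lets a loop process its final partial vector without a separate scalar cleanup loop, and scalable vectors let the same code run on hardware with different vector lengths. However, with this flexibility comes complexity.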
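To make predication and scalable vectors concrete, here is a minimal sketch of a vector-length-agnostic loop. It is not taken from SIMD Loops: the function name `scaled_add` and the kernel (y[i] += a * x[i]) are chosen only for illustration. The SVE path is compiled only when the compiler defines `__ARM_FEATURE_SVE`; otherwise a plain C loop is used, so the same source builds on any machine.

```c
#include <stddef.h>
#include <stdint.h>

#if defined(__ARM_FEATURE_SVE)
#include <arm_sve.h>
#endif

/* Illustrative kernel (name is hypothetical): y[i] += a * x[i] for i in [0, n). */
void scaled_add(float a, const float *x, float *y, size_t n) {
#if defined(__ARM_FEATURE_SVE)
    /* svcntw() returns the number of 32-bit lanes per vector on this CPU,
       so the loop never hard-codes a vector width. */
    for (uint64_t i = 0; i < n; i += svcntw()) {
        /* Lane j of pg is active only while i + j < n, so the final
           partial vector needs no separate scalar tail loop. */
        svbool_t pg = svwhilelt_b32_u64(i, (uint64_t)n);
        svfloat32_t vx = svld1_f32(pg, x + i);
        svfloat32_t vy = svld1_f32(pg, y + i);
        vy = svmla_n_f32_m(pg, vy, vx, a); /* vy + vx * a on active lanes */
        svst1_f32(pg, y + i, vy);          /* inactive lanes are not written */
    }
#else
    /* Scalar fallback for builds without SVE enabled. */
    for (size_t i = 0; i < n; i++) {
        y[i] += a * x[i];
    }
#endif
}
```

The SVE loop body contains no vector width and no tail handling: the predicate produced by the while-less-than intrinsic covers both.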

That is where SIMD Loops steps in.

SIMD Loops is an open-source project designed to help developers learn SVE and SME through hands-on experimentation. It provides dozens of real-world loop kernels. Examples include matrix multiplication, vector reduction, sorting, and string processing. Each kernel is written in C, Arm intrinsics, and inline assembly.

Each loop is carefully annotated to showcase key architectural features in action. This lets you see exactly how instructions like fmopa (floating-point outer product, accumulated into a matrix tile) or fmla (floating-point fused multiply-add) work in practice.
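As a rough mental model of what an outer-product-accumulate instruction such as fmopa computes, the following scalar C sketch may help. It is not SME code and not taken from SIMD Loops: `DIM` is a hypothetical fixed tile size (on real hardware the tile dimensions follow the streaming vector length), predication is omitted, and both function names are invented for illustration.

```c
#include <stddef.h>

/* Hypothetical tile size for illustration only. */
#define DIM 4

/* Scalar model of one outer-product-accumulate step on a DIM x DIM
   row-major tile: tile[r][c] += a[r] * b[c]. */
void outer_product_accumulate(float tile[DIM * DIM],
                              const float a[DIM],
                              const float b[DIM]) {
    for (size_t r = 0; r < DIM; r++) {
        for (size_t c = 0; c < DIM; c++) {
            tile[r * DIM + c] += a[r] * b[c];
        }
    }
}

/* C = A * B for DIM x DIM row-major matrices, expressed as a sum of
   DIM outer products: column k of A times row k of B. */
void matmul_by_outer_products(const float A[DIM * DIM],
                              const float B[DIM * DIM],
                              float C[DIM * DIM]) {
    for (size_t i = 0; i < DIM * DIM; i++) {
        C[i] = 0.0f;
    }
    for (size_t k = 0; k < DIM; k++) {
        float col[DIM];
        for (size_t r = 0; r < DIM; r++) {
            col[r] = A[r * DIM + k]; /* gather column k of A */
        }
        outer_product_accumulate(C, col, &B[k * DIM]); /* row k of B */
    }
}
```

Each step updates every element of the tile from just two input vectors, which is why matrix multiplication maps naturally onto this operation.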

Unlike a recipe book, SIMD Loops does not just hand you solutions. It helps you understand the architecture itself. You will see how different vector instruction sets (for example, NEON, SVE, SME, SVE2, SME2.1) handle the same kernel, compare performance, and gain a foundation for writing your own high-performance code.
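To give a flavour of such a side-by-side comparison, here is a small sketch of one kernel, a sum reduction, written three ways in a single function. It is an illustration under assumptions rather than code from SIMD Loops: the name `sum_f32` is hypothetical, and the variant is chosen at compile time from the compiler's feature macros. The variants add elements in different orders, so for general floating-point inputs their results can differ in the last bits.

```c
#include <stddef.h>
#include <stdint.h>

#if defined(__ARM_FEATURE_SVE)
#include <arm_sve.h>
#elif defined(__ARM_NEON) && defined(__aarch64__)
#include <arm_neon.h>
#endif

/* Illustrative kernel (name is hypothetical): returns x[0] + ... + x[n-1]. */
float sum_f32(const float *x, size_t n) {
#if defined(__ARM_FEATURE_SVE)
    /* SVE: vector length unknown at compile time, tail handled by predicate. */
    svfloat32_t acc = svdup_n_f32(0.0f);
    for (uint64_t i = 0; i < n; i += svcntw()) {
        svbool_t pg = svwhilelt_b32_u64(i, (uint64_t)n);
        /* _m form: inactive lanes keep their previous accumulator value. */
        acc = svadd_f32_m(pg, acc, svld1_f32(pg, x + i));
    }
    return svaddv_f32(svptrue_b32(), acc); /* horizontal add of all lanes */
#elif defined(__ARM_NEON) && defined(__aarch64__)
    /* NEON: fixed four lanes per vector, plus an explicit scalar tail. */
    float32x4_t acc = vdupq_n_f32(0.0f);
    size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        acc = vaddq_f32(acc, vld1q_f32(x + i));
    }
    float total = vaddvq_f32(acc);
    for (; i < n; i++) {
        total += x[i];
    }
    return total;
#else
    /* Scalar reference. */
    float total = 0.0f;
    for (size_t i = 0; i < n; i++) {
        total += x[i];
    }
    return total;
#endif
}
```

The NEON variant needs a separate tail loop for the last few elements, while the SVE variant folds the tail into the predicate.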

Whether you are moving from NEON or starting fresh with SVE/SME, SIMD Loops offers a clear, practical pathway to mastering Arm’s most advanced SIMD technologies.

Ready to dive in?

Learn more and explore practical examples in our guided Learning Path.

SIMD Loops Learning Path

Vidya Praveen
