Arm Community blogs
Servers and Cloud Computing blog

Redefining storage with Arm Cortex-R82 and Neoverse CMN-S3

John Xavier Lionel
September 30, 2025

The changing face of storage

In today’s digital economy, storage is about more than capacity. Modern applications such as cloud computing, AI/ML, edge analytics, and 5G services demand low latency, high throughput, security, and scalability. Storage devices must act not only as repositories but also as active participants in the data pipeline. This shift is redefining what storage architectures need to deliver.
Arm’s Cortex-R82 processor and Neoverse CMN-S3 interconnect are a powerful combination for building the next generation of storage systems. Together, they address deterministic control paths and the massive data movement challenges that define storage workloads today.

Cortex-R82 and Neoverse CMN-S3: The core of modern storage

At the heart of modern SSDs and storage devices is the controller. Increasingly, the interconnect fabric ties the controller to memory and accelerators. The Cortex-R82 delivers the performance and determinism storage demands. Neoverse CMN-S3 ensures that data can move seamlessly across the SoC.

Cortex-R82 brings:

  • 64-bit execution (Armv8-R AArch64), with support for up to 2 TB of memory for large flash arrays and caching DRAM.
  • Optional MMU for dual-mode operation: bare-metal for deterministic I/O, or Linux for computational storage workloads.
  • Advanced memory subsystem with private L1 caches, an optional shared L2 cache up to 4 MB, and tightly coupled memories (up to 1 MB each per core) for ultra-low latency paths.
  • Up to 8 cores per cluster, enabling SoC designers to build multi-core real-time compute islands.
  • Optional support for AMBA CHI, enabling Cortex-R82 clusters to connect directly into CMN-S3 fabrics for coherent, high-bandwidth communication.
  • Full ECC protection (SECDED/DED) across caches and memories for enterprise-grade reliability.
  • Trace and debug features for profiling, QoS enforcement, and system validation.

Cortex-R82
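
The limits quoted in the feature list above can be written down as a small configuration check. This is only an illustrative sketch: `R82ClusterConfig` and its field names are hypothetical and do not correspond to any Arm tool or API, and the code assumes the quoted "MB" figures mean binary megabytes.

```python
from dataclasses import dataclass
from typing import List

KIB = 1024
MIB = 1024 * KIB

# Limits as quoted in the feature list (assumed to be binary megabytes).
MAX_CORES_PER_CLUSTER = 8
MAX_L2_BYTES = 4 * MIB
MAX_TCM_BYTES = 1 * MIB


@dataclass(frozen=True)
class R82ClusterConfig:
    """Hypothetical description of one Cortex-R82 cluster (not an Arm API)."""

    cores: int
    l2_bytes: int = 0    # 0 means the optional shared L2 cache is left out
    tcm_bytes: int = 0   # size of each tightly coupled memory, per core
    mmu: bool = False    # optional MMU
    linux: bool = False  # whether the cluster is meant to run Linux

    def problems(self) -> List[str]:
        """Return a list of human-readable violations (empty if none)."""
        found = []
        if not 1 <= self.cores <= MAX_CORES_PER_CLUSTER:
            found.append("cores must be between 1 and 8 per cluster")
        if not 0 <= self.l2_bytes <= MAX_L2_BYTES:
            found.append("shared L2 must not exceed 4 MB")
        if not 0 <= self.tcm_bytes <= MAX_TCM_BYTES:
            found.append("each TCM must not exceed 1 MB")
        if self.linux and not self.mmu:
            found.append("running Linux requires the optional MMU")
        return found


if __name__ == "__main__":
    config = R82ClusterConfig(cores=8, l2_bytes=4 * MIB, tcm_bytes=MIB,
                              mmu=True, linux=True)
    print(config.problems() or "configuration is within the quoted limits")
```

A bare-metal configuration would simply leave `mmu` and `linux` at their defaults, which matches the dual-mode description in the list.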

CMN-S3 complements this by:

  • Delivering low-latency, high-bandwidth links between cores, accelerators, and memory controllers.
  • Maintaining system-wide coherency, so CPUs, encryption engines, and compression blocks can share data seamlessly.
  • Complying with the AMBA CHI protocol, which makes it the natural fabric for tying multiple Cortex-R82 clusters (up to 8 cores per cluster) into a single coherent system alongside accelerators, memory controllers, and I/O.
  • Supporting CHI-C2C and CXL, enabling memory expansion and pooling across servers.
  • Embedding RAS and security features, to ensure data integrity at hyperscale.

Together, Cortex-R82 and CMN-S3 provide both the deterministic control and the scalable data movement needed for modern storage architectures. These range from SSD controllers to high-bandwidth flash memory modules and multi-cluster storage SoCs.

Demonstrating the advantage

The impact of this combination is clear in both compute and memory benchmarks.

  • CPU efficiency: Cortex-R82 achieves 3.71 DMIPS/MHz and 6.28 CoreMark/MHz. This is around a 48% uplift in DMIPS and a 36% uplift in CoreMark compared to the previous-generation Cortex-R8. This uplift gives storage controllers headroom to manage complex I/O pipelines and support Linux-based services.
  • Memory throughput: On the STREAM 256KB benchmark, Cortex-R82 shows up to 4x higher sustained bandwidth across copy, scale, sum, and triad kernels. With CMN-S3 providing coherency and efficient data sharing, this bandwidth uplift accelerates data transfers between caches, flash, and host interfaces. These gains are critical for next-generation high-bandwidth flash modules.

STREAM 256KB benchmark showing Cortex-R82 performance

  • Latency-sensitive operations: LMbench results show strong memory copy, zeroing, and streaming performance, with bandwidths exceeding 12 GB/s. These results are particularly relevant for storage workloads like garbage collection, wear leveling, and metadata updates. In these cases, R82’s determinism and CMN-S3’s fabric efficiency work hand in hand.

Cortex-R82 LMbench score
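
To make the benchmark figures above easier to read, here is a minimal Python sketch of the four STREAM kernels and of the arithmetic behind the quoted uplifts. It is not the official STREAM source and says nothing about how Arm ran its measurements. The function names are illustrative, the array length assumes that "256KB" refers to the size of each array of 64-bit values, and the implied baseline is derived only from the rounded percentages in the text.

```python
BYTES_PER_ELEMENT = 8  # assumption: STREAM's usual 64-bit floating-point elements

# Number of arrays each kernel reads or writes once per pass.
ARRAYS_TOUCHED = {"copy": 2, "scale": 2, "sum": 3, "triad": 3}


def stream_copy(a, c):
    for i in range(len(a)):
        c[i] = a[i]


def stream_scale(b, c, q):
    for i in range(len(c)):
        b[i] = q * c[i]


def stream_sum(a, b, c):
    for i in range(len(a)):
        c[i] = a[i] + b[i]


def stream_triad(a, b, c, q):
    for i in range(len(b)):
        a[i] = b[i] + q * c[i]


def bytes_moved(kernel, n):
    """Bytes counted for one pass of a kernel over arrays of n elements."""
    return ARRAYS_TOUCHED[kernel] * n * BYTES_PER_ELEMENT


def bandwidth_gb_per_s(kernel, n, seconds):
    """Sustained bandwidth in GB/s for one timed pass."""
    return bytes_moved(kernel, n) / seconds / 1e9


def implied_baseline(new_score, uplift_percent):
    """Score the older core would need for the quoted uplift to hold."""
    return new_score / (1 + uplift_percent / 100)


if __name__ == "__main__":
    n = 256 * 1024 // BYTES_PER_ELEMENT  # elements per array (assumed sizing)
    print("triad bytes per pass:", bytes_moved("triad", n))
    print("implied DMIPS/MHz baseline:", round(implied_baseline(3.71, 48), 2))
    print("implied CoreMark/MHz baseline:", round(implied_baseline(6.28, 36), 2))
```

Under these assumptions, the quoted uplifts imply a previous-generation baseline of roughly 2.51 DMIPS/MHz and 4.62 CoreMark/MHz. Because the percentages in the text are approximate, these derived values are approximate as well.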

Why Cortex-R82 + CMN-S3 are perfect for storage

Individually, each IP block is powerful. Together, they address the two key challenges of control-plane determinism and data-plane scalability:

  1. SSD Controllers
    • Cortex-R82 provides predictable latency for NVMe command handling and wear-leveling.
    • CMN-S3 ensures inline accelerators for compression, deduplication, or encryption remain coherent with minimal overhead.
  2. Computational Storage Devices
    • Cortex-R82 runs real-time control tasks alongside Linux applications.
    • CMN-S3 links CPUs with AI/ML accelerators for near-data processing. This reduces data movement bottlenecks.
  3. Hyperscale & Distributed Storage Systems
    • Cortex-R82 delivers deterministic execution for control-plane tasks such as transaction processing, flash translation, and error management. These capabilities are essential for scaling storage across thousands of nodes.
    • CMN-S3 integrates multiple Cortex-R82 clusters (up to 8 cores per cluster) over the AMBA CHI protocol. It provides the coherent mesh backbone needed for hyperscale deployments.
    • Together, they enable multi-cluster, multi-die storage architectures. These support advanced services such as erasure coding, replication, tiered caching, and disaggregated storage.
  4. CXL Memory Pooling Devices
    • With CMN-S3’s CXL support, Arm partners are already building memory pooling solutions that allow memory to be treated as a shared resource across servers. This unlocks new levels of efficiency for storage architectures.
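
For readers less familiar with the control-plane tasks named in the list above, the following toy Python model shows the kind of bookkeeping that flash translation and wear leveling involve. It is a deliberately simplified sketch, not firmware from Arm or any SSD vendor: `TinyFTL` is a hypothetical name, mapping is done at whole-block granularity, and an overwritten block is erased immediately instead of being left for garbage collection.

```python
class TinyFTL:
    """Toy flash translation layer with wear-aware block allocation."""

    def __init__(self, num_blocks):
        self.erase_counts = [0] * num_blocks       # wear per physical block
        self.free_blocks = set(range(num_blocks))  # blocks ready to be written
        self.l2p = {}                              # logical -> physical block

    def write(self, logical):
        """Place a logical block on the least-worn free physical block."""
        if not self.free_blocks:
            raise RuntimeError("no free blocks; real firmware would "
                               "run garbage collection here")
        # Wear leveling: prefer the lowest erase count, then the lowest index.
        target = min(self.free_blocks,
                     key=lambda blk: (self.erase_counts[blk], blk))
        self.free_blocks.remove(target)
        previous = self.l2p.get(logical)
        self.l2p[logical] = target
        if previous is not None:
            self._erase(previous)  # the old copy is now stale
        return target

    def read(self, logical):
        """Translate a logical block number to its physical location."""
        return self.l2p[logical]

    def _erase(self, physical):
        self.erase_counts[physical] += 1
        self.free_blocks.add(physical)


if __name__ == "__main__":
    ftl = TinyFTL(num_blocks=3)
    placements = [ftl.write(0) for _ in range(4)]
    print("physical blocks used:", placements)
    print("erase counts:", ftl.erase_counts)
```

Rewriting the same logical block four times spreads the writes across all three physical blocks, so the erase counts stay even. Every write goes through this lookup and update, which is why predictable latency in the control path matters.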

Looking ahead

Storage is no longer passive. From NVMe SSDs to computational storage devices and datacentre storage nodes, the requirements are converging:

  • Deterministic performance to meet strict I/O SLAs.
  • Scalability to handle exponential data growth.
  • Security and reliability to ensure trust at scale.

The Arm Cortex-R82 processor and Neoverse CMN-S3 interconnect provide the building blocks for storage solutions that are predictable, scalable, efficient, and future-proof.
Arm enables partners to design differentiated controllers, multi-cluster high-bandwidth flash memory modules, CXL-enabled memory pooling devices, and advanced storage nodes. This powers the transformation of storage into an active, intelligent part of the compute fabric.

Partner perspective

One of our customers summarized it best:

“Kioxia embraced Cortex R82 and Neoverse CMN S3 for the prototype of a large-capacity, high-bandwidth flash memory module. They determined that Cortex R82, a 64-bit real-time processor, is optimal for the real-time processing required to control the large-capacity memory. In addition, the high bandwidth of CMN S3 is essential for achieving high-bandwidth memory. By implementing a controller with these IPs and prototyping a memory module, Kioxia successfully demonstrated that both a large capacity of 5TB and a high bandwidth of 64GB/s can be achieved.”
