At Hot Chips 2023, Arm is publicly announcing a new product offering – Arm Neoverse Compute Subsystems (CSS) – and its first product, Arm Neoverse CSS N2. When we introduced Arm Neoverse solutions in 2018, the market reacted immediately and positively to the Arm Neoverse value proposition: market-leading performance-per-watt CPU efficiency in a platform that can be customized with memory, I/O, or specialized silicon to address any workload or market need. Specialized silicon is key to addressing the unrelenting data demand placed on modern IT infrastructure.
Today, with the introduction of Arm Neoverse CSS, we tackle the specialized silicon challenge from a different angle. In our time engaging with Cloud-to-Edge customers, and more recently with silicon development teams targeting AI and ML solutions, we’ve heard a consistent theme. Customers love the scalable efficiency Arm Neoverse platforms deliver, but they don’t want to reinvent the wheel by repeating the work that comes with building a CPU compute subsystem: IP selection, system configuration, floorplanning, verification, validation, and third-party IP and fab integration.
This is why we have developed Arm Neoverse CSS. Neoverse CSS is delivered as a customizable compute subsystem that is configured, verified, validated and PPA-optimized by Arm. Arm performs the undifferentiated heavy lifting, enabling Arm partners to build specialized silicon at a lower cost, with less risk and a faster time-to-market. You can read more about the realized customer benefits of Neoverse CSS in this blog from Infrastructure Line of Business GM, Mohamed Awad.
CSS delivers significant customer value beyond existing licensing models.
Figure 1: The customer value of Arm Neoverse CSS compared to the IP licensing method
The remainder of this blog focuses on the technical aspects of Neoverse CSS N2, our first CSS product.
Neoverse CSS N2 leverages Arm Neoverse N2 platform IP. Core count is configurable at 24, 32, or up to 64 Neoverse N2 cores, with core frequencies from 2.1GHz up to 3.6GHz, implemented in an advanced 5nm process. Each N2 core supports Armv9 instructions for vector processing and ML, enhanced cryptography, memory partitioning and monitoring, and advanced power management. This lets CSS N2 address a range of market needs, from 5G on Arm to DPU to Cloud Computing on Arm and ML.
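The configuration limits quoted above can be written down as a small validity check. This is a minimal sketch, not an Arm tool or API: the class name `CssN2Config`, its fields, and the helper `is_valid` are hypothetical, and it assumes the three core counts listed are the only options and that any frequency inside the quoted range is acceptable.

```python
from dataclasses import dataclass

# Figures quoted in the text above; everything else is illustrative.
ALLOWED_CORE_COUNTS = (24, 32, 64)
MIN_FREQ_GHZ = 2.1
MAX_FREQ_GHZ = 3.6


@dataclass(frozen=True)
class CssN2Config:
    """Hypothetical description of one CSS N2 die configuration."""
    cores: int
    freq_ghz: float

    def __post_init__(self):
        # Assumption: only the three listed core counts are offered.
        if self.cores not in ALLOWED_CORE_COUNTS:
            raise ValueError(
                f"core count must be one of {ALLOWED_CORE_COUNTS}, got {self.cores}"
            )
        # Assumption: any frequency within the quoted range is allowed.
        if not (MIN_FREQ_GHZ <= self.freq_ghz <= MAX_FREQ_GHZ):
            raise ValueError(
                f"frequency must be within {MIN_FREQ_GHZ}-{MAX_FREQ_GHZ} GHz, "
                f"got {self.freq_ghz}"
            )


def is_valid(cores: int, freq_ghz: float) -> bool:
    """Return True if the pair passes the checks above."""
    try:
        CssN2Config(cores, freq_ghz)
    except ValueError:
        return False
    return True
```

For example, `is_valid(32, 3.0)` passes, while `is_valid(48, 3.0)` and `is_valid(64, 4.0)` are rejected because they fall outside the figures quoted in this post.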
CSS N2 supports the latest memory and I/O technologies. Customers can implement up to 8x 40b DDR5 or LPDDR5 channels per die, at speeds up to DDR5-5600. CSS N2 supports up to 4x x16 PCIe/CXL combo PHYs and controllers, each of which can be bifurcated four ways, down to 4x x4 links.
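To put those numbers in context, here is a rough back-of-the-envelope calculation of the theoretical peaks they imply. It is a sketch under stated assumptions rather than an Arm-published figure: the function names are hypothetical, and it assumes that each 40-bit channel carries 32 data bits plus 8 ECC bits, which this post does not state.

```python
# Figures quoted in the text above.
CHANNELS_PER_DIE = 8
TRANSFER_RATE_MT_S = 5600   # DDR5-5600
PHYS = 4
LANES_PER_PHY = 16          # x16 per combo PHY

# Assumption (not stated in the post): of the 40 bits per channel,
# 32 carry data and 8 carry ECC.
DATA_BITS_PER_CHANNEL = 32


def peak_memory_bandwidth_mb_s(channels=CHANNELS_PER_DIE,
                               data_bits=DATA_BITS_PER_CHANNEL,
                               rate_mt_s=TRANSFER_RATE_MT_S):
    """Theoretical peak data bandwidth per die in MB/s (decimal megabytes)."""
    bytes_per_transfer = data_bits // 8
    return channels * bytes_per_transfer * rate_mt_s


def total_lanes(phys=PHYS, lanes_per_phy=LANES_PER_PHY):
    """Total PCIe/CXL lanes across all combo PHYs."""
    return phys * lanes_per_phy


def max_x4_ports(phys=PHYS, lanes_per_phy=LANES_PER_PHY):
    """Number of x4 links if every x16 PHY is bifurcated four ways."""
    return phys * (lanes_per_phy // 4)


print(peak_memory_bandwidth_mb_s())  # 179200 MB/s, i.e. about 179.2 GB/s
print(total_lanes())                 # 64
print(max_x4_ports())                # 16
```

These are upper bounds on raw transfer capacity; sustained bandwidth in a real design depends on the memory controller, access pattern, and protocol overhead.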
Table 1: Summary of Arm Neoverse CSS N2 platform capabilities
To address a range of cloud-to-edge use cases, Neoverse CSS N2 includes a broad set of multi-core and multi-chip scaling capabilities. For a use case like scale-out cloud, where a high core count is desired, CSS N2 supports scaling of up to 256 cores across two sockets. High-speed chip-to-chip links, using UCIe or a partner-specific PHY, can link up to 128 cores in a single socket. Two sockets can then be coherently connected using CXL PHYs and SMP protocol. In both cases, AMBA CXS protocol is used to bridge the UCIe/CXL physical and data link layers into the AMBA CHI-based CMN-700 interconnect mesh.
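The scaling arithmetic works out as follows. This is an illustrative sketch only: the function `total_cores` is hypothetical, it assumes every die in a system uses the same core count, and it checks only the limits quoted in this post (24, 32, or 64 cores per die, 128 cores per socket, two sockets). Whether a particular number of dies per socket is actually supported is not stated here.

```python
# Limits quoted in this post.
ALLOWED_CORES_PER_DIE = (24, 32, 64)
MAX_CORES_PER_SOCKET = 128
MAX_SOCKETS = 2


def total_cores(cores_per_die: int, dies_per_socket: int, sockets: int) -> int:
    """Return the system core count, or raise if a quoted limit is exceeded.

    Assumption: all dies are identical and linked chip-to-chip within a socket.
    """
    if cores_per_die not in ALLOWED_CORES_PER_DIE:
        raise ValueError(f"unsupported cores per die: {cores_per_die}")
    if dies_per_socket < 1:
        raise ValueError("a socket needs at least one die")
    if not 1 <= sockets <= MAX_SOCKETS:
        raise ValueError(f"sockets must be between 1 and {MAX_SOCKETS}")

    cores_per_socket = cores_per_die * dies_per_socket
    if cores_per_socket > MAX_CORES_PER_SOCKET:
        raise ValueError(
            f"{cores_per_socket} cores exceeds the {MAX_CORES_PER_SOCKET}-core "
            "per-socket limit"
        )
    return cores_per_socket * sockets


# Two 64-core dies per socket, two sockets: the maximum quoted above.
print(total_cores(64, 2, 2))  # 256
```

Read this way, the 128-core single-socket figure corresponds to two 64-core dies linked chip-to-chip, and the 256-core figure to two such sockets connected coherently.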
To support the development of specialized silicon and heterogeneous compute, Neoverse CSS N2 provides options for both on-die and externally attached accelerators or other devices. On-chip accelerators can be incorporated using Arm’s NI-700 packetized network-on-chip interconnect with interrupt and address translation support. For off-chip acceleration, CSS N2 supports combo PCIe Gen5/CXL 1.1 PHYs, enabling attachment of GPUs, TPUs, DPUs and other high-speed devices. This includes support for CXL Type 3 connections, which are useful for memory expansion, pooling and tiering use cases.
Neoverse CSS N2 includes all of the compute subsystem elements our partners need to build specialized silicon. This includes system control and management, handled via embedded Cortex-M7 processors. The System Control Processor (SCP) is a trusted core controlling all system functions like clock control, and power and voltage domains. The Manageability Control Processor (MCP) interfaces with an external BMC for on-chip management, RAS, event logging, and communication alerts.
Finally, CSS N2 is SystemReady SR certified, and comes with a reference firmware stack and virtual fixed platform model. This allows partners to quickly develop platform firmware, integrate OS and services, and tune boot flows, security, and power management – all before taping out final silicon.
Figure 3: CSS N2 Block Diagram
CSS N2 has been in development for several years and we are excited to announce it to the world. We believe this new way of adopting fully designed, validated, and PPA-optimized Arm IP will deliver major commercial and technical advantages for our partners. Several have already taped out CSS N2-based silicon and are seeing amazing results. Relieved of the burden of developing the compute subsystem portion of their design, partners will be able to focus their resources on the differentiated, specialized computing that is required to meet today’s data challenges, while flattening the curve on power consumption and helping build a more sustainable infrastructure future.
Read the CSS N2 Newsroom Blog