In late 2019, Arm announced a new Total Compute strategy. This is a shift in our focus from product evolution to use-case-driven solutions, with the first ever Total Compute solutions launched this year. These solutions adopt a holistic system approach that ensures they can seamlessly and securely handle ever more complex and compute-intensive workloads and use cases by optimizing performance across the system. A major part of the success of the Total Compute approach is our latest System IP: the CoreLink CI-700 Coherent Interconnect and the CoreLink NI-700 Network-on-Chip Interconnect.
Arm is a market leader for Interconnect technologies with a strong track record of partner adoption over many years, across market segments from mobile and IoT to infrastructure and enterprise compute. Our Interconnect technologies are the backbone of any SoC system and crucial to delivering system performance improvements. This is reflected in the CoreLink CI-700 and CoreLink NI-700, which work seamlessly with Cortex-X and Cortex-A CPUs, Mali GPUs and Ethos NPUs to enable a series of system enhancements across the SoC.
The three key aims of the Total Compute strategy are enhanced compute performance, security and developer access to more performant software and tools. CoreLink CI-700 and CoreLink NI-700 provide benefits across all three areas. The system improvements provide low latency for enhanced compute performance and high memory bandwidth for more advanced use cases. Both also provide stronger security protections across the entire system through new security architectural features, such as the Memory Tagging Extension (MTE). Finally, advanced design and verification tooling enables faster, more flexible configuration, which shortens time to market for our partners.
The first component of our System IP offering is CoreLink CI-700. It is a configurable coherent Interconnect designed together with Armv9 Cortex processors and the latest Arm technologies to enable fully optimized Total Compute solutions. CoreLink CI-700 scales across the Total Compute solutions for the premium, performance and efficiency tiers. These solutions offer different levels of performance, efficiency and scalability to deliver specialized compute across multiple consumer device markets. This scalability means CoreLink CI-700 can support implementations ranging from low-power interconnects at 1GHz to high-performance interconnects at up to 2GHz in 5nm processes.
CoreLink CI-700 is designed to meet requirements from a wide range of different use cases and consumer devices, from High Dynamic Range (HDR) and high frame rate video on DTVs right through to AAA gaming on premium mobile devices. Compute-intensive applications are supported through CoreLink CI-700’s high-performance AMBA CHI mesh interconnect technology. This allows the coherent Interconnect to support one to eight coherency clusters over the AMBA CHI interface, which aligns with the new DynamIQ Shared Unit-110 (DSU-110) that binds together different Armv9 CPU cores within a CPU cluster.
Alongside performance, CoreLink CI-700 offers a fully coherent system level cache (SLC) that reduces external memory bandwidth and system power. Because fewer transactions go to external memory, both average memory latency and system power fall. The SLC is an exclusive cache, so its capacity adds to that of the caches in the Armv9 CPU clusters. Moreover, the SLC can be shared with GPUs and other accelerators. Supporting the SLC, Memory System Resource Partitioning and Monitoring (MPAM) enables control of how the SLC resources are allocated and increases predictability within the system.
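To illustrate why partitioning a shared cache increases predictability, here is a minimal Python sketch of one cache set whose ways are divided between partitions. It is a toy model under stated assumptions, not a description of how CoreLink CI-700 or MPAM is implemented: the class `PartitionedSet`, the partition names `"cpu"` and `"gpu"`, the way assignments and the round-robin replacement policy are all hypothetical.

```python
class PartitionedSet:
    """Toy model of one cache set whose ways are divided between partitions."""

    def __init__(self, num_ways, allowed_ways):
        self.lines = [None] * num_ways           # address currently held in each way
        self.allowed_ways = allowed_ways         # partition name -> list of way indices
        self.next_victim = {p: 0 for p in allowed_ways}

    def access(self, partition, addr):
        """Return True on a hit; on a miss, allocate into the partition's own ways."""
        if addr in self.lines:
            return True                          # any partition may hit on any way
        ways = self.allowed_ways[partition]
        way = ways[self.next_victim[partition] % len(ways)]  # round-robin victim
        self.next_victim[partition] += 1
        self.lines[way] = addr
        return False


def cpu_line_survives(allowed_ways):
    """Does a CPU line stay cached while a GPU streams through the same set?"""
    cache_set = PartitionedSet(4, allowed_ways)
    cache_set.access("cpu", "A")
    cache_set.access("cpu", "B")
    for i in range(100):                         # streaming traffic with no reuse
        cache_set.access("gpu", ("gpu", i))
    return cache_set.access("cpu", "A")


shared = {"cpu": [0, 1, 2, 3], "gpu": [0, 1, 2, 3]}
partitioned = {"cpu": [0, 1], "gpu": [2, 3]}
print(cpu_line_survives(shared), cpu_line_survives(partitioned))
```

In this sketch the streaming GPU traffic evicts the CPU line when all ways are shared, but cannot touch it once each requester allocates only into its own ways. That is the sense in which controlling allocation makes behavior more predictable.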
We designed and verified the complete Total Compute solution to see how it responded to key compute workloads. While benchmarking a configuration with an 8MB SLC, we saw a 28 percent reduction in the Arm Mali-G710 GPU’s external memory bandwidth. Meanwhile, on system power, we observed a 23 percent DDR power reduction and an 8 percent net power reduction using the SLC. This may reduce the cost of the memory system, as it could enable fewer memory channels or the use of lower speed grade (and so lower cost) DDR RAM. It may also reduce the power the memory system consumes.
A fundamental pillar of the Arm Total Compute strategy is security. This means incorporating security features that are designed to improve resilience to attacks and stop vulnerabilities at the source before they cause harm. As mentioned in this blog, Armv9 Cortex CPUs have adopted MTE technology, which makes detecting memory safety violations across the entire system far easier and more efficient.
Memory safety bugs are the single largest category of hacker attack vectors. Microsoft has stated that memory safety issues have been the cause of 70 percent of the CVEs (common vulnerabilities and exposures) that had been patched. Meanwhile, Google has announced that it is adopting Arm’s MTE in Android and in a blog post the Google Security team stated they have discovered close to 100 memory safety issues using the technique.
CoreLink CI-700 enhances performance when the Armv9 CPU cores use MTE, accelerating the technology and its benefits. The MTE tags are held and checked in the SLC along with the data. On external memory accesses they are split into separate data and tag transactions, with a tag cache per memory interface reducing external tag memory accesses. As a result, our silicon partners can address this class of bugs immediately, before they provide their SoCs to the OEMs who manufacture the devices. This provides significant cost and time savings.
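For readers new to memory tagging, the following Python sketch models the basic check that MTE performs: each 16-byte granule of memory carries a 4-bit allocation tag, each pointer carries a 4-bit logical tag in bits 59:56, and an access faults when the two differ. It is a software toy under stated assumptions and says nothing about how CoreLink CI-700 stores or caches tags; the class `TaggedMemory` and its methods are hypothetical names chosen for illustration.

```python
GRANULE = 16      # MTE assigns one tag to each 16-byte granule
TAG_SHIFT = 56    # the pointer's logical tag occupies bits 59:56
TAG_MASK = 0xF    # tags are 4 bits wide


class TaggedMemory:
    """Toy model: a byte array plus one allocation tag per granule."""

    def __init__(self, size):
        self.data = bytearray(size)
        self.tags = [0] * (size // GRANULE)

    def tag_region(self, addr, length, tag):
        """Tag every granule in [addr, addr + length) and return a tagged pointer."""
        first = addr // GRANULE
        last = (addr + length + GRANULE - 1) // GRANULE
        for granule in range(first, last):
            self.tags[granule] = tag & TAG_MASK
        return ((tag & TAG_MASK) << TAG_SHIFT) | addr

    def load(self, ptr):
        """Read one byte, faulting if the pointer tag and memory tag differ."""
        addr = ptr & ((1 << TAG_SHIFT) - 1)       # strip the top byte of the pointer
        ptr_tag = (ptr >> TAG_SHIFT) & TAG_MASK
        if self.tags[addr // GRANULE] != ptr_tag:
            raise MemoryError(f"tag check fault at address {addr:#x}")
        return self.data[addr]


mem = TaggedMemory(64)
buf = mem.tag_region(0, 16, tag=0x3)   # a 16-byte buffer tagged 3
mem.load(buf)                          # in bounds: tags match
try:
    mem.load(buf + 16)                 # one byte past the buffer: next granule has tag 0
except MemoryError as fault:
    print(fault)
```

The out-of-bounds read lands in a granule with a different tag, so the bug is caught at the faulting access rather than surfacing later as silent corruption.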
Beyond the device, MTE provides benefits to system software and applications, with the technology helping developers to find memory safety bugs. This quickens time-to-market, as developers can find memory-related bugs sooner in development and testing. MTE technology increases the stability and robustness of system software and applications not only in the lab but also in production devices in the field.
CoreLink NI-700 is a flexible packetized network-on-chip Interconnect for high-bandwidth accelerators, such as GPUs and NPUs, as well as rest-of-SoC connectivity. Packetization reduces wiring by 30 percent, easing physical design. The Network-on-Chip (NoC) Interconnect also adopts the latest Arm architecture features and AMBA interface standards. This improves performance, reliability, and virtualization. Moreover, the advanced tooling support enables faster design, configuration, and implementation of complex SoCs, improving system performance and reducing routing congestion and area.
CoreLink NI-700 is also highly configurable and scalable across different use cases and devices. It not only targets consumer and mobile devices, but can also be implemented across SoC solutions targeting markets ranging from premium IoT devices to Enterprise compute.
A new capability that CoreLink NI-700 introduces is Integrated Device Management (IDM). IDM detects a peripheral causing a timeout, isolates it from the rest of the system, and then stabilizes the system by completing the AMBA transaction (if incomplete). Finally, a software handler can recover by, for example, soft-resetting the device or powering it up if it was unpowered. This increases uptime by resolving issues without rebooting the entire device. This could significantly reduce how often a user needs to reboot their Wi-Fi router or set-top box, for example. CoreLink NI-700 also maintains the Quality of Service (QoS) features from previous Arm Interconnect products. These provide virtual channels for non-blocking arbitration and reduced wiring, as well as QoS regulators, which help achieve bandwidth and latency targets across the system.
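The IDM sequence described above (detect, isolate, complete, recover) can be sketched as a small Python model. This is an illustration under stated assumptions, not the CoreLink NI-700 programming interface: the classes `Peripheral` and `ManagedPort`, the `recover` handler, and the choice to complete the stalled transaction with an `"ERROR"` response are all hypothetical.

```python
class Peripheral:
    """Hypothetical device that may stop responding."""

    def __init__(self, name, responds=True):
        self.name = name
        self.responds = responds

    def access(self):
        # None models "no response arrived before the timeout expired".
        return "OK" if self.responds else None

    def soft_reset(self):
        self.responds = True


class ManagedPort:
    """Interconnect port that guards the rest of the system from one device."""

    def __init__(self, peripheral):
        self.peripheral = peripheral
        self.isolated = False

    def transact(self):
        if self.isolated:
            return "ERROR"            # requests to an isolated device end here
        response = self.peripheral.access()
        if response is None:          # step 1: timeout detected
            self.isolated = True      # step 2: isolate the device
            return "ERROR"            # step 3: complete the stalled transaction
        return response


def recover(port):
    """Step 4: software handler resets the device and lifts the isolation."""
    port.peripheral.soft_reset()
    port.isolated = False


port = ManagedPort(Peripheral("wifi", responds=False))
print(port.transact())   # ERROR: the requester gets a response instead of hanging
recover(port)
print(port.transact())   # OK: the device is back without a full reboot
```

The point of the model is that the requester always receives a response, so a single misbehaving peripheral cannot stall the rest of the system while software decides how to recover it.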
CoreLink CI-700 and CoreLink NI-700 support advanced design and verification tooling. These simplify the implementation and provide a quicker time-to-market for partners, as well as better results. The tools enable the quicker configuration of Arm IP within a system.
CoreLink NI-700 and CoreLink CI-700 are configured with the Socrates tool. Through this tool, engineers can select, configure, and build IP blocks, and then place them within a floorplan. This reduces congestion and spreads the NoC over the SoC to aid timing closure. Within the Socrates tool, NoC-SE is an AI-enabled NoC behavioral synthesis engine. It explores many potential NoC solutions and auto-generates an optimized NoC from the interface, traffic, clock domain and power domain specifications, as well as the floorplan. Finally, the AMBA Viz protocol visualization tool allows designers to easily visualize simulations; piece together and track complex coherent transaction sequences; view the Interconnect topology; measure latency and bandwidth; decode Interconnect packet fields; and measure clock cycles. This helps designers identify issues found during design verification.
Our Interconnect technologies are vital components of our new Total Compute Solutions. Both CoreLink CI-700 and CoreLink NI-700 are highly configurable IP designed to enable the very best solution performance. The performance improvements, together with the power and bandwidth reductions across the system, optimize key solution level use cases, such as AAA gaming. Moreover, the Interconnect technologies offer greater security protections by accelerating MTE in hardware, and come with comprehensive design and verification tooling to speed up the SoC implementation process. This creates a seamless system, with the market-proven Interconnect designed and validated together with the latest Armv9 CPU cores. In the future, we will continue to invest in the very best Interconnect technologies, bringing Arm’s Total Compute vision of seamless and secure performance for tomorrow’s compute to life.
Learn more about CoreLink CI-700
Learn more about CoreLink NI-700
Learn more about the Total Compute Solutions