Open standards: Enabling high performance infrastructure systems with CCIX

Authors: Kevin Yee, Cadence, and Jim Wallace, Arm

Unless you’re a hermit or have been living off the grid for the last few years, you can’t help but know about the explosion in big data: the increase in data size and networking bandwidth due to video streaming, social networking, IoT, and AI/ML applications. Datacenter operators and service providers are looking for new ways to increase compute resources and improve network performance and agility, and they are making trade-offs between compute in the cloud and compute at the edge. Today’s server solutions are dominated by a proprietary compute architecture, which means you are locked in.

For years, multiple attempts to supplant legacy systems with newer, lower-power architectures have met with little success. One of the biggest limitations was the lack of interoperability among these newer systems, as there were no open industry standards to rally behind.

To resolve this and to keep up with the real-time data processing needed in today’s hyperconnected world, a consortium called CCIX (Cache Coherent Interconnect for Accelerators) formed in 2017 to enable open systems, where accelerators from different suppliers can interoperate with each other. CCIX focuses on emerging acceleration applications, for example, machine learning, network processing, and storage off-load, all key to our connected edge future.

Figure 1: CCIX system

The driving factors for CCIX are:

  1. Faster chip-to-chip interconnect,
  2. Cache coherency for faster access to memory, and
  3. Heterogeneous multi-processor / accelerator multi-chip systems.

To help accelerate adoption, Arm, Cadence, Xilinx and TSMC joined forces in 2017 to announce the industry’s first CCIX test chip in TSMC 7nm FinFET – a leading technology in a leading process node. This test chip aims to provide a silicon proof point to demonstrate the capabilities of CCIX in enabling multi-core high-performance Arm CPUs working via a coherent CML fabric to off-chip accelerators.

Arm CoreLink CMN-600 coherent multichip links (CML)

CoreLink CMN-600 coherent multichip links (CML) extends the Arm CMN-600 interconnect fabric to support CCIX. This makes it possible to extend the high-frequency, non-blocking AMBA 5 CHI protocol messages across multiple SoCs, enabling system designers to attach more compute or acceleration with a shared virtual memory. Essentially, you are taking the Arm CMN-600 bus and coherency protocol, packaging it up in the CCIX protocol, and leveraging the PCIe infrastructure to extend that shared memory system across multiple chips.

Figure 2: CMN-600-CML backplane with CCIX interface

Figure 2 shows a block diagram of a typical infrastructure SoC, including the Cadence CCIX controller and PHY with Arm’s CoreLink CMN-600 Coherent Mesh Network interconnect, which provides the data conduit to all key interface IPs. At the edge of this interconnect mesh is a mesh cross point router (XP), which connects to a Cadence PCIe controller IP through the standard Arm AMBA AXI interface. The most common PCIe architecture today uses 16 PCIe Gen 4 lanes, each operating at 16Gbps. The CCIX controller builds on this and provides a high-performance coherent connection to a variety of accelerators, which can be connected at up to 25Gbps per lane.
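To put the two lane rates in perspective, the short Python sketch below estimates raw one-direction link bandwidth for a 16-lane link at each speed. It is a back-of-the-envelope calculation only. The function name, the choice of a 16-lane CCIX link, and the 128b/130b line-encoding factor are assumptions made for illustration, and the estimate ignores packet and protocol overhead, so real throughput would be lower.

```python
# Rough one-direction bandwidth estimate for a multi-lane serial link.
# Assumption: 128b/130b line encoding at both speeds.
# Packet and protocol overhead are ignored.
ENCODING_EFFICIENCY = 128 / 130


def link_bandwidth_gbytes(lanes: int, gbps_per_lane: float,
                          efficiency: float = ENCODING_EFFICIENCY) -> float:
    """Return estimated bandwidth in gigabytes per second, one direction."""
    raw_gbits = lanes * gbps_per_lane      # aggregate line rate in Gbit/s
    return raw_gbits * efficiency / 8      # drop encoding bits, bits -> bytes


# 16 lanes at the PCIe Gen 4 rate quoted in the text.
pcie_gen4 = link_bandwidth_gbytes(lanes=16, gbps_per_lane=16.0)

# 16 lanes at the 25Gbps CCIX rate (hypothetical link width for comparison).
ccix_25g = link_bandwidth_gbytes(lanes=16, gbps_per_lane=25.0)

print(f"PCIe Gen 4 x16: {pcie_gen4:.1f} GB/s")
print(f"CCIX 25G x16:   {ccix_25g:.1f} GB/s")
print(f"Speed-up:       {ccix_25g / pcie_gen4:.2f}x")
```

Under these assumptions the higher lane rate yields roughly 1.56 times the raw bandwidth for the same lane count, which is simply the ratio of 25 to 16.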

Cadence IP and Tools

The Cadence Controller IP for CCIX is based on a silicon-proven PCIe solution that is widely deployed in multiple products, and it supports the 25Gbps performance of CCIX. The Cadence 25Gbps Multi-Link and Multi-Protocol PHY IP, implemented in TSMC 7nm, is a high-performance SerDes that can operate from 1.25 to 25Gbps and is specifically designed for network and datacenter applications.

Verification

To validate these complex systems, Cadence collaborated with Arm to deliver a complete system-development and hardware/software-verification platform. By adopting tools such as the Cadence Perspec System Verifier and working at the right level of abstraction, SoC developers can meet the challenges of validating performance, function and power while avoiding the manual effort and time spent developing complex system-level, coverage-driven SoC tests.

Figure 3: Cadence Perspec System Verifier

Conclusion

CCIX provides the freedom you need to make architectural choices and trade-offs in high-performance infrastructure systems. Designers looking for an open, high-performance, cache-coherent chip-to-chip accelerator interface with a relatively easy migration path from today’s PCIe should consider CCIX for their next edge-to-cloud infrastructure SoC.

The Cadence and Arm collaboration in developing, testing and validating the CCIX IP ensures designers get:

  • Faster system integration with a pre-integrated PHY, controller, drivers, and verification IP
  • Reduced design risk with silicon-proven IP
  • Reduced time to market (TTM) with a complete Cadence-Arm verification solution

More details on CCIX and the consortium’s latest CCIX Base Specification 1.0 can be found below.

Visit the CCIX website for more information

Author biography: Kevin Yee is the technical business development director at Cadence, responsible for driving worldwide IP enablement to develop an SoC ecosystem with strategic partners. Prior to Cadence, Kevin held roles as VP of sales, marketing and business development in various IP and VIP companies. With more than 25 years in the semiconductor industry, he has served in a variety of senior management roles in R&D engineering, product planning, sales, marketing and business development in system, semiconductor, FPGA, IP/VIP and EDA companies. Kevin holds a Bachelor of Science in electrical engineering from the University of California.
