The amount of data that consumers are producing is increasing at a phenomenal rate, and shows no signs of slowing down anytime soon. Cisco estimated last year that global mobile data traffic would multiply tenfold between 2014 and 2019, up to a rate of 24.3 Exabyte per month. In order to support this continuing evolution, the infrastructure backbone of the cloud needs to stay ahead of the curve.
Cloud and server infrastructure needs to stay ahead of predicted usage trends
This is requiring large volume deployments of servers in the “cloud provider” space. Large Internet companies are building out datacenters at a scale that is unprecedented to manage all of this data. There is an insatiable appetite for more compute, with the caveat of it needing to be delivered at highest compute density within given server constraints to minimize Total Cost of Ownership (TCO). Datacenters are replacing servers on a shorter cycle as well as evaluating new installations more often because workflow demands are constantly changing. There is huge opportunity for server SoC vendors to innovate, with some aspects being critical to successfully building a server SoC:
The explosion in computing applications brings opportunity for tailored server SoCs
To help more ecosystem partners enter the market, ARM has designed processors and System IP blocks (e.g. CoreLink™ Cache Coherent Network, Memory Controllers) that can meet the performance and reliability expectations for the mainstream server market. This helps our partners to develop SoCs for their target applications, which in turn enables OEMs and Datacenter providers to get the right performance within budget. ARM has now taken this a step further in enabling our silicon partners to deliver a mainstream server SoC by developing and delivering a server subsystem.
A server subsystem is a collection of ARM IP (processors and System IP) that has been architected and integrated together along with all the glue logic necessary to lay down the foundation for a server class SoC. The subsystem configuration specifically targets mainstream requirements, which covers roughly 80% of the server market.The subsystem allows a partner to quickly go from Register Transfer Level (RTL) which is a high-level hardware description language used for defining digital circuits to silicon. Even with ARM delivering the soft IP blocks to our partners, it can still take multiple months to architect the SoC, integrate all the IP correctly together while meeting power and performance targets. In addition, the design then needs to be fully verified and validated prior to taping out. The ARM subsystem helps “short circuit” this process by delivering verified and validated top level RTL that has already integrated the various ARM IP blocks together in a mainstreamserver configuration. (Find out more about ARM’s system validation process in the whitepaper System Validation at ARM: Enabling our Partners to Build Better Systems). This can save our partners up to a year of effort. The silicon partner can take our subsystem and then add in the necessary Input/Output logic (e.g. PCIe, SATA, and USB) along with any of their own differentiating IP to complete the SoC design. By providing partners with the subsystem, it significantly reduces the effort of integrating, verifying and validating the IP together for this configuration thus reducing overall development time and allows silicon partners to focus resources on differentiation.
So, how does this server subsystem help our partners build a competitive server SoC with faster time to market? ARM has architected the server subsystem to provide enough CPU compute to allow partners to efficiently manage the majority of server workloads. The subsystem consists of up to 48 cores, 12 Cortex®-A72 processor clusters, each with four CPU cores, attached to the CoreLink CCN-512 Cache Coherent Network along with four server class memory controllers (CoreLink DMC-520). Other ARM System IP has been integrated in to perform specialized tasks within the subsystem for the kind of use cases expected. CoreLink NIC-450 Network Interconnect for low power, low latency rest of SoC interconnect for peripheral inputs such as PCIe CoreLink GIC-500 Generic Interrupt Controller performs critical tasks of interrupt management, prioritization and routing supporting virtualization and boosting processor efficiency. The real value of the subsystem lies in the fact that all of the IP has been pre-integrated and pre-validated with ARM engineering “know-how” of our IP to ensure predictable performance with much less engineering resource or time required. By taking a holistic view to system performance, the integration teams were able to make the whole subsystem greater than the sum of its parts.
The picture below shows a high level view of the subsystem.
In addition to the above, ARM provides a system analysis report along with the pre-configured and optimized RTL. The system analysis report gives the silicon partner data we collected on the performance, power, and area of the subsystem. It includes industry standard benchmark emulation results such as SPECCPU 2006, STREAM, and LMBench. Based on early analysis, expect this subsystem to scale to performance levels needed to win mainstream server deployments in large datacenters.
These benchmarks are key data points that an end customer buying a hardware platform based on the SoC leveraging the subsystem uses to decide what platform they will buy and deploy in their datacenter. It is critical that our silicon partners have a good understanding of performance expectations well before they have actual silicon they can test. The investment to develop server SoCs is high and reducing the likelihood of additional spins is key to time-to-market. In addition to the performance results, ARM also analyzes the power draw of the subsystem and includes this in the report. Also, ARM physical design team does preliminary floor planning and some timing constraint analysis for target process technology. In effect, it helps our partners understand die size and cost implications which ultimately ensure their design will meet customer’s expectations.
In addition to giving our partners a head start on hardware design and understanding PPA (Performance, Power, and Area) targets, the subsystem also comes with a reference software package. The subsystem has been built to adhere to industry server standards (e.g. UEFI, ACPI). The reference software includes ARM Trusted Firmware and UEFI source code ported to the subsystem, ACPI tables populated for the subsystem, and any patches needed to run latest Linux kernel along with release documentation and guide on how to use the software. In addition, a Fixed Virtual Platform (FVP) of the subsystem is available. The FVP is a fast, accurate platform built on fast models that helps our partner’s software development activities. The software developed for the subsystem isready-to-run on the FVP. Delivering this reference software stack along with the optimized RTL allows silicon partners to more rapidly develop the necessary software to allow booting an OS as soon as silicon arrives. On the hardware side, the subsystem also includes a technical reference manual that describes the various pieces of the subsystem in detail, implementation guide, and integration manual. All of this documentation is delivered along with the RTL to help our partners quickly understand the subsystem. This is critical in enabling SoC designers to get up to speed fast and devote as much time and resource as possible on differentiating their design through proprietary IP, customized software, or a mixture of both.
As I mentioned previously, the ever increasing data processing requirements that are occurring due to the continued electronics revolution have big implications for datacenters. It means that mainstream server SoCs are becoming increasingly complex every year. In addition, companies are replacing their server platforms at an unprecedented rate. This requires our silicon partners to deliver more capable SoCs faster. ARM has been enabling our partners with ARM processors and System IP that can be leveraged to deliver server SoCs. The ARM subsystem now takes this enabling activity a step further by giving our partners a fast start to their SoC development. By providing a pre-integrated, pre-verified foundation, it reduces the entry barriers for the ARM ecosystem to enter the changing server market and develop optimized SoCs for their target applications. For more information please contact me directly here via the comments below or private message and I’ll make sure to get back to you.