Co-authors: Andrea Pellegrini, Distinguished Engineer, Arm Infrastructure Line of Business, and Steve Demski, Marketing Manager, Hyperscale and HPC, Arm Infrastructure Line of Business.
One of the fundamental measures of goodness for a computer system is performance. This is often defined as the ratio between the amount of work completed by the machine and the time necessary to complete it. For server and networking systems, this is usually a measure of throughput: for example, “How many e-commerce transactions per second can a system handle?” This definition is often too simple for real-world scenarios, since systems are typically required to complete work within a certain latency. A more complete performance evaluation measures how much throughput a system achieves within the bounds of a service level agreement (SLA). For the sake of simplicity, here we ignore any latency constraints and focus only on throughput-based performance metrics, such as those produced by benchmarks like SPEC CPU2017 “rate”. For more details on Arm's use of SPEC CPU, see our companion blog, The how and why of SPEC CPU estimates for Arm Neoverse cores and reference designs.
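To make the distinction between raw throughput and throughput under an SLA concrete, here is a minimal sketch. The function names, the request latencies, and the 100 ms latency bound are all hypothetical values chosen for illustration; they are not taken from any Arm measurement or from SPEC CPU.

```python
def throughput(work_units: float, elapsed_seconds: float) -> float:
    """Work completed per unit of time, e.g. transactions per second."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return work_units / elapsed_seconds


def throughput_within_sla(latencies_s, elapsed_seconds, sla_seconds):
    """Throughput counting only requests that finished within the latency bound."""
    on_time = sum(1 for latency in latencies_s if latency <= sla_seconds)
    return throughput(on_time, elapsed_seconds)


# Hypothetical data: six requests observed over a 2-second window.
latencies = [0.05, 0.08, 0.12, 0.30, 0.07, 0.09]

raw = throughput(len(latencies), 2.0)                # all six requests count
good = throughput_within_sla(latencies, 2.0, 0.10)   # only four meet the bound

print(raw, good)
```

In this made-up sample the system completes 3 requests per second overall, but only 2 per second within the latency bound, which is why SLA-aware evaluations can rank systems differently from pure throughput benchmarks.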
Three metrics are paramount when we evaluate the performance of a computer system in a datacenter: performance per socket, performance per thread, and performance variability per thread.
With all flavors of cloud computing (public, private, hybrid) becoming the standard for IT services delivery, let us look at each of these three metrics in the context of available, modern CPUs.
With some exceptions, the most common type of cloud CPU uses a high core count in combination with Simultaneous Multi-Threading (SMT) and often some amount of “Turbo” capability. Depending on the legacy vendor, these CPUs can score well on per-socket performance or on per-thread performance, but they rarely perform well on both simultaneously. Additionally, performance per thread can vary widely, depending on “noisy neighbors”, simultaneous threads competing for core resources, and the inherent unpredictability of “Turbo” modes.
With the Arm Neoverse family of CPU platforms we believe we offer a better cloud computing solution – both to the cloud operators and to their customers.
These advantages are illustrated graphically in figure 1.
Figure 1: Performance projection of Arm Neoverse vs. traditional CPU architectures based on an industry-standard integer benchmark. Performance per socket is plotted along the X-axis, and performance per thread is plotted along the Y-axis. Designs that achieve the best scores on these two metrics land in the top-right portion of the graph. For the sake of simplicity, we do not show the third metric, performance variability per thread, on this graph.
By plotting performance per socket and performance per thread on an X-Y graph, we can compare our designs with competing parts within comparable silicon area and Thermal Design Power (TDP) envelopes.
When we look at the evolution of Arm Neoverse platforms in this light, it becomes clear that the Neoverse N1 platform is still a leader in performance per thread on typical cloud instances. Here we list our projections based on a simulated 64-core Neoverse N1 system, but higher core-count Neoverse N1 systems are available on the market and can push aggregate performance per socket further. The two new products we launch today, Neoverse V1 and Neoverse N2, provide two different ways to improve on both of these metrics, enabling Arm partners to extend their lead in the market.
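As a rough illustration of how two such coordinates could be derived and compared, the sketch below assumes that performance per thread is the aggregate per-socket score divided by the number of hardware threads, and treats a design as "top right" when no other design is at least as good on both axes. That definition, the class and function names, and all of the numbers are hypothetical; they are not Arm's projections or methodology.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SocketResult:
    name: str
    socket_score: float  # aggregate throughput score for one socket (X-axis)
    hw_threads: int      # hardware threads used to produce that score

    @property
    def per_thread(self) -> float:
        """Assumed definition: socket score split evenly across threads (Y-axis)."""
        return self.socket_score / self.hw_threads


def dominates(a: SocketResult, b: SocketResult) -> bool:
    """True if a is at least as good as b on both axes and better on one."""
    at_least_as_good = (a.socket_score >= b.socket_score
                        and a.per_thread >= b.per_thread)
    strictly_better = (a.socket_score > b.socket_score
                       or a.per_thread > b.per_thread)
    return at_least_as_good and strictly_better


def top_right(designs):
    """Designs that no other design beats on both axes at once."""
    return [d for d in designs
            if not any(dominates(other, d) for other in designs)]


# Hypothetical designs with made-up scores.
designs = [
    SocketResult("design-A", 300.0, 128),  # many SMT threads, low per-thread
    SocketResult("design-B", 200.0, 32),   # few threads, high per-thread
    SocketResult("design-C", 320.0, 64),   # strong on both axes
]

front = top_right(designs)
for d in front:
    print(d.name, d.socket_score, d.per_thread)
```

With these invented numbers, design-A is beaten on both axes by design-C, while design-B and design-C each win on one axis, so both remain on the frontier.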
Why is achieving both high performance per socket and high performance per thread important? For cloud operators, a higher core count means they can host more customers per system and amortize cost over more users. That is a dual benefit: more revenue, less cost. The cloud customer benefits as well, both from predictable, scalable performance (getting exactly what they pay for) and from the lower-cost economics of Arm Neoverse.
Today we are launching the Arm Neoverse V1 and Neoverse N2 platforms, and we expect to see Arm partner silicon in market by the end of this year. We are excited to see how Arm’s partners turn this innovation and performance into solutions built for the HPC, cloud, networking, edge, and 5G markets.
Learn more about Neoverse V1 and N2: https://www.arm.com/products/silicon-ip-cpu/neoverse