A key trend in modern data centers is the adoption of software-defined storage, such as the open-source Ceph platform. The open-source software community was an early adopter in moving workloads to Arm Neoverse. Users of all types of applications are seeing performance benefits and cost savings by switching to Arm-based platforms, such as Lenovo platforms based on Ampere Computing CPUs.
Delivering better-than-x86 performance at lower TCO is a value proposition that we set out to establish with Arm Neoverse. Our news today is no exception. Last November, a group within SUSE submitted the first Ceph-based result for a storage and metadata benchmark called the IO500 10 Node Challenge, achieving a score of 12.43 using a Xeon Gold 6142-based cluster. Through a six-way collaboration between Arm, Ampere, the same group at SUSE, Mellanox (Nvidia), Micron, and Broadcom, we thought we could do better. And today we're excited to announce that an Arm-based cluster, using Ampere eMAG CPUs, achieved a Ceph-based score of 15.61 on the IO500 10 Node Challenge benchmark, while consuming far less power and at a considerably lower price.
If you are new to Ceph, here’s some background. Ceph enables deployment of distributed storage systems that are designed for scalability, reliability, and performance. A Ceph cluster can be run on commodity servers over a common network like Ethernet. Ceph clusters scale up well to thousands of servers and into the petabyte range.
Although achieving this result was a group effort, Arm contributed multiple years of incremental improvements to Ceph and related open-source software projects. These contributions include:
• 100+ upstreamed patches to improve the Ceph storage ecosystem on Arm servers, spanning multiple open-source communities including Ceph, Ceph-CSI, SPDK, DPDK, ISA-L, and OpenStack.
• Boosted Ceph performance on Arm through optimizations in common routines such as string handling, dcache hashing, and CRC32.
• Added 64KB kernel page support to Ceph. This support is unique to Arm and enhanced SPDK integration, delivering a considerable performance uplift.
The IO500 benchmark was established in 2017 to complement the TOP500 benchmark (recently topped by the Arm-based Fugaku supercomputer) but with a focus on storage subsystem performance. While the IO500 test aims for maximum performance from an unbounded number of clients and servers, the 10 Node Challenge limits the number of client nodes to ten. This challenge focuses on achieving the best storage throughput and metadata performance from a smaller set of systems. It also demonstrates that, if the performance of Ceph on Arm is good enough for HPC workloads, it should also be suitable for a large portion of the enterprise storage market.
Ceph may not be the first name you think of when it comes to high-performance computing storage filesystems. However, Ceph is seeing broader consideration and adoption not only in HPC but also in media, telecommunications, cloud computing, and elsewhere. Similarly, Ampere Computing may not be a household name (yet), but on the IO500 10 Node Challenge, Ampere Computing’s eMAG CPU has shown that it can offer more performance on a Ceph-based cluster (see Figure 1) while offering significant CapEx savings (see Figure 2) over last November's Xeon-based alternative (see note 1).
The test cluster setup that we used for this benchmark includes:
• Memory and NVMe-based SSDs from Micron
• An NVMe storage controller from Broadcom
• Dual 100GbE networking from Mellanox (Nvidia)
We chose NVMe-based storage for this test because that is what more and more customers are choosing. Although 100GbE networking might still be on the leading edge, there is broad acknowledgement that faster networking will be required to keep up with an ever-increasing deluge of data.
So what did we learn? First, out-of-the-box Ceph runs well on the Ampere eMAG CPU, scoring about 26% higher than the Intel Xeon Gold 6142 comparison cluster (15.61 versus 12.43). It also consumes far less power under test: the Arm-based cluster drew, at most, 152 W per server, which is more than 50% lower than the 310 W that SUSE observed on the Xeon-based servers. This is important for storage environments because lower power draw means less heat, and reduced ambient temperatures can greatly improve the reliability of HDD- and SSD-based storage devices. And have we mentioned the potential 40% CapEx savings?
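The performance and power percentages follow directly from the figures quoted in this post. As a minimal sketch that uses only those published numbers (the helper name `percent_change` is ours, chosen for illustration), the arithmetic can be checked like this:

```python
# Figures quoted in this post; nothing here is newly measured.
XEON_SCORE = 12.43       # IO500 10 Node Challenge score, Xeon Gold 6142 cluster
EMAG_SCORE = 15.61       # IO500 10 Node Challenge score, Ampere eMAG cluster
XEON_PEAK_WATTS = 310    # peak per-server power observed by SUSE
EMAG_PEAK_WATTS = 152    # peak per-server power on the Arm-based cluster


def percent_change(baseline: float, new: float) -> float:
    """Return the change from baseline to new, as a percentage of baseline."""
    return (new - baseline) / baseline * 100


score_gain = percent_change(XEON_SCORE, EMAG_SCORE)
power_change = percent_change(XEON_PEAK_WATTS, EMAG_PEAK_WATTS)

print(f"IO500 score change:    {score_gain:+.1f}%")    # +25.6%
print(f"Peak power per server: {power_change:+.1f}%")  # -51.0%
```

The score gain of 25.6% rounds to the 26% cited here, and the 51.0% reduction in peak power is consistent with "more than 50% lower".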
We are grateful to all of the partners involved in achieving this result. We would like to send a special thanks to the team at SUSE who maintained the cluster and performed all of the testing. You can read more details about setting up, tuning and running the cluster in SUSE's CephFS blog.
1. Cluster configuration details:
2. Cluster price is estimated “street pricing” for both clusters obtained from public sources such as CDW.com, Lenovo.com and Newegg.com during the month of July 2020.