Co-authors: Pranay Bakre and Masoud Koleini
It has been more than a year since the Arm Neoverse-powered AWS Graviton2 processors became generally available, and customers are deploying a wide range of applications and workloads to gain price-performance benefits. The workloads range from load balancers/reverse proxies and API gateways (NGINX), to search engines (Elasticsearch), to in-memory databases (Memcached). We recommend reading our series of performance blogs on these different categories of workloads.
Databases like Memcached and Redis are referred to as in-memory databases. Unlike traditional databases that store data on disks or SSDs, they are purpose-built to keep data in memory, which typically results in faster response times and higher IOPS. Redis is an open-source, in-memory datastore that is often used as a database, caching system, and message broker. It is highly scalable and widely used for real-time applications such as healthcare systems, IoT, and financial services, as well as for real-time analytics, caching, pub/sub messaging, and session management.
In this blog, we compare the throughput and latency of Redis on AWS Graviton2-based R6g instances to Intel Xeon-based R5 instances across a range of instance sizes to see which offers better Redis performance.
For the benchmarking setup, we used GNU Compiler Collection (GCC) version 10.2.0. Arm, in collaboration with its partners and the GCC community, has worked to significantly improve performance in the GCC 10 release. We compiled the Redis server from its source repository with GCC 10.2 before executing the benchmarking tests.
Using these tests, we observed up to 35% performance benefit of running an open-source Redis database on AWS Graviton2 based instances compared to equivalent x86-based instances. We also observed more than twice the number of operations/second output values from Redis deployed on Arm-based Amazon EC2 R6g instances compared to x86-based Amazon EC2 R5 instances. Additionally, we observed significantly lower latency values for similar operations.
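The throughput gains quoted above follow the usual relative-gain formula. As a sketch, the numbers below are purely illustrative stand-ins, not the measured results from our tests:

```python
def percent_gain(baseline: float, candidate: float) -> float:
    """Relative throughput gain of candidate over baseline, in percent."""
    return (candidate - baseline) / baseline * 100.0

# Illustrative values only (not the measured results): a run where the
# R6g instance sustains 35% more operations/second than the R5 baseline.
r5_ops, r6g_ops = 100_000.0, 135_000.0
print(f"gain: {percent_gain(r5_ops, r6g_ops):.1f}%")  # gain: 35.0%
```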
We used Memtier as the load generator and performance benchmarking tool. It is an open-source, high-throughput benchmarking tool for Redis built by Redis Labs. Memtier was deployed on separate EC2 instances in the same VPC as the Redis instances.
Each test run generated 5 threads with 50 clients per thread, giving 250 simultaneous connections (sessions). That added up to 2.5 million requests sent from Memtier on each run. The default pipeline value (1) was used during each test run. Pipelining is used to increase the throughput of the application; for bulk data transfers and higher throughput, pipeline values greater than 1 can be considered. This GitHub repo contains all the scripts required to create the test infrastructure and the steps to execute the benchmarks.
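As a sketch, the setup above maps onto a memtier_benchmark command line roughly like the following. The host name is a placeholder, and the flag values mirror the parameters described (5 threads, 50 clients per thread, default pipeline of 1); note that memtier's `--requests` is specified per client:

```python
# Sketch only: assemble a memtier_benchmark invocation matching the
# setup described above. "redis.example.internal" is a placeholder host.
threads = 5
clients_per_thread = 50
total_requests = 2_500_000

sessions = threads * clients_per_thread           # concurrent connections
requests_per_client = total_requests // sessions  # memtier counts requests per client

cmd = [
    "memtier_benchmark",
    "--server", "redis.example.internal",
    "--port", "6379",
    "--threads", str(threads),
    "--clients", str(clients_per_thread),
    "--requests", str(requests_per_client),
    "--pipeline", "1",  # default; raise for bulk transfers / higher throughput
]

print(sessions, requests_per_client)  # 250 10000
print(" ".join(cmd))
```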
The results shown in the following tables are aggregated over 30 consecutive test runs.
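The aggregation step can be sketched as below; the per-run values are generated stand-ins, not measured data (real runs would parse these numbers from memtier_benchmark output):

```python
import statistics

# Stand-in data: 30 per-run throughput samples (ops/sec) for one
# instance size. Real values would come from memtier_benchmark output.
runs = [100_000 + 500 * i for i in range(30)]

aggregate = {
    "runs": len(runs),
    "mean_ops_per_sec": statistics.mean(runs),
    "min_ops_per_sec": min(runs),
    "max_ops_per_sec": max(runs),
}
print(aggregate)
```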
Let us look at the performance numbers of self-hosted Redis on R6g and R5 instances. We compared throughput (operations/sec, higher is better) and latency (lower is better) values after multiple test runs.
Table 1: Redis throughput performance results on R5 vs R6g (throughput in operations/sec, with performance gain in %)
Table 2: Redis average latency performance results on R5 vs R6g (R5 and R6g latency in ms)
The throughput and latency performance comparison graphs for R5 and R6g instances are shown in the following figures.
Figure 1: Performance gain for R6g vs R5 instances for self-hosted Redis deployment.
Figure 2: Lower latency for R6g vs R5 instances for self-hosted Redis deployment.
To conclude, Redis deployed on AWS Graviton2 provides up to 35% more throughput, with 24% lower latency and a 20% cost benefit, compared to the equivalent x86-based EC2 instances. Deploying applications on these instances is simple and efficient, without major changes. For details on how to migrate existing applications to AWS Graviton2, please check this GitHub page.
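Combining the headline numbers, and treating the 20% cost benefit as a 20% lower hourly price (an assumption about how the figure is defined), the price-performance improvement works out roughly as:

```python
# Rough price-performance arithmetic from the headline figures above.
# Assumption: "20% cost benefit" means the R6g hourly price is 0.8x the
# R5 price; the throughput gain is the quoted "up to 35%".
throughput_ratio = 1.35   # R6g ops/sec relative to R5
price_ratio = 0.80        # R6g hourly cost relative to R5 (assumed)

price_performance = throughput_ratio / price_ratio
print(f"~{price_performance:.2f}x throughput per dollar")  # ~1.69x throughput per dollar
```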
Visit the AWS Graviton page for customer stories on adoption of Arm-based processors. For any queries related to your software workloads running on Arm Neoverse platforms, feel free to reach out to us at firstname.lastname@example.org.
Thanks for the informative blog entry!
My team just migrated our Elasticache Redis nodes to Graviton 2 (r5.xl -> r6g.xl). We're seeing a drop in CPU which I guess can be a proxy for throughput, but our latency didn't decrease.
I wanted to ask if your drop in latency may have been caused by CPU throttling when maxing out throughput. Our CPU Util is < 10% so we're not close to maxing. If you have any ideas why my results may have been different than yours I'd really love to hear it.