Gain up to 35% performance benefits for deploying Redis on AWS Graviton2

Pranay Bakre
July 20, 2021
3 minute read time.


Co-authors: Pranay Bakre and Masoud Koleini

It has been more than a year since the Arm Neoverse-powered AWS Graviton2 processors became generally available, and customers are deploying a wide range of applications and workloads to gain price and performance benefits. The workloads range from load balancers/reverse proxies and API gateways (NGINX), to search engines (Elasticsearch), to in-memory databases (Memcached). We recommend reading our series of performance blogs on these different categories of workloads.

Databases like Memcached and Redis are referred to as in-memory databases. Unlike traditional databases that store data on disks or SSDs, they are purpose-built to keep data in memory, which typically results in faster response times and higher IOPS. Redis is an open-source, in-memory datastore that is often used as a database, caching layer, and message broker. It is widely used for real-time applications in industries such as healthcare, IoT, and financial services. Redis is highly scalable and is used for real-time analytics, caching, pub/sub applications, and session management.
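The caching use case mentioned above is typically implemented with the cache-aside pattern: read from Redis first and fall back to the slower disk-backed database on a miss. The sketch below illustrates the pattern; a real deployment would use a Redis client such as redis-py, but a dict-backed stand-in (a hypothetical `FakeRedis`) is used here so the example runs without a server.

```python
import time

class FakeRedis:
    """Dict-backed stand-in for a Redis client (get/setex subset only)."""
    def __init__(self):
        self._store = {}

    def get(self, key):
        value, expires_at = self._store.get(key, (None, 0.0))
        if value is not None and time.monotonic() < expires_at:
            return value
        return None

    def setex(self, key, ttl_seconds, value):
        # Store the value together with its expiry time.
        self._store[key] = (value, time.monotonic() + ttl_seconds)

def fetch_record(cache, record_id, load_from_db):
    """Cache-aside read: serve from memory, hit the database only on a miss."""
    key = f"record:{record_id}"
    cached = cache.get(key)
    if cached is not None:
        return cached, "hit"
    record = load_from_db(record_id)   # slow path: disk-backed database
    cache.setex(key, 300, record)      # keep in cache for 5 minutes
    return record, "miss"

cache = FakeRedis()
db_lookup = lambda rid: {"id": rid, "payload": "example"}
first = fetch_record(cache, 42, db_lookup)    # miss: populates the cache
second = fetch_record(cache, 42, db_lookup)   # hit: served from memory
```

With a real Redis server, `FakeRedis()` would be replaced by `redis.Redis(host=..., port=6379)`; the `get`/`setex` calls have the same shape.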

In this blog, we compare the throughput and latency of Redis on AWS Graviton2-based R6g instances to Intel Xeon-based R5 instances across a range of instance sizes to see which offers better Redis performance.

Performance benchmarking setup and results

For the benchmarking setup, we used GNU Compiler Collection (GCC) version 10.2.0. Arm, in collaboration with its partners and the GCC community, has worked to significantly improve performance in the GCC 10 release. We compiled the Redis server from its source repository with GCC 10.2 before executing the benchmarking tests.
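A from-source build along these lines is a reasonable sketch of that setup; the download URL and `gcc-10` package name are assumptions for Ubuntu 20.04, not taken from the blog's own scripts.

```shell
# Sketch: build Redis 6.0.9 with GCC 10 on Ubuntu 20.04 (assumed
# package names/URL; adjust for your distribution and Redis version).
sudo apt-get update && sudo apt-get install -y gcc-10 make wget
wget https://download.redis.io/releases/redis-6.0.9.tar.gz
tar xzf redis-6.0.9.tar.gz
cd redis-6.0.9
make CC=gcc-10            # compile the server with GCC 10
src/redis-server --version
```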

Using these tests, we observed up to 35% performance benefit of running an open-source Redis database on AWS Graviton2 based instances compared to equivalent x86-based instances. We also observed more than twice the number of operations/second output values from Redis deployed on Arm-based Amazon EC2 R6g instances compared to x86-based Amazon EC2 R5 instances. Additionally, we observed significantly lower latency values for similar operations.

We used Memtier as the load generator and performance benchmarking tool. Memtier is an open-source, high-throughput benchmarking tool for Redis built by Redis Labs. It was deployed on separate EC2 instances in the same VPC as the Redis instances.

Component name               Version
Redis                        6.0.9
GCC                          10.2.0
Memtier benchmarking tool    1.3.0
Operating system             Ubuntu 20.04

Input parameter                    Value
Number of threads                  5
Number of clients per thread       50
Number of requests per client      10k
Number of consecutive test runs    10
Data size (bytes)                  128
Protocol                           Redis
Key pattern                        Sequential
Pipeline                           1

Each test run spawned 5 threads with 50 clients per thread, giving 250 simultaneous connections (sessions). With 10,000 requests per client, that adds up to 2.5 million requests sent from Memtier on each run. The default pipeline value (1) was used during each test run. Pipelining is used to increase the throughput of the application; for bulk data transfers and higher throughput, pipeline values greater than 1 can be considered. This GitHub repo contains all the scripts required to create the test infrastructure and the steps to execute the benchmarks.
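The parameters in the table above map onto a Memtier invocation roughly like the following; the server hostname is a placeholder, and the flag spellings follow the memtier_benchmark documentation rather than the blog's own scripts.

```shell
# Sketch of a memtier_benchmark run matching the listed parameters
# (<redis-host> is a placeholder for the Redis instance's address).
memtier_benchmark --server=<redis-host> --port=6379 \
  --protocol=redis \
  --threads=5 --clients=50 --requests=10000 \
  --data-size=128 --key-pattern=S:S \
  --pipeline=1 --run-count=10
```

`--key-pattern=S:S` selects sequential keys for both writes and reads, matching the "Sequential" key pattern in the table.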

The results shown in the following tables are aggregated over 30 consecutive test runs.

Let us look at the performance numbers of self-hosted Redis on R6g and R5 instances. We compared throughput (operations/sec; higher is better) and latency (ms; lower is better) after multiple test runs.

Instance size    R5 (operations/sec)    R6g (operations/sec)    Performance gain (%)
Large            142,653.43             192,730.22              35%
XLarge           145,666.72             193,117.02              32%
2XLarge          167,997.10             199,732.16              18%

Table 1: Redis throughput performance results on R5 vs R6g

Instance size    R5 latency (ms)    R6g latency (ms)    Latency reduction (%)
Large            1.75               1.32                24%
XLarge           1.71               1.29                24%
2XLarge          1.49               1.25                16%

Table 2: Redis average latency performance results on R5 vs R6g
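As a sanity check, the gain columns in both tables can be recomputed from the raw values; the published percentages appear to truncate the raw gain rather than round it. This is a verification sketch, not part of the original benchmark scripts.

```python
# Recompute the "gain" columns of Tables 1 and 2 from the raw numbers.
throughput = {  # instance size: (R5 ops/sec, R6g ops/sec)
    "Large":   (142653.43, 192730.22),
    "XLarge":  (145666.72, 193117.02),
    "2XLarge": (167997.10, 199732.16),
}
latency = {     # instance size: (R5 ms, R6g ms)
    "Large":   (1.75, 1.32),
    "XLarge":  (1.71, 1.29),
    "2XLarge": (1.49, 1.25),
}

def gain_pct(r5, r6g, lower_is_better=False):
    """Percentage improvement relative to R5, truncated like the tables."""
    delta = (r5 - r6g) if lower_is_better else (r6g - r5)
    return int(100 * delta / r5)

tput_gains = {size: gain_pct(*vals) for size, vals in throughput.items()}
lat_gains = {size: gain_pct(*vals, lower_is_better=True)
             for size, vals in latency.items()}
```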

The throughput and latency performance comparison graphs for R5 and R6g instances are shown in the following figures.

Figure 1: Performance gain for R6g vs R5 instances for self-hosted Redis deployment (up to 35% better throughput).

Figure 2: Lower latency for R6g vs R5 instances for self-hosted Redis deployment (up to 24% reduced latency).

Summary

To conclude, Redis deployed on AWS Graviton2 provides up to 35% more throughput, 24% lower latency, and a 20% cost benefit compared to the equivalent x86-based EC2 instances. Deploying applications on these instances is simple and efficient, with no major changes required. For details on how to migrate existing applications to AWS Graviton2, please check this GitHub page.

Visit the AWS Graviton page for customer stories on adoption of Arm-based processors. For any queries related to your software workloads running on Arm Neoverse platforms, feel free to reach out to us at sw-ecosystem@arm.com.

Noobie, over 2 years ago:

    Thanks for the informative blog entry!

    My team just migrated our ElastiCache Redis nodes to Graviton2 (r5.xl -> r6g.xl). We're seeing a drop in CPU, which I guess can be a proxy for throughput, but our latency didn't decrease.

    I wanted to ask if your drop in latency may have been caused by CPU throttling when maxing out throughput. Our CPU utilization is < 10%, so we're not close to maxing out. If you have any ideas why my results may have differed from yours, I'd really love to hear them.