Memcached performance benchmarking on AWS Graviton2 reveals over 50% price-performance gains

Pranay Bakre
January 28, 2021
3 minute read time.

As highlighted at AWS re:Invent 2020, many customers are reaping the price-performance benefits of the Arm Neoverse-powered AWS Graviton2 processors. Amazon EC2 M6g, C6g, R6g, and C6gn instances provide up to 40% better price performance than x86-based Amazon EC2 instances.

In this blog, we showcase a 51% performance benefit from running Memcached workloads on AWS Graviton2-based instances compared to equivalent x86-based instances. We are publishing a series of workload performance benchmarks on Graviton2 and recommend checking out our NGINX and Elasticsearch blogs.

Memcached is an open-source, high-performance, distributed memory object caching system and is a popular choice for powering real-time applications in web, mobile apps, gaming, ad-tech, and e-Commerce. It is an in-memory key-value store that offers higher application performance by removing the need to access disks or SSDs. By keeping its data in memory, it avoids delays and can access data much faster than traditional disk-based databases.

For performance benchmarking of Memcached we used memory-optimized Amazon EC2 R6g instances. These instances are ideal for running memory intensive workloads like Memcached. We installed Memcached open-source binaries on R6g and R5 instances and observed the following throughput and latency results:

Figure 1: Performance gain for R6g vs R5 instances for self-hosted Memcached deployment

Figure 2: Lower latency for R6g vs R5 instances for self-hosted Memcached deployment.

Performance benchmarking process and results

We deployed Memtier, an open-source high-throughput benchmarking tool for Memcached, on separate EC2 instances in the same VPC as Memcached instances. Each instance of Memcached received an identical load based on the parameters below.

Figure 3: Performance benchmarking test setup

The following input parameters were used for the benchmarking tests on the R5 and R6g EC2 instances.

Input parameter                   Value
Number of threads                 5
Number of clients per thread      100
Number of requests per client     10,000
Number of consecutive test runs   20
Key size                          16
Data size                         128 bytes
Memcached protocol                text
Key pattern                       random

Table 1: Input parameters for benchmarking tests

Each test run generates 5 threads with 100 clients per thread, which gives 500 simultaneous connections (sessions). With 10,000 requests per client, that adds up to 5 million requests sent from Memtier in each run. The results in Table 2 show throughput (operations per second, higher is better) and latency (lower is better).
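The load arithmetic can be checked with a few lines of Python. This is only an illustrative sketch: the variable names are ours, and the values are the ones listed in the input parameter table.

```python
# Load parameters taken from the input parameter table (Table 1).
threads = 5
clients_per_thread = 100
requests_per_client = 10_000

# Every client holds one connection, so connections = threads x clients.
connections = threads * clients_per_thread

# Every client sends its full request quota in each run.
requests_per_run = connections * requests_per_client

print(f"simultaneous connections: {connections}")
print(f"requests per run: {requests_per_run:,}")
```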

Instance type          Operations/Sec   p50 Latency (ms)   p99 Latency (ms)
r5.2xlarge             261803.93        1.511              3.215
r6g.2xlarge            394734.65        1.271              2.799
Performance Gain (%)   50.77%           15.88%             12.93%

Table 2: Memcached performance results on R5 vs R6g
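The gain row can be reproduced from the raw figures. The sketch below assumes the throughput gain is the increase relative to R5 and the latency gain is the reduction relative to R5; the blog does not state the formula, but these definitions agree with the published percentages to within about 0.01 percentage points. The function and variable names are illustrative.

```python
# Values copied from the results table (Table 2).
r5 = {"ops_per_sec": 261803.93, "p50_ms": 1.511, "p99_ms": 3.215}
r6g = {"ops_per_sec": 394734.65, "p50_ms": 1.271, "p99_ms": 2.799}


def throughput_gain(base: float, new: float) -> float:
    """Percentage increase in throughput relative to the baseline."""
    return (new - base) / base * 100


def latency_reduction(base: float, new: float) -> float:
    """Percentage decrease in latency relative to the baseline."""
    return (base - new) / base * 100


# Assumed definitions: R5 is the baseline for every column.
gains = {
    "ops_per_sec": throughput_gain(r5["ops_per_sec"], r6g["ops_per_sec"]),
    "p50_ms": latency_reduction(r5["p50_ms"], r6g["p50_ms"]),
    "p99_ms": latency_reduction(r5["p99_ms"], r6g["p99_ms"]),
}

for metric, gain in gains.items():
    print(f"{metric}: {gain:.2f}%")
```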

Configurations

The following components are prerequisites for building the test setup.

Component name           Version
Memcached                1.6.6
Memtier benchmark tool   1.3.0
Operating System         Ubuntu 20.10

Memcached 1.6.6 requires Ubuntu 20.10 and may not be available through the APT repository. Download the binary for each architecture (arm64 and x86_64) manually, then install it on the corresponding EC2 instance (R6g and R5). Follow the steps mentioned here to install the Memtier benchmarking tool.

A sample command to run the Memtier tests:

memtier_benchmark -s <Memcache_IP_address> -p 11211 --protocol=memcache_text --clients=100 --threads=5 --ratio=1:1 --key-pattern=R:R --key-minimum=16 --key-maximum=16 --data-size=128 --requests=10000 --run-count=20
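For scripted runs, the same invocation can be assembled from the benchmark parameters. The following is a minimal sketch, not part of the original setup: the variable names are hypothetical, the host is left as the placeholder from the command above, and only the flags already shown there are used.

```python
# Placeholder from the sample command; replace with the Memcached server address.
memcached_host = "<Memcache_IP_address>"

# Benchmark parameters, mirroring the sample command above.
threads = 5
clients_per_thread = 100
requests_per_client = 10000
run_count = 20
data_size = 128

# Build the argument list flag by flag so each parameter is easy to vary.
args = [
    "memtier_benchmark",
    "-s", memcached_host,
    "-p", "11211",
    "--protocol=memcache_text",
    f"--clients={clients_per_thread}",
    f"--threads={threads}",
    "--ratio=1:1",
    "--key-pattern=R:R",
    "--key-minimum=16",
    "--key-maximum=16",
    f"--data-size={data_size}",
    f"--requests={requests_per_client}",
    f"--run-count={run_count}",
]

command = " ".join(args)
print(command)
```

Once the placeholder is replaced with a real address, the `args` list could be passed to `subprocess.run(args, check=True)` on a machine where Memtier is installed.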

CPU utilization metrics for R5 and R6g instances

While running the benchmarking tests, we used Grafana to visualize per-core and overall CPU utilization for each of the instances. Figures 4 and 5 show the utilization charts for the R5 and R6g instances.

Figure 4: CPU utilization graph for R5 instances.

Figure 5: CPU utilization graphs for R6g instances.

Conclusion

Deploying Memcached on AWS Graviton2 instances can provide a 51% performance improvement and, in addition, a 20% cost benefit. Deploying new applications on AWS Graviton2 is simple and quick, allowing these gains to be realized without any complex migration requirements. For further details on getting started with AWS Graviton2, please visit this GitHub page.

Feel free to reach out to us at sw-ecosystem@arm.com with any inquiries related to running your software workloads on Arm Neoverse platforms.

    Martin Grigorov over 4 years ago

    Check my findings from several months back: https://martin-grigorov.medium.com/compare-memcached-performance-on-x86-64-and-arm64-cpu-architectures-7fe781e34ab8
    Memtier is not a good tool to load test Memcached. Better use github.com/.../mc-crusher
