Improve Memcached performance up to 41% with Alibaba Cloud Yitian 710 instances

Ker Liu
March 14, 2023
2 minute read time.

Memcached is an open source, high-performance, distributed memory object caching system. It is a popular choice for powering real-time applications in web, mobile apps, gaming, ad-tech, and e-Commerce. Memcached is an in-memory key-value store that offers higher application performance by removing the need to access disks or SSDs. By keeping its data in memory, it avoids delays and can access data much faster than traditional disk-based databases.

In this blog, we compare the throughput of Memcached on two types of Alibaba Cloud ECS instances, to show the performance advantage of Arm. G8y instances, powered by the Alibaba Yitian 710 processor based on Armv9, represent Arm. G7 instances, powered by 3rd Generation Intel Xeon Scalable processors, represent x86.

Benchmark setup and results

We used Memtier, an open-source, high-throughput benchmarking tool for Memcached, as the load generator. Memtier was deployed on a separate ECS instance.

For the Memcached server, we deployed multiple Memcached processes on each core.

Figure 1. Memcached benchmarking topology

We tested two server ECS instances with the following configurations. The benchmark client used a single G8y.8xlarge instance.

| Processor | ECS type |
|---|---|
| Yitian 710 | G8y.2xlarge |
| 3rd Generation Intel Xeon Scalable | G7.2xlarge |

Table 1. Test server configurations

The benchmark tests were performed with the following software versions and test parameters.

| Component name | Version |
|---|---|
| Memcached | 1.5.22 |
| GCC | 10.2.1 20200825 (Alibaba 10.2.1-3 2.32) |
| Memtier benchmarking tool | 1.4.0 |
| Operating system | Alibaba Cloud Linux 3.2104 LTS |

| Test config parameter | Value |
|---|---|
| Number of Memtier clients | 8 |
| Number of threads | 8 |
| Number of clients per thread | 10 |
| Number of consecutive test runs | 3 |
| Data size (bytes) | 128 |
| Memcached protocol | text |
| Key pattern | random |
| Pipeline | 1, 50, 100 |

We used 8 Memtier clients to generate requests for 8 Memcached processes simultaneously. Each Memtier client created 8 threads with 10 clients per thread, giving 80 simultaneous connections (sessions) per Memtier client. We tested pipeline values of 1, 50, and 100. A pipeline value greater than 1 keeps several requests in flight on each connection, which increases throughput for bulk data transfers.
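As a rough illustration of how the parameters above could map onto the load generator, here is a minimal Python sketch that builds `memtier_benchmark` command lines and works out the connection counts. The server address, the one-port-per-Memcached-process layout, and the helper names are assumptions made for this example; the post does not give the exact commands that were run.

```python
# Values taken from the test-parameter table in this post.
MEMTIER_CLIENTS = 8
THREADS = 8
CLIENTS_PER_THREAD = 10
RUN_COUNT = 3
DATA_SIZE_BYTES = 128
PIPELINES = (1, 50, 100)

# Assumptions for illustration only (not stated in the post):
# a placeholder server address and one Memcached process per port.
SERVER = "192.0.2.10"
BASE_PORT = 11211


def memtier_command(port: int, pipeline: int) -> list[str]:
    """Build one memtier_benchmark invocation as an argument list."""
    return [
        "memtier_benchmark",
        f"--server={SERVER}",
        f"--port={port}",
        "--protocol=memcache_text",   # Memcached text protocol
        f"--threads={THREADS}",
        f"--clients={CLIENTS_PER_THREAD}",
        f"--data-size={DATA_SIZE_BYTES}",
        "--key-pattern=R:R",          # random keys for both SET and GET
        f"--pipeline={pipeline}",
        f"--run-count={RUN_COUNT}",
    ]


def connections_per_client() -> int:
    """Connections opened by a single Memtier client process."""
    return THREADS * CLIENTS_PER_THREAD


def total_connections() -> int:
    """Connections opened by all Memtier client processes together."""
    return MEMTIER_CLIENTS * connections_per_client()


if __name__ == "__main__":
    for pipeline in PIPELINES:
        for i in range(MEMTIER_CLIENTS):
            print(" ".join(memtier_command(BASE_PORT + i, pipeline)))
    print("connections per Memtier client:", connections_per_client())
    print("connections in total:", total_connections())
```

Under these assumptions, each Memtier client targets its own Memcached process, so the 80 connections per client add up to 640 connections across the 8 clients.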

Enabling XPS (transmit packet steering), RPS (receive packet steering), and RFS (receive flow steering) improved performance on both instances. We observed up to 41% higher Memcached throughput on Yitian 710-based instances than on equivalent x86-based instances. The results shown in the following table are aggregated over 30 consecutive test runs.
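The post does not list the steering values that were used. As a sketch only, the Python snippet below computes the values that one might write to the standard Linux sysfs and procfs files for XPS, RPS, and RFS. The interface name, queue count, flow-table size, and the choice to pin each transmit queue to a single CPU are all assumptions for this example, and the function only returns the settings rather than writing them.

```python
def cpu_mask(cpus) -> str:
    """Return a hex CPU bitmask, e.g. CPUs 0-7 -> 'ff'.

    Suitable for small CPU counts; larger systems use comma-separated
    32-bit groups in these files, which this sketch does not handle.
    """
    mask = 0
    for cpu in cpus:
        mask |= 1 << cpu
    return format(mask, "x")


def steering_settings(dev: str, n_queues: int, n_cpus: int,
                      flow_entries: int = 32768) -> dict[str, str]:
    """Map each kernel tuning file to the value this sketch would write.

    All arguments are hypothetical inputs; flow_entries=32768 is an
    assumed table size, not a value from the post.
    """
    settings = {
        # RFS: size of the global socket flow table.
        "/proc/sys/net/core/rps_sock_flow_entries": str(flow_entries),
    }
    queues = f"/sys/class/net/{dev}/queues"
    all_cpus = cpu_mask(range(n_cpus))
    for q in range(n_queues):
        # RPS: let any CPU process packets received on this queue.
        settings[f"{queues}/rx-{q}/rps_cpus"] = all_cpus
        # RFS: per-queue share of the flow table.
        settings[f"{queues}/rx-{q}/rps_flow_cnt"] = str(flow_entries // n_queues)
        # XPS: pin each transmit queue to one CPU (assumed policy).
        settings[f"{queues}/tx-{q}/xps_cpus"] = cpu_mask([q % n_cpus])
    return settings


if __name__ == "__main__":
    for path, value in steering_settings("eth0", n_queues=4, n_cpus=8).items():
        print(f"echo {value} > {path}")
```

Printing the commands instead of applying them keeps the sketch safe to run on any machine; applying them would require root privileges and values chosen for the actual instance.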

Let us look at the performance numbers of Memcached on G8y and G7 instances. We compared the throughput (Operations/Sec) values after multiple test runs. 

| Pipeline parameter | G7.2x (Operations/Sec) | G8y.2x (Operations/Sec) | Performance gain (%) |
|---|---|---|---|
| Pipeline=1 | 1256257.41 | 1482112.07 | 18% |
| Pipeline=50 | 4870840.43 | 6484505.32 | 33% |
| Pipeline=100 | 5241900.43 | 7379739.17 | 41% |

Table 2. Memcached throughput performance results on G8y vs. G7
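The gain column in Table 2 can be reproduced from the two throughput columns. The short Python sketch below assumes the gain is defined as the relative increase of G8y throughput over G7 throughput, rounded to the nearest whole percent; the function and variable names are chosen for this example.

```python
# Throughput in operations per second, copied from Table 2:
# pipeline value -> (G7.2x, G8y.2x)
RESULTS = {
    1: (1256257.41, 1482112.07),
    50: (4870840.43, 6484505.32),
    100: (5241900.43, 7379739.17),
}


def gain_percent(g7_ops: float, g8y_ops: float) -> float:
    """Relative throughput increase of G8y over G7, in percent."""
    return (g8y_ops / g7_ops - 1.0) * 100.0


if __name__ == "__main__":
    for pipeline, (g7_ops, g8y_ops) in RESULTS.items():
        gain = gain_percent(g7_ops, g8y_ops)
        print(f"Pipeline={pipeline}: {round(gain)}%")
```

With this definition, the rounded values match the 18%, 33%, and 41% reported in the table.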

Figure 2. Performance gains for G8y vs. G7 instances

Conclusion

To conclude, Memcached deployed on Yitian 710-based ECS instances provides up to 41% more throughput than on equivalent x86-based ECS instances. In addition, G8y instances are priced 20% lower than comparable G7 instances.
