Co-Authors: Kailas Jawadekar & Julio Suarez, Infrastructure Line of Business, Arm
In today’s interconnected world, streaming data is generated from many sources, such as applications, logs, web servers, devices, and IoT sensors. Streaming data consists of small events generated continuously, often with low-latency requirements. For businesses, processing these events in real time is critical because the value of this data diminishes significantly over time.
There is a major shift underway from traditional systems focused on batch processing to event-driven systems designed to process, transform, and act on streaming data events in real time.
As an example, consider an e-commerce website where visitors click through pages, search for products, and add items to their carts. Each of these actions is an event, and you need real-time processing and analysis of these events to understand what users are doing and how products are performing. Based on these insights, you may decide to offer real-time promotions or suggestions to entice users to purchase certain items.
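As a minimal sketch of how such clickstream events might be published to Kafka, the snippet below sends a single add-to-cart event with the standard Java producer client. The topic name, event fields, and broker addresses are illustrative assumptions, not part of our benchmark setup.

```java
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

public class ClickstreamProducer {
    public static void main(String[] args) {
        Properties props = new Properties();
        // Hypothetical broker addresses; replace with your cluster's bootstrap servers.
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "broker1:9092,broker2:9092,broker3:9092");
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            // One event per user action: page view, search, add-to-cart, and so on.
            String event = "{\"user\":\"u123\",\"action\":\"add_to_cart\",\"item\":\"sku-42\"}";
            producer.send(new ProducerRecord<>("clickstream", "u123", event));
            producer.flush();
        }
    }
}
```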
Other examples include a credit card company tracking your card transactions for fraud detection, a surveillance system detecting events of interest at a particular location, or industrial IoT sensors generating events at the edge that should trigger specific actions.
Such use cases lend themselves very well to a platform such as Apache Kafka. In fact, Apache Kafka has become the de facto standard for event streaming. Apache Kafka is an open-source event-streaming platform that supports real-time decision-making in a broad range of industrial, financial, and consumer applications. Today, Apache Kafka is used by thousands of companies, including 35% of the Fortune 500, for high-performance data pipelines, streaming analytics, data integration, and mission-critical applications. The preferred deployment model for many of these companies is to run Apache Kafka on a cloud service provider such as Amazon Web Services (AWS).
On AWS, you can run Kafka self-managed on Amazon EC2 or on top of Kubernetes. Deploying and managing Apache Kafka can be complicated, so if you want to take that complexity and operational overhead off your plate, you can also use Amazon Managed Streaming for Apache Kafka (Amazon MSK).
Apache Kafka can process trillions of events a day and benefits from memory-optimized instances that can quickly and efficiently process very large data sets. Amazon EC2 offers several memory-optimized instance families that can support Apache Kafka workloads, but the runtime costs can vary quite a bit depending on the instance.
In this blog, we present performance benchmarking of self-managed Apache Kafka running on AWS Graviton2 processors, which are built around 64-bit Arm Neoverse N1 cores. These custom CPUs are built on modern 7nm process technology and are designed from the ground up and optimized for cloud-native workloads. The highly efficient Arm Neoverse architecture helps AWS Graviton2 deliver up to 40% better price-performance than current-generation x86-based instances for a variety of workloads.
The following diagram shows our setup: a three-node Kafka cluster and a three-node ZooKeeper cluster, which tracks the status of the Kafka brokers and maintains cluster metadata such as the list of Kafka topics.
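To illustrate how a test topic might be created against such a three-broker cluster, here is a hedged sketch using Kafka's AdminClient. The topic name, partition count, and broker hostnames are assumptions for illustration and are not taken from the whitepaper.

```java
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.NewTopic;

public class CreateBenchmarkTopic {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        // Hypothetical addresses of the three Kafka brokers.
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG,
                "broker1:9092,broker2:9092,broker3:9092");

        try (AdminClient admin = AdminClient.create(props)) {
            // Replication factor 3 so each partition has a copy on every broker.
            NewTopic topic = new NewTopic("benchmark-topic", 6, (short) 3);
            admin.createTopics(Collections.singletonList(topic)).all().get();
        }
    }
}
```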
Our benchmark measures the throughput and latency of writing and reading events on a Kafka cluster. The throughput metric is records per second (RPS), and the latency metric is the 99th-percentile (p99) latency in milliseconds (ms). The diagram above shows a Kafka cluster built from m6g instances, but as part of our testing we also created three-node clusters with various other instance types and sizes.
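The measurement itself can be sketched as follows: produce a fixed number of records, record the acknowledgment latency of each send in a callback, and then derive RPS and the 99th-percentile latency. This is a simplified illustration of the approach, not the exact harness used for the whitepaper (Kafka also ships a producer performance test tool for this purpose); the record count, record size, topic name, and broker addresses are placeholders.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.ByteArraySerializer;

public class ProducerBenchmarkSketch {
    public static void main(String[] args) {
        final int numRecords = 1_000_000;     // placeholder record count
        final byte[] payload = new byte[512]; // placeholder record size

        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "broker1:9092,broker2:9092,broker3:9092");
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, ByteArraySerializer.class.getName());
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, ByteArraySerializer.class.getName());

        List<Long> latenciesMs = Collections.synchronizedList(new ArrayList<>());
        long startMs = System.currentTimeMillis();

        try (KafkaProducer<byte[], byte[]> producer = new KafkaProducer<>(props)) {
            for (int i = 0; i < numRecords; i++) {
                final long sendMs = System.currentTimeMillis();
                // The callback fires when the broker acknowledges the record.
                producer.send(new ProducerRecord<>("benchmark-topic", payload),
                        (metadata, exception) -> {
                            if (exception == null) {
                                latenciesMs.add(System.currentTimeMillis() - sendMs);
                            }
                        });
            }
            producer.flush();
        }

        long elapsedMs = System.currentTimeMillis() - startMs;
        double rps = numRecords / (elapsedMs / 1000.0);

        // p99: sort acknowledged latencies and take the value at the 99th percentile.
        Collections.sort(latenciesMs);
        long p99 = latenciesMs.get(Math.max(0, (int) (latenciesMs.size() * 0.99) - 1));

        System.out.printf("Throughput: %.0f records/sec, p99 latency: %d ms%n", rps, p99);
    }
}
```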
As part of our testing, we first measured producer RPS across the various instances. To explore performance per dollar (performance/$), we then divided each RPS figure by the hourly cost of the corresponding instance. The results are shown in the graph below.
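The performance/$ calculation itself is a simple division; the sketch below uses made-up placeholder numbers purely to show the arithmetic. They are not our measured results or actual AWS prices.

```java
public class PerfPerDollar {
    public static void main(String[] args) {
        // Placeholder values only; not measured results or real prices.
        double recordsPerSecond = 500_000;  // hypothetical measured producer RPS
        double hourlyInstanceCost = 0.25;   // hypothetical on-demand price in USD/hour

        // Higher is better: records per second delivered per dollar spent per hour.
        double perfPerDollar = recordsPerSecond / hourlyInstanceCost;
        System.out.printf("Performance/$ = %.0f records/sec per USD/hour%n", perfPerDollar);
    }
}
```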
The cost-performance data shows that the xlarge instances offer better value for running Kafka, and among the xlarge instances, we observed that the AWS Graviton2-based Amazon EC2 r6g.xlarge instances provide the best value. On average, the r6g.xlarge instances provide about a 30% performance/$ advantage over the x86-based Amazon EC2 r5.xlarge instances, and about a 19% cost-performance advantage over the Amazon EC2 r5a.xlarge instances.
For more detailed information on our test environment setup, benchmarks, and results, we encourage you to download the whitepaper, Benchmarking Apache Kafka on AWS Graviton2.
[CTAToken URL = "https://armkeil.blob.core.windows.net/developer/Files/pdf/white-paper/benchmarking-apache-kafka.pdf" target="_blank" text="Read the whitepaper" class ="green"]
Visit the AWS Graviton page for customer stories on adoption of Arm-based processors. For any queries related to your software workloads running on Arm Neoverse platforms, feel free to reach out to us at sw-ecosystem@arm.com. Join us at Arm DevSummit 2021 to learn more about cloud workloads and software ecosystem support for Arm Neoverse platforms.