January 5, 2023

Improve Apache httpd performance up to 40% by deploying on Alibaba Cloud Yitian 710 instances

In this blog, we look at the advantages of using Alibaba Yitian 710 CPU Arm-based instances for Apache httpd compared to x86-based instances.

By Martin Ma

Introduction

Apache httpd is one of the most popular web servers. A web server is a software program that usually runs in the background as a process and acts as the server in a client-server model, communicating over the HTTP or HTTPS network protocols.

In this blog, we compare Apache httpd throughput on two Alibaba Cloud Elastic Compute Service (ECS) instance types: g8y (powered by Yitian 710 processors based on the Armv9 architecture) and g7 (powered by 3rd Generation Intel Xeon Scalable processors). Our findings show that httpd deployments on g8y instances can achieve a performance advantage of up to 40% over g7 instances. The following sections cover the details of our testing methodology and results.

Performance benchmark setup and result

The benchmark setup uses two instances: one as the load generator and one as the server under test. We use wrk as the benchmark tool to generate the load and collect throughput figures, which we then use to compare the performance of g8y and g7 instances.

The following table shows the configuration of the tested instances:

Instance type	Instance size (vCPU)	Memory (GiB)	Storage
g8y	2xlarge (8)	32	40GB (ESSD PL0 2280 IOPS)
g7	2xlarge (8)	32	40GB (ESSD PL0 2280 IOPS)

The software versions are as follows:

Software	Version
Apache httpd	2.4.37
Operating system	Alibaba Cloud Linux 3.2104 LTS
Kernel	5.10.134-12.al8.aarch64 (g8y), 5.10.134-12.al8.x86 (g7)

The default httpd Multi-Processing Module (MPM) is event. It is designed to allow more requests to be served simultaneously by passing off some processing work to the listener threads, which frees up the worker threads to serve new requests.

The following table shows the httpd configuration that was tested:

MPM event parameters	StartServers	8
	ServerLimit	100
	ThreadsPerChild	125
	MaxRequestWorkers	2000
	ThreadLimit	2000
	MaxSpareThreads	1000
Persistent connection parameters	KeepAlive	On
	MaxKeepAliveRequests	0
	KeepAliveTimeout	50
Disabled modules	brotli, lua, http2, http2-proxy
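The MPM event values in the table are related to one another: httpd requires ServerLimit to be at least MaxRequestWorkers divided by ThreadsPerChild, and ThreadLimit to be at least ThreadsPerChild. The following is a minimal sketch of that sanity check, not part of the original test scripts. The dictionary and the function names are hypothetical, and the values are copied from the table.

```python
# Values copied from the MPM event rows of the configuration table.
mpm_event = {
    "StartServers": 8,
    "ServerLimit": 100,
    "ThreadsPerChild": 125,
    "MaxRequestWorkers": 2000,
    "ThreadLimit": 2000,
    "MaxSpareThreads": 1000,
}


def max_child_processes(cfg):
    """Child processes needed to provide MaxRequestWorkers worker threads."""
    # Ceiling division, so a partial process still counts as one process.
    return -(-cfg["MaxRequestWorkers"] // cfg["ThreadsPerChild"])


def check_limits(cfg):
    """Return a list of problems found; an empty list means consistent."""
    problems = []
    if cfg["ThreadsPerChild"] > cfg["ThreadLimit"]:
        problems.append("ThreadsPerChild exceeds ThreadLimit")
    if max_child_processes(cfg) > cfg["ServerLimit"]:
        problems.append("ServerLimit is too low for MaxRequestWorkers")
    if cfg["StartServers"] > cfg["ServerLimit"]:
        problems.append("StartServers exceeds ServerLimit")
    return problems


print(max_child_processes(mpm_event))  # 16
print(check_limits(mpm_event))         # []
```

With these settings, httpd needs at most 16 child processes of 125 threads each to reach 2000 workers, which is well below the ServerLimit of 100.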

To achieve better performance, we set CPU affinity for httpd processes and threads as in the following diagram.

CPU affinity for httpd setup
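The exact process-to-core mapping from the diagram is not reproduced here. As an illustration only, the sketch below shows one common way to pin processes on Linux: assign each process ID one core in round-robin order and apply the plan with os.sched_setaffinity. The round-robin policy, the function names, and the example PIDs are assumptions and may differ from the mapping used in the benchmark.

```python
import os


def plan_affinity(pids, cpus):
    """Assign each PID a single CPU, cycling through the CPU list."""
    cpus = sorted(cpus)
    if not cpus:
        raise ValueError("at least one CPU is required")
    return {pid: {cpus[i % len(cpus)]} for i, pid in enumerate(pids)}


def apply_affinity(plan):
    """Pin every PID in the plan to its CPU set (Linux only)."""
    if not hasattr(os, "sched_setaffinity"):
        raise OSError("os.sched_setaffinity is not available on this platform")
    for pid, cpu_set in plan.items():
        os.sched_setaffinity(pid, cpu_set)


# Hypothetical PIDs for 8 httpd child processes on an 8-vCPU instance.
example_plan = plan_affinity(range(1001, 1009), range(8))
print(example_plan)
```

On Linux, thread IDs can be passed to os.sched_setaffinity in the same way as process IDs, so the same plan structure could cover individual httpd threads.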

The benchmark tool (wrk) runs on a single g8y.4xlarge instance. Each test creates 32 threads that send requests over 1000 keep-alive HTTP or HTTPS connections for a duration of 30 seconds. The following tables show the wrk version, test parameters, and test cases:

Parameter	Value
wrk version	4.0.2
Threads	32
Connections	1000
Duration	30 seconds

Test Case	Command
HTTP persistent connection	wrk -t 32 -c 1000 -d 30 --latency http://$serverIP
HTTPS persistent connection	wrk -t 32 -c 1000 -d 30 --latency https://$serverIP:443
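To automate these test cases, the wrk command line can be built programmatically and the throughput read from the "Requests/sec" line that wrk prints at the end of a run. The sketch below is one possible helper, not the script used for this blog. The function names, the server address 192.0.2.10, and the sample output excerpt are made up for illustration.

```python
import re


def wrk_command(url, threads=32, connections=1000, duration_s=30):
    """Build the wrk argument list with the parameters from the table."""
    return ["wrk", "-t", str(threads), "-c", str(connections),
            "-d", str(duration_s), "--latency", url]


_REQUESTS_PER_SEC = re.compile(
    r"^Requests/sec:\s+([0-9]+(?:\.[0-9]+)?)", re.MULTILINE)


def requests_per_sec(wrk_output):
    """Extract the throughput figure from wrk's summary output."""
    match = _REQUESTS_PER_SEC.search(wrk_output)
    if match is None:
        raise ValueError("no 'Requests/sec' line found in wrk output")
    return float(match.group(1))


# Made-up excerpt that only mimics the shape of wrk's summary lines.
sample_output = "Requests/sec:   1234.56\nTransfer/sec:      1.00MB\n"

print(wrk_command("http://192.0.2.10"))
print(requests_per_sec(sample_output))  # 1234.56
```

In a real run, the command list could be passed to subprocess.run(cmd, capture_output=True, text=True) and the resulting stdout handed to requests_per_sec.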

Test Result

The throughput results are the average of 10 consecutive tests after one warmup test. With logging disabled, httpd on g8y.2xlarge instances shows a 39.6% performance uplift for HTTP persistent connections and a 26.7% performance uplift for HTTPS persistent connections compared to g7.2xlarge instances.
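The averaging procedure can be sketched as follows: discard the warmup run, then take the mean of the 10 measured runs. This is an assumed reading of the procedure; the function name and the sample values are hypothetical.

```python
from statistics import mean


def average_throughput(samples, warmup_runs=1, measured_runs=10):
    """Mean requests/sec over the measured runs, ignoring warmup runs."""
    measured = samples[warmup_runs:]
    if len(measured) != measured_runs:
        raise ValueError(
            f"expected {measured_runs} measured runs, got {len(measured)}")
    return mean(measured)


# Made-up numbers: one slow warmup run followed by 10 steady runs.
runs = [800.0] + [1000.0] * 10
print(average_throughput(runs))  # 1000.0
```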

The following table shows the throughput comparison (logging disabled) between g8y.2xlarge and g7.2xlarge.

Test Case	g8y.2xlarge (Requests/Sec)	g7.2xlarge (Requests/Sec)	Performance gain
HTTP persistent connection	243138.93	174186.11	39.6%
HTTPS persistent connection	172087.59	135807.16	26.7%

Table 1: Throughput results (logging disabled) on g8y and g7
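The performance gain column is the relative difference between the two throughput figures. The short check below recomputes it from the values in Table 1. The helper name gain_percent is ours, not part of any tool.

```python
def gain_percent(new, baseline):
    """Relative throughput gain of `new` over `baseline`, in percent."""
    return (new / baseline - 1.0) * 100.0


# (g8y.2xlarge, g7.2xlarge) requests/sec from Table 1, logging disabled.
table1 = {
    "HTTP persistent connection": (243138.93, 174186.11),
    "HTTPS persistent connection": (172087.59, 135807.16),
}

for case, (g8y, g7) in table1.items():
    print(f"{case}: {gain_percent(g8y, g7):.1f}%")
# HTTP persistent connection: 39.6%
# HTTPS persistent connection: 26.7%
```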

Figure 1. Throughput (logging disabled) performance gains for g8y vs. g7

To help administrators manage a web server effectively, httpd provides logging capabilities that give feedback about the activity and performance of the server and any problems that may be occurring. To achieve better performance when logging is enabled, we set the directive "BufferedLogs On", which buffers log entries in memory before writing them to disk. With logging enabled, httpd on g8y.2xlarge instances shows a 40.0% performance uplift for HTTP persistent connections and a 27.1% performance uplift for HTTPS persistent connections compared to g7.2xlarge instances.

The following table shows the throughput comparison (logging enabled) between g8y.2xlarge and g7.2xlarge.

Test Case	g8y.2xlarge (Requests/Sec)	g7.2xlarge (Requests/Sec)	Performance gain
HTTP persistent connection	234099.50	167237.14	40.0%
HTTPS persistent connection	163650.42	128793.82	27.1%

Table 2: Throughput results (logging enabled) on g8y and g7
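The tables do not report the cost of enabling logging directly, but it can be derived by comparing Table 1 with Table 2 for each instance type. The sketch below does this arithmetic; the helper name and the data layout are ours, and the throughput values are copied from the two tables.

```python
def drop_percent(without_logging, with_logging):
    """Throughput lost when logging is enabled, in percent."""
    return (1.0 - with_logging / without_logging) * 100.0


# (logging disabled, logging enabled) requests/sec from Tables 1 and 2.
throughput = {
    ("g8y.2xlarge", "HTTP"): (243138.93, 234099.50),
    ("g7.2xlarge", "HTTP"): (174186.11, 167237.14),
    ("g8y.2xlarge", "HTTPS"): (172087.59, 163650.42),
    ("g7.2xlarge", "HTTPS"): (135807.16, 128793.82),
}

for (instance, protocol), (off, on) in throughput.items():
    print(f"{instance} {protocol}: {drop_percent(off, on):.1f}% lower with logging")
```

In every case the drop stays in the low single digits, which is consistent with the gains in Table 2 being close to those in Table 1.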

Figure 2. Throughput (logging enabled) performance gains for g8y vs. g7

Conclusion

Compared to 3rd Generation Xeon Scalable processor-based instances, deploying Apache httpd on Yitian 710-based instances provides several benefits:

A 40% throughput performance advantage for HTTP persistent connections
A 27% throughput performance advantage for HTTPS persistent connections
A 20% cost benefit

Please visit this page for details on how to migrate existing applications to Yitian 710-based instances. For any queries related to your software workloads running on Arm platforms, feel free to reach out to us at sw-ecosystem@arm.com.

Re-use is permitted for informational and non-commercial or personal use only.