astcenc 3.1: high performance texture compression

August 10, 2021


At SIGGRAPH last year we announced the release of astcenc 2.0, the first major update to the Arm ASTC texture compressor since the format was announced in 2012. This release gave developers a much needed performance boost, but we knew it was just the starting point on a longer journey. Here we are 12 months and 7 releases later, happy to announce the result of our work: astcenc 3.1.

More speed

For most developers, importing and compressing textures is one of the most time-consuming parts of a project build cycle. Fast compressors significantly improve developer efficiency and reduce iteration time, so it is no surprise that slow ASTC compression has been a major bugbear of developers for some time. The main goal of our work on the compressor has been to make the codec as fast as we could. However, we wanted to achieve this without sacrificing the image quality that astcenc provides, as this is one of the real strengths of the ASTC format.

The core codec has been extensively optimized, with nearly every path fine-tuned and vectorized. Specific vectorized builds are now available for multiple CPU architectures:

Arm AArch64: Neon
x86-64: SSE2, SSE4.1, AVX2

The 3.1 release is on average 5 times faster than the 2.0 release, and up to 17 times faster than the 1.7 release. This speed comes at a small image quality loss of around 0.1 dB compared to the 1.7 release for most block sizes. High-bitrate encodings, such as those using the 4x4 block size, actually improve image quality slightly despite the faster performance.

astcenc 3.1 performance compared to astcenc 1.7

To put this in absolute terms, the latest compressor, compiled for AVX2 and running on an Intel Core i5-9600K at 4.2 GHz, can compress LDR color images at the following rates:

12M Texels/s for a fast search,
3M Texels/s for a medium search,
800K Texels/s for a thorough search.
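As a rough worked example, the throughput figures above can be turned into per-texture time estimates. This is only a back-of-envelope sketch: the 2048x2048 texture size and the function name are illustrative assumptions, and real timings depend on image content, block size, and the machine used.

```python
# Throughput figures quoted above (texels per second, AVX2 build on the
# i5-9600K). Real throughput varies with image content and block size.
THROUGHPUT_TEXELS_PER_SEC = {
    "fast": 12_000_000,
    "medium": 3_000_000,
    "thorough": 800_000,
}


def estimated_seconds(width, height, preset):
    """Estimate the compression time for one image from quoted throughput."""
    return (width * height) / THROUGHPUT_TEXELS_PER_SEC[preset]


# Hypothetical 2048x2048 texture (about 4.2 million texels).
for preset in THROUGHPUT_TEXELS_PER_SEC:
    print(f"{preset}: {estimated_seconds(2048, 2048, preset):.2f} s")
```

On these numbers a single 2048x2048 texture takes roughly a third of a second at fast and a little over five seconds at thorough.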

The improvements in this release mean that thorough compression in astcenc 3.1 is around three times faster than medium compression in astcenc 1.7. It is now possible to get both better image quality and measurably faster compression passes. The performance of thorough compression is now good enough that it is feasible to get the best quality out of ASTC in real game builds, which was not possible in 1.7 due to the compression cost.

Fine-grained compression quality

One of the changes we have added this year is more fine-grained control over the compressor’s performance-to-quality trade-off. Earlier releases supported only a set of defined preset quality levels (fastest, fast, medium, thorough, and exhaustive), each step increasing the compression cost by between 3 and 5 times. The presets are still supported, but it is now also possible to supply a numeric quality level between 0 (fastest) and 100 (exhaustive), giving developers the ability to finely control their compression cost.

Handy hint: Running astcenc with a compression effort of 30, which is between the -fast (quality 10) and -medium (quality 60) levels, is a close match for the performance and image quality of the ISPC TexComp ASTC texture compressor.
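A minimal sketch of how a build script might pass either a preset or a numeric quality level to the command-line tool. The helper name, the file names, and the binary name `astcenc` are assumptions for illustration (release binaries are typically named per instruction set); `-cl` selects LDR compression.

```python
def astcenc_command(src, dst, block_size, quality, binary="astcenc"):
    """Build an LDR compression command; quality is a preset or 0-100.

    The helper and the default binary name are illustrative assumptions.
    """
    if isinstance(quality, str):
        quality_arg = quality  # a preset such as "-fast" or "-medium"
    else:
        if not 0 <= quality <= 100:
            raise ValueError("numeric quality must be between 0 and 100")
        quality_arg = str(quality)
    return [binary, "-cl", src, dst, block_size, quality_arg]


# Quality 30 sits between -fast (10) and -medium (60).
print(" ".join(astcenc_command("in.png", "out.astc", "6x6", 30)))
print(" ".join(astcenc_command("in.png", "out.astc", "6x6", "-medium")))
```

The returned list can be handed directly to `subprocess.run` in a real asset pipeline.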

New features

While performance is important, it is not the only thing that matters, so we have also added a number of new features to the codec based on common developer requests.

CPU ISA invariant output

The compressor output is now invariant across CPU architectures and compilers, giving bit-identical output for a given input image and compressor configuration. This gives developers certainty about what their game is going to look like, no matter which build machine was used to run the asset pipeline. It also makes it a lot easier to integrate the compressor into automated test environments which perform bit-exact image comparisons to determine whether a test passes or fails.
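One way an automated test could exploit invariant output is to compare file digests of the same texture compressed on two different build machines. The sketch below assumes the two output files already exist; the function names are hypothetical and not part of astcenc.

```python
import hashlib


def file_digest(path):
    """Return the SHA-256 hex digest of a file's contents."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in chunks so large texture files are not loaded at once.
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def outputs_match(path_a, path_b):
    """True if two compressed outputs are bit-identical."""
    return file_digest(path_a) == file_digest(path_b)
```

A test harness would call `outputs_match` on, for example, the Neon and AVX2 outputs for the same input image and compressor configuration, and fail on any mismatch.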

RGBM compression

We have added support for the RGBM texture format, giving the compressor some awareness of RGBM and how it is consumed. RGBM is a container format which encodes a limited form of HDR data in an LDR wrapper, which can be converted back into HDR values using some shader arithmetic when the texture is consumed.

Handy hint: RGBM encoding is a means to support a limited form of HDR textures on GPUs without a native HDR texture format. All Mali GPUs that support ASTC implement the HDR feature profile, allowing HDR textures to be directly encoded for improved efficiency and image quality.

The traditional HDR-to-RGBM encode is:

// Load HDR inputs in range [0-5]
float r_in = pixel_in.r / 5.0f;
…

// Extract multiplier, rescale RGB to fit range [0-1]
float m_enc = max(r_in, g_in, b_in);
float r_enc = min(1.0f, r_in / m_enc);
…

The traditional shader RGBM-to-HDR reconstruction is:

// Load LDR inputs in range [0-1]
vec4 data = texture(…);

// Convert back to HDR in range [0-5]
data.rgb = data.rgb * data.a * 5.0;

RGBM has historically proven a challenge to compress well with ASTC, as the characteristics of RGBM data break a number of assumptions that the compressor makes about how texture data behaves. In particular, for dark pixels, the format will try to mix large RGB values and small M values. This leaves M prone to quantization error during compression. The resulting luminance shifts produce block artifacts, and M can often round to zero, which results in completely black output blocks.
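To see why small M values are fragile, consider a one-step error in an 8-bit M value. The sketch below assumes 8-bit storage and the reconstruction multiplier of 5 used in the snippets above; the function names are illustrative.

```python
HDR_MAX = 5.0  # reconstruction multiplier from the snippets above


def decode_rgbm8(r8, g8, b8, m8):
    """Decode 8-bit RGBM to HDR, mirroring the shader reconstruction."""
    m = m8 / 255.0
    return tuple((c / 255.0) * m * HDR_MAX for c in (r8, g8, b8))


def luminance_ratio(m8_original, m8_decoded):
    """Brightness scale factor caused purely by an error in stored M."""
    return m8_decoded / m8_original


# A one-step error in M is a large relative change when M is small.
for m8 in (2, 16, 200):
    print(m8, luminance_ratio(m8, m8 + 1), luminance_ratio(m8, m8 - 1))

# If M rounds to zero, the pixel decodes to black whatever RGB holds.
print(decode_rgbm8(255, 128, 64, 0))
```

With M stored as 2, a single step of error changes the decoded brightness by 50 percent, while the same error at M of 200 changes it by half a percent.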

RGBM image with block artifacts.

To solve this issue, the original user code that converts HDR data into RGBM prior to compression with astcenc should be modified to clamp M to a minimum of 16 on the 8-bit scale, which is 16/255 in the normalized [0-1] range used here. This prevents the use of the very small M values which cannot be encoded reliably.

// Load HDR inputs in range [0-5]
float r_in = pixel_in.r / 5.0f;
…

// Extract multiplier, but limit to >= 16/255
float m_enc = max(r_in, g_in, b_in, 16.0f / 255.0f);

// Rescale RGB to fit range [0-1]
float r_enc = min(1.0f, r_in / m_enc);
…
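A self-contained sketch of the limited encode and its matching decode, useful for checking that the round trip still reconstructs dark pixels. It assumes the limit of 16 refers to the 8-bit scale (16/255 once normalized) and a maximum HDR value of 5; the function names are hypothetical.

```python
HDR_MAX = 5.0          # maximum HDR value, as in the snippets above
M_MIN = 16.0 / 255.0   # assumed: limit of 16 on the 8-bit scale


def encode_rgbm(r, g, b, m_min=M_MIN):
    """Convert one HDR pixel in range [0-5] to RGBM in range [0-1]."""
    r_in, g_in, b_in = r / HDR_MAX, g / HDR_MAX, b / HDR_MAX
    # Extract the multiplier, but never let it fall below the limit.
    m_enc = max(r_in, g_in, b_in, m_min)
    # Rescale RGB to fit range [0-1].
    return (min(1.0, r_in / m_enc),
            min(1.0, g_in / m_enc),
            min(1.0, b_in / m_enc),
            m_enc)


def decode_rgbm(r_enc, g_enc, b_enc, m_enc):
    """Reconstruct the HDR pixel, mirroring the shader arithmetic."""
    return (r_enc * m_enc * HDR_MAX,
            g_enc * m_enc * HDR_MAX,
            b_enc * m_enc * HDR_MAX)


dark = (0.02, 0.01, 0.005)
encoded = encode_rgbm(*dark)
print(encoded)
print(decode_rgbm(*encoded))
```

For dark pixels the limit moves information out of M and into smaller RGB values, so M stays large enough to survive quantization while the decoded color is unchanged before compression.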

The second RGBM-related change is that, during compression, astcenc can now compute the error in the decoded HDR domain rather than the encoded RGBM domain. This allows it to more accurately select block candidates with lower error. Used together these two changes give a much improved result:

RGBM image without block artifacts.
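The difference between measuring error in the RGBM domain and in the decoded HDR domain can be illustrated with two candidate encodings of one texel. This is a simplified sketch using plain squared error, not astcenc's actual error metric; the texel values and function names are made up for illustration.

```python
HDR_MAX = 5.0  # maximum HDR value, as in the snippets above


def decode(texel):
    """Decode one (r, g, b, m) texel in range [0-1] to HDR RGB."""
    r, g, b, m = texel
    return (r * m * HDR_MAX, g * m * HDR_MAX, b * m * HDR_MAX)


def rgbm_domain_error(original, candidate):
    """Squared error over the four encoded RGBM channels."""
    return sum((o - c) ** 2 for o, c in zip(original, candidate))


def hdr_domain_error(original, candidate):
    """Squared error over the decoded HDR color channels."""
    return sum((o - c) ** 2
               for o, c in zip(decode(original), decode(candidate)))


original = (1.0, 0.5, 0.25, 0.10)
candidate_a = (1.0, 0.5, 0.25, 0.15)    # error only in M
candidate_b = (0.95, 0.45, 0.20, 0.10)  # small error in each color channel

for name, cand in (("A", candidate_a), ("B", candidate_b)):
    print(name, rgbm_domain_error(original, cand),
          hdr_domain_error(original, cand))
```

In the RGBM domain candidate A looks better because only one channel differs, but after decoding its M error scales every color channel, so candidate B is far closer to the original HDR color.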

Using RGBM with astcenc is therefore a two-step process:

The user has the responsibility for applying the M limiting during the HDR-to-RGBM conversion, which is handled outside of the compressor.
During compression use the -rgbm <max> command-line option, where <max> is the maximum HDR value used during reconstruction, to minimize the compression error. In our previous code snippets this is the value 5.
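For the second step, a build script might assemble the command line as sketched here. The helper name, file names, block size, quality preset, and the binary name `astcenc` are illustrative assumptions; `-rgbm <max>` is the option described above.

```python
def astcenc_rgbm_command(src, dst, block_size, hdr_max,
                         quality="-medium", binary="astcenc"):
    """Build an LDR compression command with RGBM error handling enabled.

    hdr_max must match the multiplier used by the shader reconstruction.
    """
    return [binary, "-cl", src, dst, block_size, quality,
            "-rgbm", str(hdr_max)]


# The snippets above reconstruct with a maximum HDR value of 5.
print(" ".join(astcenc_rgbm_command("rgbm.png", "rgbm.astc", "6x6", 5)))
```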

Sprite sheet transparent “RDO” compression

We have added a limited form of rate distortion optimization (RDO) compression for textures with completely transparent regions. The aim of this technique is to make the compressed texture itself more compressible, when stored inside a compressed distribution package such as an Android APK or OBB bundle.

This technique replaces zero alpha blocks that are surrounded entirely by other zero alpha blocks with a constant color block, irrespective of the original color values in the input image. This is safe, as the zero alpha means that the original color value is never actually needed, and means the impacted zero-alpha blocks all end up compressing to the same bit pattern in memory.
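The selection rule can be sketched at block granularity as follows. This is an illustration of the idea rather than the compressor's implementation: it assumes a precomputed grid of per-block flags, and it treats neighbors that fall outside the image as not blocking replacement.

```python
def replaceable_blocks(zero_alpha):
    """Find blocks that can be swapped for a constant color block.

    zero_alpha[y][x] is True when every texel in block (x, y) has zero
    alpha. A block qualifies when it and all of its in-bounds neighbors
    are zero-alpha blocks.
    """
    rows, cols = len(zero_alpha), len(zero_alpha[0])
    result = [[False] * cols for _ in range(rows)]
    for y in range(rows):
        for x in range(cols):
            # The 3x3 window includes the block itself.
            result[y][x] = all(
                zero_alpha[ny][nx]
                for ny in range(max(0, y - 1), min(rows, y + 2))
                for nx in range(max(0, x - 1), min(cols, x + 2))
            )
    return result


grid = [
    [True, True, True, True],
    [True, True, True, False],  # one block containing visible texels
    [True, True, True, True],
]
for row in replaceable_blocks(grid):
    print(row)
```

Blocks adjacent to the visible block keep their original colors, which preserves the extruded edge colors that filtering may sample.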

Original image:

Original sprite sheet with alpha transparency

Original image ignoring transparency:

Original sprite sheet showing extruded opaque edges

Compressed image, ignoring transparency, after swapping out candidate zero alpha blocks with constant color blocks:

Compressed sprite sheet showing constant color blocks for zero alpha regions

For our previous test texture, replacing a typical opaque edge extrude with constant color blocks reduces the size of the compressed texture data, after bzip2 compression, by up to 20 percent.

Block Size	.astc KB	Old .astc.bz2 KB	New .astc.bz2 KB	Reduction in .bz2 size
4x4	684	602	482	19.9%
5x5	438	386	318	17.5%
6x6	304	270	224	17.1%
8x8	172	144	125	13.2%
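The effect behind these numbers can be reproduced with synthetic data: identical 16-byte blocks compress far better under bzip2 than blocks with varying content. The sketch below uses random bytes as a stand-in for varied block encodings, which exaggerates the gap compared with real textures; only the 128-bit ASTC block size is taken from the format itself.

```python
import bz2
import random

BLOCK_BYTES = 16    # every ASTC block is 128 bits
NUM_BLOCKS = 4096   # arbitrary size for the synthetic region

rng = random.Random(0)  # fixed seed so the run is repeatable


def random_block():
    """Return one synthetic 16-byte block with arbitrary content."""
    return bytes(rng.getrandbits(8) for _ in range(BLOCK_BYTES))


# Transparent region where each block kept its own color data.
varied = b"".join(random_block() for _ in range(NUM_BLOCKS))
# Same region after every block is replaced by one constant block.
constant = random_block() * NUM_BLOCKS

varied_size = len(bz2.compress(varied))
constant_size = len(bz2.compress(constant))
print(varied_size, constant_size)
```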

The sprite-sheet RDO functionality is automatically enabled when using the existing -a <radius> option to alpha-weight color encodings.

Download today

To learn more about the ASTC format and how best to use it please check out our ASTC guide.

Find astcenc on GitHub
