At SIGGRAPH last year we announced the release of astcenc 2.0, the first major update to the Arm ASTC texture compressor since the format was announced in 2012. This release gave developers a much-needed performance boost, but we knew it was just the starting point on a longer journey. Here we are, 12 months and 7 releases later, happy to announce the result of our work: astcenc 3.1.
For most developers, importing and compressing textures is one of the most time-consuming parts of a project build cycle. Fast compressors significantly improve developer efficiency and reduce iteration time, so it is no surprise that slow ASTC compression has been a major bugbear of developers for some time. The main goal of our work on the compressor has been to make the codec as fast as we could. However, we wanted to achieve this without sacrificing the image quality that astcenc provides, as this is one of the real strengths of the ASTC format.
The core codec has been extensively optimized, with nearly every path fine-tuned and vectorized. Specific vectorized builds are now available for multiple CPU architectures:
The 3.1 release is on average 5 times faster than the 2.0 release, and up to 17 times faster than the 1.7 release. This performance comes at a small image quality loss of around 0.1 dB compared to the 1.7 release for most block sizes. High-bitrate encodings, such as those using the 4x4 block size, actually improve image quality slightly despite the faster performance.
To put this in absolute terms, the latest compressor, compiled for AVX2 and running on an Intel Core i5-9600K at 4.2GHz, can compress LDR color images with the following performance:
The improvements in this release mean that thorough compression in astcenc 3.1 is around three times faster than medium compression in astcenc 1.7. It is now possible to get both better image quality and measurably faster compression passes. The performance of thorough compression is now good enough that it is feasible to get the best quality out of ASTC in real game builds, which was not possible in 1.7 due to the compression cost.
One of the changes we have added this year is more fine-grained control over the compressor’s performance-to-quality trade-off. Earlier releases supported only a set of defined preset quality levels (fastest, fast, medium, thorough, and exhaustive), each step increasing the compression cost by between 3 and 5 times. The presets are still supported, but it is now also possible to supply a numeric quality level between 0 (fastest) and 100 (exhaustive), giving developers the ability to finely control their compression cost.
Handy hint: Running astcenc with a compression effort of 30, which is between the -fast (quality 10) and -medium (quality 60) levels, is a close match for the performance and image quality of the ISPC TexComp ASTC texture compressor.
While performance is important, it is not the only thing that matters, so we have also added a number of new features to the codec based on common developer requests.
The compressor output is now invariant across CPU architectures and compilers, giving bit-identical output for a given input image and compressor configuration. This gives developers certainty about what their game is going to look like, no matter which build machine was used to run the asset pipeline. It also makes it a lot easier to integrate the compressor into automated test environments which perform bit-exact image comparisons to determine whether a test passes or fails.
We have added support for the RGBM texture format, giving the compressor some awareness of RGBM and how it is consumed. RGBM is a container format which encodes a limited form of HDR data in an LDR wrapper, which can be converted back into HDR values using some shader arithmetic when the texture is consumed.
Handy hint: RGBM encoding is a means to support a limited form of HDR textures on GPUs without a native HDR texture format. All Mali GPUs that support ASTC implement the HDR feature profile, allowing HDR textures to be directly encoded for improved efficiency and image quality.
The traditional HDR-to-RGBM encode is:
    // Load HDR inputs in range [0-5]
    float r_in = pixel_in.r / 5.0f;
    …
    // Extract multiplier, rescale RGB to fit range [0-1]
    float m_enc = max(r_in, g_in, b_in);
    float r_enc = min(1.0f, r_in / m_enc);
    …
The traditional shader RGBM-to-HDR reconstruction is:
    // Load LDR inputs in range [0-1]
    vec4 data = texture(…);
    // Convert back to HDR in range [0-5]
    data.rgb = data.rgb * data.a * 5.0;
RGBM has historically proven a challenge to compress well with ASTC, as the characteristics of RGBM data break a number of assumptions that the compressor makes about how texture data behaves. In particular, for dark pixels, the format will try to mix large RGB values and small M values. This leaves M prone to quantization error during compression, which produces block artifacts caused by M-error-induced luminance shifts. Small M values can also often round to zero, which results in completely black output blocks.
To solve this issue, the original user code that converts HDR data into RGBM, prior to compression with astcenc, should be modified to clamp the value of M to at least 16 (on the 0 to 255 scale of an 8-bit M channel). This prevents the use of the very small M values which cannot be encoded reliably.
    // Load HDR inputs in range [0-5]
    float r_in = pixel_in.r / 5.0f;
    …
    // Extract multiplier, but limit to >= 16 on the 8-bit scale
    // (written as 16/255 because values here are in range [0-1])
    float m_enc = max(r_in, g_in, b_in, 16.0f / 255.0f);
    // Rescale RGB to fit range [0-1]
    float r_enc = min(1.0f, r_in / m_enc);
    …
The second RGBM-related change is that, during compression, astcenc can now compute the error in the decoded HDR domain rather than the encoded RGBM domain. This allows it to more accurately select block candidates with lower error. Used together these two changes give a much improved result:
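To see why the error domain matters, consider this small C sketch comparing a sum-of-squares error measured on the encoded RGBM values with one measured on the decoded HDR values. This is only an illustration of the idea, not the error metric astcenc actually implements, and the names used are hypothetical.

```c
/* Hypothetical normalized RGBM pixel, for illustration only. */
typedef struct { float r, g, b, m; } rgbm_pixel;

/* Squared error measured directly on the encoded RGBM channels. */
float error_rgbm_domain(rgbm_pixel a, rgbm_pixel b)
{
    float dr = a.r - b.r, dg = a.g - b.g, db = a.b - b.b, dm = a.m - b.m;
    return dr * dr + dg * dg + db * db + dm * dm;
}

/* Squared error measured after reconstructing the HDR color. */
float error_hdr_domain(rgbm_pixel a, rgbm_pixel b, float rgbm_max)
{
    float dr = (a.r * a.m - b.r * b.m) * rgbm_max;
    float dg = (a.g * a.m - b.g * b.m) * rgbm_max;
    float db = (a.b * a.m - b.b * b.m) * rgbm_max;
    return dr * dr + dg * dg + db * db;
}
```

Take an original pixel of RGB 1.0 and M 0.1 with a maximum of 5. A candidate with RGB 0.9 and M 0.1 looks worse in the RGBM domain than a candidate with RGB 1.0 and M 0.15. After decoding, the first candidate is off by 0.05 per channel while the second is off by 0.25, so the ranking flips.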
Using RGBM with astcenc is therefore a two-step process:
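1. Modify the HDR-to-RGBM encoder so that M is never smaller than 16.
2. Compress the resulting RGBM image with astcenc using the -rgbm <max> option.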
We have added a limited form of rate distortion optimization (RDO) compression for textures with completely transparent regions. The aim of this technique is to make the compressed texture itself more compressible, when stored inside a compressed distribution package such as an Android APK or OBB bundle.
This technique replaces zero alpha blocks that are surrounded entirely by other zero alpha blocks with a constant color block, irrespective of the original color values in the input image. This is safe, as the zero alpha means that the original color value is never actually needed, and means the impacted zero-alpha blocks all end up compressing to the same bit pattern in memory.
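The selection rule can be sketched in a few lines of C. This is a simplified illustration and not the astcenc implementation: it assumes the caller has already worked out which blocks are entirely zero alpha, it checks only the eight immediately adjacent blocks, and it ignores neighbors that fall outside the image. All names are hypothetical.

```c
#include <stdbool.h>

/*
 * For each block, set replace[] to true if the block and all of its
 * in-bounds neighbors are entirely zero alpha.
 *
 * zero_alpha: blocks_x * blocks_y flags, row-major, true if every
 *             texel in that block has zero alpha (assumed precomputed).
 * replace:    output flags, same layout.
 */
void mark_constant_blocks(const bool *zero_alpha, bool *replace,
                          int blocks_x, int blocks_y)
{
    for (int y = 0; y < blocks_y; y++) {
        for (int x = 0; x < blocks_x; x++) {
            bool ok = zero_alpha[y * blocks_x + x];

            /* Scan the 3x3 neighborhood, stopping at the first failure. */
            for (int dy = -1; dy <= 1 && ok; dy++) {
                for (int dx = -1; dx <= 1 && ok; dx++) {
                    int nx = x + dx;
                    int ny = y + dy;

                    /* Assumption: off-image neighbors do not block replacement. */
                    if (nx < 0 || ny < 0 || nx >= blocks_x || ny >= blocks_y) {
                        continue;
                    }
                    ok = zero_alpha[ny * blocks_x + nx];
                }
            }
            replace[y * blocks_x + x] = ok;
        }
    }
}
```

Blocks that border any visible content are left untouched, so the edge extrude around each sprite is only trimmed where it cannot affect filtering of visible texels.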
Original image:
Original image ignoring transparency:
Compressed image, ignoring transparency, after swapping out candidate zero alpha blocks with constant color blocks:
For our previous test texture, replacing a typical opaque edge extrude with constant color blocks reduces the size of the zipped compressed texture data by up to 20 percent.
Block Size   .astc (KB)   Old .astc.bz2 (KB)   New .astc.bz2 (KB)   Reduction in .bz2 size
4x4          684          602                  482                  19.9%
5x5          438          386                  318                  17.5%
6x6          304          270                  224                  17.1%
8x8          172          144                  125                  13.2%
The sprite-sheet RDO functionality is automatically enabled when using the existing -a <radius> option to alpha-weight color encodings.
To learn more about the ASTC format and how best to use it please check out our ASTC guide.
Find astcenc on GitHub: https://github.com/ARM-software/astc-encoder