What performance benefits do single-channel textures offer?

I know that a single-channel texture, such as R8, is smaller than an ordinary RGB24 or RGBA32 texture at the same resolution, say 256x256 pixels. My question is whether using a single channel, where possible, also helps to increase the texture cache hit rate. Are single-channel compressed texture formats well supported on mobile GPUs with OpenGL ES 3.0 and above? Besides these, what other benefits do single-channel textures offer? I would really appreciate any help. Thanks.
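
For reference, the size difference mentioned above can be put into numbers with a small sketch. It assumes tightly packed texels, counts only the base mip level, and ignores any padding a driver may add; the helper name `texture_bytes` is made up for illustration.

```python
# Bytes per texel for the uncompressed formats named in the question.
BYTES_PER_TEXEL = {"R8": 1, "RGB24": 3, "RGBA32": 4}


def texture_bytes(width, height, bytes_per_texel):
    """Size of the base mip level only, assuming no row or texel padding."""
    return width * height * bytes_per_texel


# Sizes for the 256x256 case from the question.
sizes = {name: texture_bytes(256, 256, bpt) for name, bpt in BYTES_PER_TEXEL.items()}

for name, size in sizes.items():
    print(f"{name}: {size // 1024} KiB")
```

Under these assumptions the R8 texture is a quarter of the size of the RGBA32 one, so the same amount of memory-side cache covers four times as many texels.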

  • Hi!

    Texture compression can be applied to single channel textures. Take a look at our ASTC guide which explains the supported formats:
    https://developer.arm.com/documentation/102162/0002/ASTC-format-overview

    You can also read about texture sampling best practices in our Mali Best Practices guide:

    developer.arm.com/.../Texture-sampling-performance

  • Hi Pavel, 

    The benefits here really depend on the format you are using. Mali has a couple of levels of texture cache. The outer levels are purely memory based so the most compact in-memory format will cache better. The inner levels are post-decompression, so how things cache depends on the decompressed format.

    Formats that are explicitly single component (uncompressed R8, R11_EAC) can be decompressed and stored as single-component textures, so for these you will get better hit rates in the post-decompression cache. However, they are typically larger in memory than the equivalent ASTC format for the same image quality. The other downsides are that you get a much more limited choice of formats and bitrates, and that R11 has lower quality than ASTC.

    ASTC is an odd one here because the format is notionally always 4 component. Image quality for luminance textures is higher, because the compressor can choose more efficient endpoints for luminance blocks, so you can generally beat R11_EAC on both image quality and bitrate. However, when decompressed it is still going to decompress into 4 components, because the number of endpoints is a per-block choice and the decompressor cannot guarantee that the whole image is single channel. For ASTC, you'll generally get a better hit rate in the outer caches (because you can choose a lower bitrate for the same quality as the equivalent ETC2 format) but a lower hit rate in the innermost caches. Note that GPUs are designed to hide misses in the cache, so you probably won't see much practical difference here. The most important thing for ASTC is to make sure you are using the decode_mode extensions where possible to allow decompression into 8-bit per channel rather than 16-bit (or use sRGB, which is 8-bit by default).

    HTH, 
    Pete
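
The two cache levels described in the reply above can be compared with some rough arithmetic. This is only a sketch: the compressed figures follow from the block sizes (64-bit blocks for R11_EAC, 128-bit blocks for ASTC), while the decoded figures simply assume one 8-bit component for R8 and four components at 8 or 16 bits for ASTC, as the reply describes. How a particular GPU actually lays texels out in its inner cache is a hardware detail not covered here, and the names below are made up for illustration.

```python
def block_bits_per_texel(block_bits, block_w, block_h):
    """Bitrate of a block-compressed format, in bits per texel."""
    return block_bits / (block_w * block_h)


# In-memory footprint: what the outer, memory-based cache levels hold.
in_memory_bpt = {
    "R8": 8.0,                                     # uncompressed
    "R11_EAC": block_bits_per_texel(64, 4, 4),     # 64-bit block, 4x4 texels
    "ASTC 4x4": block_bits_per_texel(128, 4, 4),   # ASTC blocks are always 128 bits
    "ASTC 6x6": block_bits_per_texel(128, 6, 6),
    "ASTC 8x8": block_bits_per_texel(128, 8, 8),
}

# Decoded footprint: an assumed model of the post-decompression cache level.
decoded_bpt = {
    "R8": 1 * 8,                   # single component, 8-bit
    "ASTC, 8-bit decode": 4 * 8,   # four components, decode mode or sRGB
    "ASTC, 16-bit decode": 4 * 16, # four components, default precision
}

for name, bpt in in_memory_bpt.items():
    print(f"in memory  {name:<20} {bpt:5.2f} bits/texel")
for name, bpt in decoded_bpt.items():
    print(f"decoded    {name:<20} {bpt:5.2f} bits/texel")
```

Read this way, ASTC can go well below R8 and R11_EAC in memory by picking a larger block size, while its decoded texels are four to eight times larger than a decoded R8 texel. The 8-bit decode path is, as far as I know, selected through `GL_EXT_texture_compression_astc_decode_mode`; check the extension specification for the exact texture parameter before relying on it.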