Are there practical examples of half-float (FP16)?

Greetings,

After reading “PHENOMENAL COSMIC POWERS! Itty-bitty living space!” by edplowman, I'm wondering how the FP16 type can actually be used.

In the ARMv7 and ARMv8 architecture manuals, the only instructions I found that refer to half-precision floating-point values are VCVT (ARMv7) and FCVT (ARMv8).

So, my questions are:

Can the CPU do anything with half-precision floating-point values besides converting them? Can you add, subtract, multiply or divide them natively?
How do you use half-precision floating-point values efficiently with OpenGL? Do you do all the operations with single-precision floats and convert before sending the data to the GPU?
Is there any example showing how to use this data type efficiently?

0 daith over 8 years ago

At the moment, the most you can do on the CPU is store values as FP16 to save space, and not worry much about the cost of the conversion. Half floats are useful in AI-type applications and in graphics, where high bandwidth is often required but high accuracy is not. Graphics units support them more extensively, and a later version of ARMv8 will also add support for calculations using them:
ARMv8-A architecture evolution
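
To put a number on the space saving, here is a minimal Python sketch (not from the original posts) that stores values as IEEE 754 half floats and converts them back, using the `e` format code of the standard `struct` module. The helper names and the sample color are illustrative assumptions.

```python
import struct

def to_fp16_bytes(values):
    """Pack an iterable of Python floats as little-endian IEEE 754 half floats."""
    values = list(values)
    return struct.pack(f"<{len(values)}e", *values)

def from_fp16_bytes(data):
    """Unpack little-endian half floats back into Python floats."""
    count = len(data) // 2  # each half float occupies 2 bytes
    return list(struct.unpack(f"<{count}e", data))

# Hypothetical sample data: four color channels in the 0..1 range.
color = [0.2, 0.4, 0.6, 1.0]
half_blob = to_fp16_bytes(color)            # 2 bytes per channel
single_blob = struct.pack("<4f", *color)    # 4 bytes per channel, for comparison
restored = from_fp16_bytes(half_blob)       # close to, but not exactly, the input
```

The half-float blob is half the size of the single-precision one, and for values in the 0..1 range the round-trip error stays well below one step of an 8-bit color channel.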

0 Myy over 8 years ago in reply to daith

Oh, interesting! I guess that the first ARMv8-A boards will be released during the first quarter of 2017 then?
Meanwhile, I wonder if there are OpenGL examples, compiled for ARM architectures, that use half floats for texture coordinates.
I'd like to try using fp16 for texture coordinates, since fp32 seems overkill. However, I don't know how to define the data type with GCC.
While grepping the GCC source code, I found the -mfp16-format option, the __fp16 type and the float16x4_t type. Should I use those types in data structures containing UV coordinates?
It seems that GCC only understands the __fp16 type when the -mfp16-format=ieee option is used, but this option seems to work only with the ARMv7 version of the compiler, not with the AArch64 one.

0 Peter Harris over 8 years ago in reply to Myy

For most graphics use cases you generally don't need to process bulk data at all on the CPU; e.g. vertex attribute data tends to be exported from the content creation tools in fp16 as part of the application build, and then can be copied directly into a vertex buffer object without any processing during level load. By pushing down-conversion to asset creation time, that means you also save download bandwidth and install size, which your users will be grateful for too!
For data you touch regularly on the CPU, such as uniform matrices, it's likely that you're dealing with positional data and need higher precision than fp16 anyway.
“Meanwhile, I wonder if there are OpenGL examples, compiled for ARM architectures, that use half floats for texture coordinates.”
For most real textures fp16 coordinates are not precise enough, especially on larger screen sizes. You generally want enough precision to cope with (1) texture UV coordinate wrapping for tiled textures and (2) about 16 sub-pixel divisions for good quality filtering - fp16 simply runs out of bits long before that ...
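
As a rough illustration of that precision budget, the following Python sketch (an addition, with hypothetical function names) estimates how many distinct fp16 values fit inside one texel near a given coordinate. It assumes IEEE 754 binary16 with 10 explicit fraction bits and only handles non-zero values in the normal range.

```python
import math

FP16_FRACTION_BITS = 10  # explicit fraction bits in IEEE 754 binary16

def fp16_spacing(x):
    """Gap between adjacent fp16 values around x (non-zero, normal range only)."""
    # frexp returns abs(x) = mantissa * 2**exponent with 0.5 <= mantissa < 1.
    _, exponent = math.frexp(abs(x))
    return 2.0 ** (exponent - 1 - FP16_FRACTION_BITS)

def steps_per_texel(u, texture_size):
    """How many distinct fp16 coordinates fit inside one texel near u."""
    texel_width = 1.0 / texture_size
    return texel_width / fp16_spacing(u)
```

Under these assumptions, `steps_per_texel(0.75, 1024)` gives 2.0 and `steps_per_texel(0.75, 256)` gives 8.0, both short of the roughly 16 sub-pixel divisions mentioned above. With a tiled coordinate such as `steps_per_texel(3.5, 1024)` the result drops to 0.5, so neighbouring texels can no longer be told apart.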
In general, for anything related to position (texture coordinates, vertex positions, uniform matrices for position transforms, distance computations for lighting, etc.) we'd recommend using highp/fp32. For anything related to color, or for intermediate values that will turn into a color at some point (such as normals for lighting), fp16 is probably fine.
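
One way to apply this split at asset build time is sketched below in Python. This is not taken from any real tool: the vertex layout, the names and the padding choice are assumptions made for illustration. Positions and UVs stay fp32, while normals and colors are packed as fp16.

```python
import struct

# Assumed interleaved layout, little-endian, no implicit padding:
#   position: 3 x fp32 (12 bytes)
#   uv:       2 x fp32 (8 bytes)
#   normal:   3 x fp16 plus 1 unused fp16 to keep 4-byte alignment (8 bytes)
#   color:    4 x fp16 (8 bytes)
VERTEX_FORMAT = "<3f2f4e4e"
VERTEX_STRIDE = struct.calcsize(VERTEX_FORMAT)  # bytes per vertex

def pack_vertex(position, uv, normal, color):
    """Pack one vertex into bytes ready to be appended to a vertex buffer."""
    return struct.pack(VERTEX_FORMAT, *position, *uv, *normal, 0.0, *color)

def pack_mesh(vertices):
    """Pack a list of (position, uv, normal, color) tuples into one blob."""
    return b"".join(pack_vertex(*vertex) for vertex in vertices)
```

With this layout a vertex takes 36 bytes instead of the 48 bytes an all-fp32 version of the same attributes would need. On the GL side, the fp16 attributes would presumably be described with the half-float attribute type (GL_HALF_FLOAT in OpenGL ES 3.0) when setting up the vertex attribute pointers, so the blob can be uploaded without any CPU-side conversion.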

0 Myy over 8 years ago in reply to Peter Harris
“You generally want enough precision to cope with (1) texture UV coordinate wrapping for tiled textures and (2) about 16 sub-pixel divisions for good quality filtering - fp16 simply runs out of bits long before that ...”
Does it affect automatic filtering (GL_SAMPLES, GL_LINEAR_MIPMAP_LINEAR, anisotropic extensions) or only hand-written filtering algorithms?
I mean, can the visual quality of some applied textures be improved just by setting precision highp float; instead of precision mediump float; in the fragment shader and sending fp32 coordinates?
Meanwhile, is the basic rule:
asset used 'as-is' → fp16
asset used in computations → fp32?

0 Peter Harris over 8 years ago in reply to Myy

“Does it affect automatic filtering (GL_SAMPLES, GL_LINEAR_MIPMAP_LINEAR, anisotropic extensions) or only hand-written filtering algorithms?”
It will affect everything; it's just a problem with quantization causing less accurate sample points with higher floating point values (as the exponent gets bigger you get fewer and fewer decimal places).
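
The effect of the exponent can be seen with a small Python sketch (added for illustration, with hypothetical names) that rounds a coordinate to the nearest fp16 value using the standard `struct` module and measures the error. The sample values keep the same fractional part while the integer part grows, as happens with a wrapped UV.

```python
import struct

def round_to_fp16(x):
    """Round a Python float to the nearest fp16 value and return it as a float."""
    return struct.unpack("<e", struct.pack("<e", x))[0]

def quantization_error(x):
    """Absolute error introduced by storing x as fp16."""
    return abs(round_to_fp16(x) - x)

# Same fractional part, increasing integer part, as with a tiled (wrapped) UV.
samples = [0.3, 1.3, 4.3, 16.3]
errors = [quantization_error(u) for u in samples]
```

Each step up in magnitude costs fraction bits: the error for 16.3 is roughly 64 times the error for 0.3, even though both coordinates point at the same place inside a tile.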
“I mean, can the visual quality of some applied textures be improved just by setting precision highp float; instead of precision mediump float; in the fragment shader and sending fp32 coordinates?”
Potentially yes; it depends on how the texture is being used (do you have UV wrapping?) and on the size of the texture (bigger texture = more pixels to cover with the same 0-1 number range, so effectively fewer bits per pixel). The driver can help automatically here (we know which inputs are used as texture coordinates), so we can prevent the worst of the issues without the application changing anything.

0 Myy over 8 years ago in reply to Peter Harris

Thanks for these clarifications!
That said, what would be the general best practices for good CPU←→GPU bandwidth usage while retaining enough quality?
FP32 (highp) for close-range, high-detail assets and FP16 (mediump) for landscape and random filler decoration, I guess?

0 Peter Harris over 8 years ago in reply to Myy

See previous answer:

“In general, for anything related to position (texture coordinates, vertex positions, uniform matrices for position transforms, distance computations for lighting, etc.) we'd recommend using highp/fp32. For anything related to color, or for intermediate values that will turn into a color at some point (such as normals for lighting), fp16 is probably fine.”
=)

0 Myy over 8 years ago in reply to Peter Harris

Alright then