Benchmarking floating point precision in mobile GPUs - Part III


This blog was originally published on 29th August 2013.

Welcome to Part III of my blog series on GPU floating-point quality! This series was inspired by some interesting work by Stuart Russell over at Youi Labs™, exploring differences in behaviour of various mobile GPUs. In Part I, I described floating-point formats, and analysed Stuart’s test shader. We used that to tell us how many bits of floating-point precision a GPU’s fragment shader has. In Part II, we explored methods of floating-point rounding, and used the results to find out which GPUs were taking the easy (round-to-zero) way out, and which were using the more accurate round-to-nearest-even. In this last blog, we’ll look at another oddity of floating-point – the hole around zero – and see which mobile GPUs are designed to avoid falling into it.

A key premise of this series is that using floating-point arithmetic is tricky. Floating-point numbers behave a lot, but not exactly, like the numbers you learned about in school. If you ignore the differences, your code will work most of the time; but every so often, it will bite you. If you’re going to use floating-point for anything serious, you’re going to have to learn how it works in more detail than most – how shall I put this? – most normal people would want to. So, we’ll start this journey by talking about what the zero hole is, and how the IEEE-754 specification deals with it. Then, we’ll write a fragment shader that lets us visualize it.

More detail than you really wanted, part 3

Throughout this series, we’ve been using a generic floating-point format similar to IEEE-754’s binary32 (which is what most people mean when they say “floating-point”). The format describes a number using a sign bit n, an eight-bit exponent E, and a twenty-four-bit significand (1.sss...). The value of such a number is

$(-1)^n \times 2^E \times 1.sssssssssssssssssssssss$

where E ranges (logically) from -126 to +127. The significand is a fixed-point binary number with 23 bits of fractional precision, and an implicit ‘one’ bit in the one’s place. In the first two blogs, we looked at what happens when you add numbers using this format, especially when those numbers differ in size. Today we’ll look at something much more basic: What is the set of values this format can represent?
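As a rough illustration of this layout, the sketch below pulls the three fields out of a binary32 value in Python and rebuilds the value from the formula above. The function names `decode_binary32` and `normal_value` are made up for this example, and it assumes the standard binary32 encoding, in which the exponent is stored with a bias of 127.

```python
import struct

def decode_binary32(x):
    """Return (n, E, fraction) for x rounded to binary32.

    n is the sign bit, E the logical (unbiased) exponent, and fraction
    the 23 bits after the binary point. Only meaningful for normal numbers.
    """
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    n = bits >> 31
    stored_exponent = (bits >> 23) & 0xFF   # stored with a bias of 127
    fraction = bits & 0x7FFFFF
    return n, stored_exponent - 127, fraction

def normal_value(n, E, fraction):
    """Evaluate (-1)^n * 2^E * 1.sss...s for a normal number."""
    return (-1.0) ** n * 2.0 ** E * (1.0 + fraction / 2.0 ** 23)

# -6.5 is -1.625 * 2^2, so we expect n = 1 and E = 2.
n, E, fraction = decode_binary32(-6.5)
print(n, E, format(fraction, "023b"))
print(normal_value(n, E, fraction))
```

The implicit leading one never appears in the stored bits; `normal_value` adds it back with the `1.0 +` term.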

To start with the obvious: For each value of E, there is a set of $2^{23}$ distinct positive values the number can take on. The numbers in each set are uniformly spaced along the real number line, at a distance of $2^{E-23}$ apart. So as E gets smaller and smaller, the numbers get closer and closer together, until E reaches its minimum value of -126. (That, by the way, is almost the defining property of floating-point numbers: the quantization error they introduce is roughly proportional to the size of the number. That means that the error is roughly constant in percentage terms, which is totally awesome. Well, to me, anyway.)

However, if you’ve been paying attention (and weren’t in on the joke already), you may have noticed something strange about the format as I’ve explained it so far: it has no way to represent the number zero! (The $2^E$ term is always greater than zero, and the significand is always between one and two, so the product can never be zero.) We’ve known at least since the time of al-Khwarizmi that zero is a very, very important number; so this just won’t do.
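The spacing rule can be checked numerically. The sketch below is only an illustration: `spacing_above` is a hypothetical helper that steps to the next binary32 bit pattern and measures the distance, which works for positive normal inputs that are themselves exactly representable.

```python
import struct

def f32_bits(x):
    """Bit pattern of x rounded to binary32, as an unsigned integer."""
    return struct.unpack("<I", struct.pack("<f", x))[0]

def f32_from_bits(bits):
    """Value of a binary32 bit pattern, returned as a Python float."""
    return struct.unpack("<f", struct.pack("<I", bits))[0]

def spacing_above(x):
    """Distance from a positive binary32 value x to the next larger one."""
    return f32_from_bits(f32_bits(x) + 1) - x

# The absolute spacing shrinks with E, but the relative spacing stays put.
for E in (0, -10, -126):
    x = 2.0 ** E
    gap = spacing_above(x)
    print(f"E = {E:5d}  spacing = {gap:.3e}  spacing / x = {gap / x:.3e}")
```

The last column is the "constant in percentage terms" property: the ratio of spacing to value is the same at every exponent.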

The solution to this little problem also explains another odd feature of our number format. The exponent range is from -126 to +127, which is only 254 values, but our eight-bit exponent can represent 256 values. What are the other two doing? The answer is, they’re serving as flags to indicate values that can’t be represented in the usual way, such as zero, infinity, or the ever-popular NaN (Not a Number). So for example, we can say that when E has the logical value -127, the value of the number is zero.

So far, so good, but look where we’ve ended up. The space between representable numbers gets steadily smaller as E decreases, until we reach the very smallest positive number we can represent: $2^{-126}$, which is

$(-1)^0 \times 2^{-126} \times 1.00000000000000000000000$

or roughly $1.175 \times 10^{-38}$. This is a very small number, but it isn’t zero; in fact, there are an infinite number of real numbers between this number and zero. The next largest number we can represent is

$(-1)^0 \times 2^{-126} \times 1.00000000000000000000001$

Notice that the distance between these two numbers is $2^{-149}$, which is way smaller than $2^{-126}$. Let me put that another way: the distance between zero and the smallest positive number we can represent is eight million times bigger than the distance between that number and the next smallest number. To help visualize what this looks like, imagine a really primitive floating-point format with only four bits of fractional precision in the significand, and a minimum exponent of -4. If we plot the spacing between representable values, it looks like this:

See the huge gap between zero and the smallest representable positive number, compared to the spacing of the numbers above it? That’s the zero hole. With a 24-bit significand, it’s much worse.
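To put numbers on "much worse", the ratio of the gap at zero to the spacing just above it can be worked out for both formats. This only restates the figures already given in the text:

```latex
\[
\text{toy format:}\quad \frac{2^{-4}}{2^{-4-4}} = 2^{4} = 16
\qquad\qquad
\text{binary32:}\quad \frac{2^{-126}}{2^{-126-23}} = \frac{2^{-126}}{2^{-149}} = 2^{23} = 8\,388\,608
\]
```

In both cases the ratio is two raised to the number of fraction bits, which is where the "eight million" comes from.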

Who cares?

OK, so there’s a hole around zero. Does it matter? Well, it depends what you’re using floating-point for; but this blog (like Stuart’s) is all about what happens when you’re doing fairly extreme stuff. And it turns out that if you don’t do something about it, things can get weird. We tend to take it on faith that numbers in a computer behave pretty much like numbers as we learned them in school, so when they don’t, it’s upsetting. As an awkward teenager, I took great comfort from the fact that there were certain truths I could rely on; for example, given two numbers A and B, if A minus B is equal to zero, then A is equal to B. Sadly, with the format we’ve been discussing, that isn’t even close to true.
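Here is a small sketch of how that comforting truth breaks. It models a hypothetical machine that replaces every result smaller than the minimum normal value with zero; `subtract_flush_to_zero` is an invented name, and the subtraction is done in Python's double precision before rounding to binary32, which is exact for the inputs used here.

```python
import struct

MIN_NORMAL = 2.0 ** -126   # smallest positive normal binary32 value

def to_f32(x):
    """Round a Python float to the nearest binary32 value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

def subtract_flush_to_zero(a, b):
    """a - b on a model machine that turns any result below MIN_NORMAL into zero."""
    d = to_f32(a - b)
    return 0.0 if abs(d) < MIN_NORMAL else d

a = 1.5 * MIN_NORMAL
b = 1.0 * MIN_NORMAL
print("a == b:    ", a == b)
print("a - b == 0:", subtract_flush_to_zero(a, b) == 0.0)
```

Both inputs are perfectly ordinary normal numbers, yet their difference, half of `MIN_NORMAL`, lands in the hole and comes back as zero.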

It was this sort of consideration that led the IEEE-754 committee, after long debate, to require a solution to the problem: denormalized or subnormal numbers. The idea is beautifully simple. We already have a special exponent value to represent zero. Suppose that when the exponent has that value, instead of the usual formula

value = $(-1)^n \times 2^E \times 1.sssssssssssssssssssssss$

we use

value = $(-1)^n \times 2^{-126} \times 0.sssssssssssssssssssssss$
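The second formula can be tried out directly. In this sketch, `subnormal_value` is a hypothetical helper that evaluates it, and the results are compared against the values Python decodes from the matching bit patterns, assuming the standard binary32 encoding in which an all-zero exponent field is the flag.

```python
import struct

def f32_from_bits(bits):
    """Value of a binary32 bit pattern, returned as a Python float."""
    return struct.unpack("<f", struct.pack("<I", bits))[0]

def subnormal_value(n, fraction):
    """Evaluate (-1)^n * 2^-126 * 0.sss...s for a 23-bit fraction."""
    return (-1.0) ** n * 2.0 ** -126 * (fraction / 2.0 ** 23)

smallest = subnormal_value(0, 1)           # 0.000...001
largest = subnormal_value(0, 0x7FFFFF)     # 0.111...111

print(smallest == f32_from_bits(0x00000001))
print(largest == f32_from_bits(0x007FFFFF))
print(subnormal_value(0, 0))               # an all-zero fraction is plain zero
```

Zero needs no special case of its own under this scheme: it is simply the subnormal whose fraction bits are all zero.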

For the primitive four-bit format we looked at before, the set of representable values now looks like this:

The zero hole is gone! The space between representable values never increases as we approach zero, A minus B is zero if and only if A equals B, and all’s right with the world!
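The primitive four-bit format is small enough to enumerate. The sketch below lists its non-negative values with and without subnormals and compares the gaps nearest zero. The function names and the upper exponent limit of -2 are assumptions made for the illustration; the text only fixes the four fraction bits and the minimum exponent of -4.

```python
def toy_values(subnormals, fraction_bits=4, min_exp=-4, max_exp=-2):
    """List the non-negative values of the toy format in increasing order."""
    steps = 2 ** fraction_bits
    values = [0.0]
    if subnormals:
        # Flag exponent: 2^min_exp * 0.ssss
        values += [2.0 ** min_exp * (f / steps) for f in range(1, steps)]
    for E in range(min_exp, max_exp + 1):
        # Normal numbers: 2^E * 1.ssss
        values += [2.0 ** E * (1 + f / steps) for f in range(steps)]
    return values

def gaps(values):
    """Differences between neighbouring values."""
    return [b - a for a, b in zip(values, values[1:])]

for subnormals in (False, True):
    g = gaps(toy_values(subnormals))
    print(f"subnormals={subnormals}: first gap {g[0]}, second gap {g[1]}")
```

Without subnormals the first gap is sixteen times the second; with them, the gaps never grow as you move toward zero.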

Hunting for Holes

Of course, filling the zero hole isn’t free; that’s why the IEEE-754 committee argued about it. And for some applications, you can engineer your way around the zero hole. So it’s no surprise that many special-purpose processors (such as DSPs) don’t implement denormal arithmetic. What about GPUs? Can we tell which GPUs follow the IEEE standard fully, and which take the short cut? More importantly, is there a fun way to do it?

Here’s a fragment shader that does something like what we want:

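// Denormalization detection shader
precision highp float;
uniform vec2 resolution;   // viewport size in pixels
uniform float minexp;      // first exponent in the range
uniform float maxexp;      // last exponent in the range
void main( void ) {
    float y = (gl_FragCoord.y / resolution.y) * (maxexp - minexp);   // bar index plus position within the bar
    float x = (1.0 - (gl_FragCoord.x / resolution.x));               // grey ramp: ~1.0 at left, ~0.0 at right
    float row = floor(y) + minexp;                                   // exponent E for this bar
    for (float c = 0.0; c < row; c = c + 1.0) x = x / 2.0;           // halve E times; may underflow
    for (float c = 0.0; c < row; c = c + 1.0) x = x * 2.0;           // double E times; cannot undo an underflow
    gl_FragColor = vec4(vec3(x), 1.0);                               // normal case: grey
    if (x == 0.0) gl_FragColor = vec4(1.0, 0.0, 0.0, 1.0);           // underflowed to zero: red
    if (fract(y) > 0.9) gl_FragColor = vec4(0.0, 0.0, 0.0, 1.0);     // black separator between bars
}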

What does this do? If you’ve looked at Stuart Russell’s shader, used in the two previous blogs, this will look familiar. The shader is executed at every pixel on the screen. The first line (variable y) divides the image into horizontal bars, each corresponding to an exponent value in the range from minexp to maxexp. The second line (variable x) computes an intensity value that varies linearly from nearly 1.0 (white) at the left edge of the image, to nearly 0.0 (black) at the right edge. Line 3 determines the exponent corresponding to the bar our pixel is in.

The fun happens in lines 4 and 5. Line 4 divides the grey value by 2 E times, where E is the exponent computed in line 3. Line 5 then multiplies it by 2 E times. In the world of mathematics, this would bring it back to its original value. But floating-point doesn’t work that way. If E is large enough, line 4 will cause the grey value to underflow (go to zero), after which, line 5 won’t be able to restore it.

The last three lines translate the value into a color we can see. Normally, we return the grey value; but if the value underflows to zero, we return red to show that something bad happened. Finally, we draw a thin black line between each bar, to make it easy to count and see which bar we’re in.
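For readers without a device to hand, the two loops can be imitated on the CPU. The following Python sketch is a model, not the shader itself: `shade` and the `flush_subnormals` switch are invented for this illustration, every intermediate result is rounded to binary32 with the `struct` module, and a GPU without subnormal support is approximated by replacing anything below the minimum normal value with zero.

```python
import struct

MIN_NORMAL = 2.0 ** -126   # smallest positive normal binary32 value

def to_f32(x, flush_subnormals):
    """Round to binary32; optionally flush subnormal results to zero."""
    y = struct.unpack("<f", struct.pack("<f", x))[0]
    if flush_subnormals and abs(y) < MIN_NORMAL:
        return 0.0
    return y

def shade(x, row, flush_subnormals):
    """Mimic the two loops for one pixel.

    Returns the final grey value, or the string 'red' if it ended up as zero.
    """
    x = to_f32(x, flush_subnormals)
    for _ in range(row):                      # halve `row` times
        x = to_f32(x / 2.0, flush_subnormals)
    for _ in range(row):                      # double `row` times
        x = to_f32(x * 2.0, flush_subnormals)
    return "red" if x == 0.0 else x

for row in (120, 130, 140, 152):
    print(row, shade(0.7, row, True), shade(0.7, row, False))
```

With subnormals enabled, the grey value survives well past the point where the flushing model turns red, but it comes back slightly changed once bits have been shifted off the bottom of the significand. That is the gradual underflow the shader is designed to make visible.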

The New (de)Normal

So what do we see, when we run this shader on a mobile GPU? Figure 1 below shows the output on an Ascend D1 smartphone using a Vivante GC4000 GPU, and on an iPad 4 using the Imagination SGX 554. Here I’ve set minexp and maxexp so that the exponent runs from -120 to -152, spanning the minimum FP32 exponent of -126. Remember, red regions correspond to numbers that the GPU cannot distinguish from zero. On these GPUs, what we see is that the grey value varies smoothly as long as the floating-point exponent is -126 or greater. When it reaches -127, the value suddenly underflows – it has fallen into the zero hole. These GPUs do not support subnormal values. Any value less than $2^{-126}$ is zero, as far as they are concerned.

Figure 2 shows the result on a Nexus 10 tablet, using ARM’s Mali™-T604. Here, instead of maintaining full precision down to $2^{-126}$ and then falling suddenly to zero, the grey value maintains full precision down to $2^{-126}$ and then underflows gradually, giving up precision a bit at a time down to a value of $2^{-149}$. The Mali-T604 supports subnormals. It can represent an additional eight million non-zero values between zero and $2^{-126}$.

We’ve run this shader on a lot of GPUs, and found that support for denormals is rare. In fact, as far as we know, the Mali Midgard™ series is the only mobile GPU family that offers it. But as GPU computing becomes more and more important, and we move into the world of heterogeneous computing, it will be essential that computations on the GPU give the same results as on the CPU. When that day comes, we’ll be ready – and that day is right around the corner. We’re proud that the Mali-T604 and its successors are taking the lead in offering the highest quality floating-point available in modern GPUs.

What next?

We could have tons more fun investigating floating point behaviour in GPUs. Do they follow IEEE 754 precision requirements for operations? How do they handle NaNs and infinities? (For example, what is the result of a divide by zero?) And how good are they, really, at evaluating transcendental functions? We’re confident that the Mali Midgard GPUs would do very well in that sort of competition; after all, they are the only mobile GPUs that pass the vicious precision requirements of full profile OpenCL. (How vicious? Hint: your desktop CPU with standard C math libraries would fail miserably.)

But these questions will have to wait; three blogs in a row on precision is about all I can stand. There’s lots of other fun stuff going on in the wake of SIGGRAPH - notably the release of Samsung’s new Exynos 5 Octa, featuring a screaming fast Mali-T628 MP6. And there’s some technology news in the works that we’re pretty excited about, starting with the Forward Pixel Kill technique described in Sean Ellis’s recent blog. So we’ll come back to precision one day, but for now, so long, and thanks for all the bits…


Comments from the original blog post:


    30 August 2013 - 03:30 PM

    This was a very interesting and informative read. Certainly this must be basic information to you, but I found it tremendously helpful to understand the single-precision float format in terms of its limitations a bit more. This will certainly prove invaluable in the future, and I can now appreciate why the same arithmetic operations on different hardware can produce wildly different results! I have heard more than a few times from game developers that Mali is a very comfortable platform to work on, and it seems that the attention spent on implementation detail lends to this verdict.

    This is slightly off-topic, but I understand that the T6xx GPUs also support compliant double-precision floats in hardware. I do wonder what the benefit of implementing such a format in hardware is. I can only imagine that there is a die-size cost associated with implementing 64-bit floating point ALUs, and as cool as they are (and yes, I believe they are very cool indeed), I can imagine few use-cases that would actually require them for mobile applications (which I'm guessing is the most significant market for ARM's GPUs). I suppose my question is this: why implement something in hardware that will be extremely rarely used at the expense of facilities that will be commonly used? I hope this does not sound like a rude question, I am genuinely curious. Perhaps I am overestimating the size of such a feature implementation? Or is it that ARM's GPUs are planned to significantly push past mobile?

    Thanks again for the article, and any insight would be greatly appreciated.


    10 September 2013 - 02:11 PM
    Hi Sean,

    Thank you for your question, let me introduce myself. My name is Jakub Lamik and I work as Product Manager for Media Processing Division and look after Mali-T620 series.

    The double precision overhead is so small, around 1-2%, why wouldn't we build it in from the start in a new architecture? Or a more fundamental question is, why do people believe it would be costly? I think the big mindset difference comes from knowing you are going to do it so spending the time solving the challenges efficiently versus starting with the assumption it's costly then go looking for evidence to prove that it is. I see a lot of signs of this going on around the industry with other advanced floating-point features as well like sub-normals and proper rounding modes. Sure, lifted from the textbook they are all quite heavy, but done right they are not.

    And we knew we needed to do it. These architectures live a lot longer than you think and making fundamental changes along the way can be extremely costly, trust me I've tried. And from the inside we could see what was coming round the corner down the road so having the GPU be fully 64-bit ready was a no-brainer, and once you got that sorted double precision is almost a walk in the park.

    And we wanted to do it properly. This is about respect for developers. Other providers that are less able will serve you excuses why it’s not for you, and indeed double precision isn't for everything, but it can be a real time saver, not to mention life saver when you really need it. Respecting developers means you choose to stretch a little further to give developers these things, instead of cutting corners and paying lip service to specs. And DP is only one example. Tom's blogs have shown you the rest of the low level floating-point story but there's a lot more here as well and I think perhaps the most telling evidence is the fact that our Midgard architecture Mali GPUs have OpenCL Full Profile conformance. Completing OpenCL Full Profile conformance test is extremely challenging but it's probably the broadest single way you show you've done the work and provided the precision tools people can trust. Again what you need to do it isn't costly in area or power when done right, but the effort, thought and innovation that has to go into the design and verification of architecture, micro-architecture, mathematical algorithms, compilers and drivers to get there requires a level of dedication which I'm incredibly proud to say that we have.

    Let me know if you have any other questions.

    Regards,
    Jakub


    10 September 2013 - 03:12 PM
    Hi Jakub,

    Thanks for the detailed response, and nice to meet you!

    I can't really answer your first question because I don't know what it takes to design a chip, but I can try to answer your second: ignorance. It is no mystery that there has been a duel of words between some of the larger mobile GPU vendors (I will not name names) via blogging platforms, and occasionally things may be exaggerated in order to align readers with a particular viewpoint. One such case of this is around DP floats (which I admittedly don't use), where some very particular things were said regarding its usefulness on mobile. I'm sure you are aware of what I'm talking about.

    Having said that, it is why I'm so grateful for your response. I now understand that the die cost is not as substantial as I assumed, understand (and agree with) the motivation for product longevity from both the perspectives of the vendor and the developer, and as an engineer appreciate meticulous care in platform design -- certainly the decisions made will likely carry forward to subsequent designs. It seems like ARM has done great work with Midgard!

    Thanks again,