
Mali 450 precision issue: how to detect odd and even columns

Hi,

I'm working on an algorithm where I need to detect whether the current `gl_FragCoord` is in an even or odd row/column. A typical implementation looks something like `if (mod(gl_FragCoord.x, 2.0) < 1.0) { ... } else { ... }`. After running into issues with this approach, a quick Google search pointed me to some good information:

This forum post in particular is pretty much identical to the issue I'm looking into.

I've created this repository to experiment with this issue and to solve it. I'm testing on a MiBox MDZ-16-AB. The repository is set up for Android Studio: it creates a basic custom `GLSurfaceView` (`TestGlView`) that instantiates the `TestGlRenderer` class. In `TestGlRenderer` I create a simple filter shader that applies the `if (mod(..))` logic to draw different colors for odd/even columns. I create an FBO with a size of 3840 x 2160 to expose floating-point precision issues; when rendering into an FBO whose color attachment is a 1920 x 1080 texture, the issue is much less pronounced.
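For reference, here is a minimal GLSL ES sketch of such a parity shader (my own illustration, not the exact code from the repository):

```glsl
// Naive odd/even column test. This works on FP32 hardware, but the
// Mali-400/450 fragment core performs all arithmetic at FP16, so large
// gl_FragCoord.x values lose their low bits and the test breaks down.
precision highp float; // requested, but fragment arithmetic stays FP16 on Mali-400 series

void main() {
    if (mod(gl_FragCoord.x, 2.0) < 1.0) {
        gl_FragColor = vec4(1.0, 0.0, 0.0, 1.0); // even column: red
    } else {
        gl_FragColor = vec4(0.0, 0.0, 0.0, 1.0); // odd column: black
    }
}
```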

In the image below (3840 x 2160) you can clearly see the issue: it should show alternating vertical red and black lines from left to right.

When rendering at 1920 x 1080 things get a little better, but still not 100% correct.

In `onSurfaceCreated()` of the `TestGlRenderer` class I create an instance of `GlRenderToTexture`, a thin wrapper that creates an FBO with one texture attachment. In the code I've added a commented-out version that creates either a 1920 x 1080 or a 3840 x 2160 FBO.

Now I'm curious: what would be a workaround or solution that makes it possible to reliably distinguish between odd and even rows/columns?

Thanks

  • The Mali-400 series programmable core only supports FP16 precision in the arithmetic units, so anything using arithmetic to do this won't have enough precision.

    There is a higher precision path between the varying unit and the texturing unit. I'd recommend creating a texture-based lookup table which is the width of the framebuffer and 1 pixel high which alternates black and white pixels. Use a varying lookup to index into the texture using GL_NEAREST filtering, and then use the returned color to determine odd or even.
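    As a sketch of that approach (shader and identifiers are my own illustration, not from this thread): build a `framebufferWidth` x 1 texture of alternating black and white texels on the CPU, then sample it with the unmodified varying:

    ```glsl
    // Parity via a 1-texel-high lookup texture (alternating black/white).
    precision highp float;

    uniform sampler2D uLookup;   // framebufferWidth x 1, GL_NEAREST
    varying vec2 vTexCoord;      // runs 0..1 across the full-screen quad

    void main() {
        // No arithmetic on vTexCoord: the varying goes straight into the
        // texture unit, staying on the higher-precision varying->texturing path.
        float parity = texture2D(uLookup, vTexCoord).r;
        gl_FragColor = parity > 0.5 ? vec4(1.0, 0.0, 0.0, 1.0)
                                    : vec4(0.0, 0.0, 0.0, 1.0);
    }
    ```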

    Obviously this gets harder with non-fullscreen geometry, or geometry which moves, but I don't think we have a general purpose solution to this on the Mali-400 series once framebuffers get above a certain size.

    Hope that helps, 
    Pete

  • Thanks for your reply and the great suggestion. This was one of the two possible solutions I was considering; the other was some sort of tile-based rendering where I render / execute the fragment shader only on portions of the framebuffer, but I haven't yet researched whether that is a sound approach. Using a texture lookup with a 1-pixel-high texture seems to be the best alternative.

    Thanks

  • If you go for tiled framebuffers: with FP16 you get 10 bits of mantissa, so you should only be able to accurately address framebuffers up to 2^10 = 1024 pixels wide using gl_FragCoord.
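    To spell out the arithmetic (my own sketch, using the standard binary16 layout): a value x with 2^e <= x < 2^(e+1) is stored as 1.m x 2^e with a 10-bit mantissa m, so the gap between adjacent representable values is

    ```latex
    \mathrm{ulp}(x) = 2^{\,e-10}, \qquad 2^{e} \le x < 2^{e+1}
    ```

    i.e. the gap grows to 1 at x >= 1024 and to 2 at x >= 2048, beyond which consecutive integer pixel coordinates can no longer be distinguished at all.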

  • I'm experimenting with this using the public test repository I created, but the results I get are the same. At L23 I set the `highp` precision, and at L42 I fetch the pixels from the lookup table. The width of the lookup table is the same as the viewport/FBO into which I render (3840 x 2160).

    This is the result when I render into 1920 x 1080, which looks correct:

    But when I render into a 3840 x 2160 FBO and then render that to the default framebuffer, I get something similar to the issue above:

      

    It still seems like a precision issue, even though the texture lookups are done using a `highp` varying.

  • I assume you are downsampling from 2160p (offscreen) to 1080p (onscreen), so the downsample isn't going to be stable with GL_NEAREST filtering: you have twice as many texels as pixels, so the choice of black or white is going to be somewhat unstable.

    To verify that the 2160p render is doing the right thing I'd suggest a glReadPixels call just to snapshot the memory - it removes one moving part from the sequence.

  • Ah, of course. That makes total sense. The reason I was using GL_NEAREST was that GL_LINEAR gave me a grayish output. But you are totally right; I should use glReadPixels to download the result. Thanks, and sorry for the silly mistake.

  • Great, I can confirm that using a lookup texture works. 

    I've got another related question. Say I have a fragment shader that I use when rendering into a 3840 x 2160 texture via an FBO. In this shader I also want to sample from another texture with a size of 1920 x 1080, but I want to fetch the pixels around the current `gl_FragCoord`, taking the scaling into account.

    So let's say `gl_FragCoord` is (1920, 1080) (half the size of the FBO attachment texture); this means my texture coordinate has a value of (0.5, 0.5). When I want to sample the corresponding pixel from my smaller 1920 x 1080 texture I can simply use that texture coordinate, which fetches the pixel at (1920 * 0.5, 1080 * 0.5) => (960, 540). But what if I want to fetch one pixel to the left and one pixel to the right of it, i.e. (959, 540) and (961, 540)? I'm asking because I suspect that using arithmetic in the fragment shader won't allow me to do this because of the low precision. Am I right?

  • If you try to do this with arithmetic in the shader code, then you are correct. As soon as you do any arithmetic on a variable you'll drop to fp16 precision. As a workaround you could simply create multiple texture coordinate varyings, each offset by one pixel's worth of space, and pass each directly into a texture call.

    For a complex mesh that's going to be expensive in terms of varying memory bandwidth, but for a simple full screen quad I wouldn't expect that to be a major problem in terms of performance.

  • Thanks for the suggestion. Would that mean that in this case I need 3 varyings: one for sampling the center pixel, one for the left, and one for the right? And do you mean I would have to upload these varyings as vertex attributes? I'm wondering how the varyings for the left and right lookups could be constructed.

  • Yes, that's the idea.

    For normal texel centers your varyings are set to range from 0 to 1 across the triangle, which gives you samples at texel centers. If you know your texture is 1024 wide then you simply need to adjust the varyings to be offset by 1/1024 (subtract that to shift a sample left, and add it to shift a sample right).
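    A sketch of that setup (identifier names are mine; for the 1920-wide texture from the earlier question, uTexelWidth would be 1.0 / 1920.0):

    ```glsl
    // Vertex shader: compute the three offset coordinates here, so the
    // fragment shader never does coordinate arithmetic at FP16. The
    // offsets are one texel wide: uTexelWidth = 1.0 / textureWidth.
    attribute vec2 aPosition;
    attribute vec2 aTexCoord;

    uniform float uTexelWidth;   // e.g. 1.0 / 1024.0 for a 1024-wide texture

    varying vec2 vLeft;
    varying vec2 vCenter;
    varying vec2 vRight;

    void main() {
        vCenter = aTexCoord;
        vLeft   = aTexCoord - vec2(uTexelWidth, 0.0);
        vRight  = aTexCoord + vec2(uTexelWidth, 0.0);
        gl_Position = vec4(aPosition, 0.0, 1.0);
    }
    ```

    In the fragment shader, pass `vLeft`, `vCenter` and `vRight` directly into `texture2D` calls without modifying them, so each lookup stays on the higher-precision varying path.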