At the moment I get an OOM error when sampling a GL_R32F texture using a sampler2D in a compute shader on a Samsung S6 (it works on a Tegra K1). I get this error:
GLES-MALI OOM error: execution failed (gles_drawp_call_finish at hardware/samsung_slsi/MaliT760_r5p0_05dev2_Istor/drivers/product/gles/src/draw/mali_gles_draw_internal.c:143)
Is this a known problem? (Or is the format the root cause of this at all?)
Regards,
Tom
Thanks for reporting - I will raise this with the engineering team. What sampler state are you using (wrap modes, filtering mode, etc.)?
Regards, Pete
Hi,
thanks for your swift reply! I use GL_LINEAR and GL_CLAMP_TO_EDGE. However, as I've investigated this problem further, there seems to be a problem with texture() in compute shaders with all formats.
I've got three cases with different behaviour using texture sampling in compute shaders on the Samsung S6:
* In the case of GL_RGBA8 textures, texelFetch() with sampler2D (treating the sampler2D as an image) works the same as imageLoad() on an image2D.
* In the case where I define GL_R32F with glTexStorage2D, texelFetch() does not work and gives me a vec4(0.0,0.0,0.0,0.0).
* texture() is even more erratic. In one shader I had a problem when the shader branched, that is, when for half the image the invocation returned early (did minimal processing) and stored the input value in an output image2D with imageStore(). When I only processed half of the image this way, I got erratic behaviour and stuttering on the parts that were processed and an empty result vec4(0.0,0.0,0.0,0.0) on the individual invocations that returned early. Using texture() on the input sampler2D worked properly when the whole image was processed and all the individual invocations did the same thing.
Maybe there is something obvious I'm missing here, but as I mentioned, using sampler2D and texture() within compute shaders works as expected on the Tegra K1.
To avoid all these problems I'm forced to use image2D in all compute shaders and implement bilinear interpolation for scaling images using image2Ds.
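For readers following along, the manual bilinear interpolation mentioned above can be sketched as a CPU-side reference in C. This is only an illustration, not the code used in this thread: the function names (`clamp_index`, `load_texel`, `sample_bilinear`) are hypothetical, the image is assumed to be a tightly packed single-channel float array (as a GL_R32F image would be), and texel centres are assumed to sit at (i + 0.5) / size with clamp-to-edge addressing. In a compute shader, `load_texel` would correspond to an imageLoad() on the image2D.

```c
#include <stddef.h>

/* Clamp a texel index into [0, size - 1], mimicking GL_CLAMP_TO_EDGE. */
static int clamp_index(int i, int size)
{
    if (i < 0) return 0;
    if (i >= size) return size - 1;
    return i;
}

/* Floor for floats without needing the maths library. */
static int floor_to_int(float v)
{
    int i = (int)v;                      /* truncates toward zero */
    return ((float)i > v) ? i - 1 : i;   /* step down for negative fractions */
}

/* Read one texel from a tightly packed single-channel float image. */
static float load_texel(const float *img, int w, int h, int x, int y)
{
    size_t row = (size_t)clamp_index(y, h);
    size_t col = (size_t)clamp_index(x, w);
    return img[row * (size_t)w + col];
}

/* Bilinear sample at normalized coordinates (u, v) in [0, 1]. */
float sample_bilinear(const float *img, int w, int h, float u, float v)
{
    /* Move to texel space, with texel centres at integer + 0.5. */
    float x = u * (float)w - 0.5f;
    float y = v * (float)h - 0.5f;

    int x0 = floor_to_int(x);
    int y0 = floor_to_int(y);
    float fx = x - (float)x0;            /* horizontal blend weight */
    float fy = y - (float)y0;            /* vertical blend weight */

    /* Blend the four neighbouring texels: first along x, then along y. */
    float top = (1.0f - fx) * load_texel(img, w, h, x0, y0)
              + fx * load_texel(img, w, h, x0 + 1, y0);
    float bottom = (1.0f - fx) * load_texel(img, w, h, x0, y0 + 1)
                 + fx * load_texel(img, w, h, x0 + 1, y0 + 1);
    return (1.0f - fy) * top + fy * bottom;
}
```

To scale a 480 x 97 source to 512 x 128, each destination texel (dx, dy) would call `sample_bilinear` with u = (dx + 0.5) / 512 and v = (dy + 0.5) / 128.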
Thanks - really useful feedback. One final question - how big are the textures in question? Is the behaviour the same irrespective of resolution? (just helps us rule out real OOM issues).
In the cases I've tested the textures are 480 x 97 and I'm trying to scale them to 512 x 128 using texture(), so hopefully I'm not really out of memory.
Keep up the good work on your good and super strict GLSL ES shader compiler!
Thanks - I'll let you know when we have more info from our side.
Pete
Hi Tom,
> In the case where I define GL_R32F with glTexStorage2D, texelFetch does not work and gives me a vec4(0.0,0.0,0.0,0.0).
Obtaining black is the standard response on Mali to an incomplete texture. The R32F format is not texture filterable in the OpenGL ES core specification (see the OpenGL ES 3.1 Spec, table 8.13), so only GL_NEAREST filtering is supported on Mali for this format type. Even though texelFetch is not filtered, there isn't really an exception from the completeness check for this case in the specification. In fact the completeness check is explicitly required; section 11.1.3.2 Texel Fetches in the 3.1 specification says:
The results of the texel fetch are undefined if any of the following conditions hold: <snip> the texture being accessed is not complete, as defined in section 8.16.
Hopefully if you swap this out to GL_NEAREST it should then work.
Our best guess is that the Tegra platform is supporting the OES_texture_float_linear extension, which we do not support currently.
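As a rough illustration of the rule described above, the filterability part of the completeness check can be modelled in plain C. This is a sketch under assumptions, not driver code: the enum and function names are hypothetical stand-ins for the GL filter enums so that it compiles without GL headers, and it covers only the filterability clause, not the other completeness conditions in section 8.16 (mipmap chains and so on).

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the GL min/mag filter enums. */
typedef enum {
    FILTER_NEAREST,
    FILTER_LINEAR,
    FILTER_NEAREST_MIPMAP_NEAREST,
    FILTER_LINEAR_MIPMAP_NEAREST,
    FILTER_NEAREST_MIPMAP_LINEAR,
    FILTER_LINEAR_MIPMAP_LINEAR
} filter_mode;

/*
 * Returns true when the filter state does not make the texture incomplete.
 * A format that is not texture-filterable (R32F in core OpenGL ES 3.1) only
 * passes with a NEAREST magnification filter and a NEAREST or
 * NEAREST_MIPMAP_NEAREST minification filter.
 */
bool filter_state_is_complete(bool format_is_filterable,
                              filter_mode min_filter,
                              filter_mode mag_filter)
{
    if (format_is_filterable) {
        return true;                     /* any filter mode is acceptable */
    }
    bool mag_ok = (mag_filter == FILTER_NEAREST);
    bool min_ok = (min_filter == FILTER_NEAREST ||
                   min_filter == FILTER_NEAREST_MIPMAP_NEAREST);
    return mag_ok && min_ok;
}
```

In this model the GL_LINEAR setup from the original report fails for R32F, while the same texture with both filters set to nearest passes. In real GL code the change would be made with glTexParameteri() on GL_TEXTURE_MIN_FILTER and GL_TEXTURE_MAG_FILTER, or with glSamplerParameteri() if a sampler object is bound.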
> texture() is even more erratic
Is it possible for you to share a cut-down reproducer, or at least the shader which is going wrong? It's hard to tell precisely what you are trying to do here from the description.
Hi Pete,
thanks for the good reference explaining the behaviour of texelFetch. What I referred to as erratic behaviour of sampling with texture() turned out to be related to texelFetches outside of an image's dimensions - so all is solved here! Thanks.
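The thread does not say how the out-of-range fetches arose. One common way, assumed here purely for illustration, is that the dispatch is rounded up to whole work groups, so some invocations fall outside the image. The sketch below uses hypothetical helper names and an assumed 16 x 16 local size with the 480 x 97 texture mentioned earlier.

```c
#include <stdbool.h>

/* Work groups needed to cover `size` texels with groups of `local_size` (rounds up). */
unsigned group_count(unsigned size, unsigned local_size)
{
    return (size + local_size - 1u) / local_size;
}

/* True when an invocation's global ID addresses a texel inside the image. */
bool in_bounds(unsigned gx, unsigned gy, unsigned width, unsigned height)
{
    return gx < width && gy < height;
}
```

With these numbers, `group_count(97, 16)` is 7, so the dispatch covers 112 rows and the invocations in rows 97 to 111 would fetch outside the image unless guarded. In a compute shader the equivalent guard would compare gl_GlobalInvocationID.xy against textureSize() or imageSize() and return early when the invocation is out of range.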
Great, glad you got it sorted.