Is this the right place to report problems with the GPU Driver?
I'm an engine developer and have just added GLES3 support to my engine. I've noticed that on an Android Galaxy Note 4 SM-N910U (ARM Mali-T760 MP6), alpha-tested shadows aren't working correctly. I suspect an OpenGL driver fault, as the same code works fine on iOS, Windows, Mac, Linux, etc.
Shadows that don't use alpha test (discard) work correctly, but those that use discard don't display at all.
The shaders for alpha-tested materials are below:
Vertex Shader:
#version 300 es
#ifdef GL_ES
#define LP lowp
#define MP mediump
#define HP highp
precision HP float;
precision HP int;
#else
#define LP
#define MP
#define HP
#endif
#if __VERSION__>=300
#define attribute in
#define varying out
#endif
varying vec4 GL_Tex0;
varying vec4 GL_Tex1;
varying vec4 GL_Tex2;
struct VS_PS{
vec3 _pos9;
vec3 _nrm2;
vec2 _tex3;
};
VS_PS _O1;
vec3 _TMP225;
vec4 _m0228[3];
vec4 _TMP344;
attribute vec4 ATTR0;
attribute vec4 ATTR3;
vec3 _TMP348;
vec3 _TMP349;
vec3 _TMP350;
vec3 _TMP351;
vec4 _TMP352;
vec4 _TMP353;
vec4 _TMP354;
vec4 _TMP355;
uniform vec4 ProjMatrix[4];
uniform vec4 ViewMatrix[180];
void main()
{
vec4 _O_vtx;
_m0228[0]=ViewMatrix[(3*gl_InstanceID+0)];
_m0228[1]=ViewMatrix[(3*gl_InstanceID+1)];
_m0228[2]=ViewMatrix[(3*gl_InstanceID+2)];
_TMP348.x=_m0228[0].x;
_TMP348.y=_m0228[1].x;
_TMP348.z=_m0228[2].x;
_TMP349.x=_m0228[0].y;
_TMP349.y=_m0228[1].y;
_TMP349.z=_m0228[2].y;
_TMP350.x=_m0228[0].z;
_TMP350.y=_m0228[1].z;
_TMP350.z=_m0228[2].z;
_TMP351.x=_m0228[0].w;
_TMP351.y=_m0228[1].w;
_TMP351.z=_m0228[2].w;
_TMP225=ATTR0.x*_TMP348+ATTR0.y*_TMP349+ATTR0.z*_TMP350+_TMP351;
_O1._pos9=_TMP225;
_O1._tex3=ATTR3.xy;
_TMP352.x=ProjMatrix[0].x;
_TMP352.y=ProjMatrix[1].x;
_TMP352.z=ProjMatrix[2].x;
_TMP352.w=ProjMatrix[3].x;
_TMP353.x=ProjMatrix[0].y;
_TMP353.y=ProjMatrix[1].y;
_TMP353.z=ProjMatrix[2].y;
_TMP353.w=ProjMatrix[3].y;
_TMP354.x=ProjMatrix[0].z;
_TMP354.y=ProjMatrix[1].z;
_TMP354.z=ProjMatrix[2].z;
_TMP354.w=ProjMatrix[3].z;
_TMP355.x=ProjMatrix[0].w;
_TMP355.y=ProjMatrix[1].w;
_TMP355.z=ProjMatrix[2].w;
_TMP355.w=ProjMatrix[3].w;
_TMP344=_TMP225.x*_TMP352+_TMP225.y*_TMP353+_TMP225.z*_TMP354+_TMP355;
_O_vtx=_TMP344;
GL_Tex1.xyz=_O1._nrm2;
gl_Position=_TMP344;
GL_Tex2.xy=ATTR3.xy;
GL_Tex0.xyz=_TMP225;
}
Pixel Shader:
#version 300 es
#extension GL_EXT_shader_texture_lod:enable
#extension GL_EXT_shadow_samplers:enable
#ifdef GL_ES
#define LP lowp
#define MP mediump
#define HP highp
precision HP float;
precision HP int;
precision HP sampler2D;
#if __VERSION__<300
#define gl_InstanceID 0
#endif
#else
#define LP
#define MP
#define HP
#endif
#if __VERSION__>=300
#define texture2D texture
#define varying in
#else
#endif
varying vec4 GL_Tex2;
struct MaterialClass{
vec4 _color;
vec4 _ambient_specular;
vec4 _sss_glow_rough_bump;
vec4 _texscale_detscale_detpower_reflect;
};
float _c0079;
uniform MaterialClass Material;
uniform sampler2D Col;
void main()
{
_c0079=texture2D(Col,GL_Tex2.xy).w+(false?float(Material._color.w)*5.00000000E-001-1.00000000E+000:float((Material._color.w-1.00000000E+000)));
if(_c0079<0.00000000E+000){
discard;
}
}
Expected result from Windows:
What I'm getting on Galaxy Note 4:
The tree model is composed of 2 materials (the trunk, which has no "discard", and the leaves, which use the "discard" shader). Both shaders are shadow shaders and don't output any color; their only purpose is to write to the depth buffer.
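For reference, the discard condition in the pixel shader reduces to a simple comparison: the `false ? ... : ...` always selects the second branch, so a fragment is discarded when the texture alpha plus the material alpha minus one is negative. Here is a minimal C++ sketch of the same arithmetic. The function name `alphaTestDiscards` is hypothetical and used only for illustration.

```cpp
// Mirrors the shader expression:
//   c = texture(Col, uv).w + (Material._color.w - 1.0)
//   if (c < 0.0) discard;
// alphaTestDiscards is a hypothetical helper name, not part of the engine.
bool alphaTestDiscards(float texAlpha, float materialAlpha)
{
    float c = texAlpha + (materialAlpha - 1.0f);
    return c < 0.0f;
}
```

With a material alpha of 0.5, for example, only texels whose alpha is below 0.5 are discarded, so the leaves would be expected to cut holes in the shadow rather than disappear entirely.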
Here is the link to the APK so you can test it yourself:
www.dropbox.com/.../Application 3D.7z
Thank you Daniele and Peter for your replies.
I'm glad to know that the issue with the shader has already been addressed in a newer version of the driver.
However, I'm surprised by the black lines that you're seeing, as I don't have that kind of problem on my Note 4.
My driver version is v1.r7p0-03rel0.e941a8
I've checked my app with Mali Graphics Debugger; however, all the errors it reports are related to:
-failure to create certain texture formats (such as BC7 or BGRA), in which case I simply fall back to an RGBA texture
-a problem when setting anisotropic filtering: as there's no GL_TEXTURE_MAX_ANISOTROPY defined in the GLES3 headers, I've made my own #define GL_TEXTURE_MAX_ANISOTROPY 0x84FE to match GLES2 and desktop GL. However, Mali Graphics Debugger doesn't recognize this enum. What happened to anisotropic filtering in GLES3? Did it disappear?
Anyway, those problems shouldn't cause the black lines.
As for the depth texture being used as both render target and shader input: even if it's bound to some shader input, I'm not reading from and writing to it at the same time.
As for the FBOs and glClear, I chose to call glInvalidateFramebuffer instead of glClear because glClear is not free on some platforms, and I'd like to have one code path for multiple platforms. If glInvalidateFramebuffer is called at the start of rendering to an FBO instead of glClear, is that not enough? I did some performance checks, and speed was similar when I used glInvalidateFramebuffer instead of glClear.
Sometimes I don't need to clear the memory, because I will overwrite it with some shader at the start, and since glClear is not free on some other platforms, I assumed it's better to just call glInvalidateFramebuffer.
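To make the invalidate-at-pass-start idea concrete, here is a rough C++ sketch of building the attachment list that would be handed to glInvalidateFramebuffer. The helper name `attachmentsToInvalidate` and the `k...` constants are hypothetical; the numeric values are copied from the GLES 3.0 headers so the sketch compiles without a GL context, and this is not the engine's actual code.

```cpp
#include <cstdint>
#include <vector>

// Enum values as defined in the OpenGL ES 3.0 headers, repeated here
// so the sketch compiles without a GL context.
constexpr std::uint32_t kColorAttachment0  = 0x8CE0; // GL_COLOR_ATTACHMENT0
constexpr std::uint32_t kDepthAttachment   = 0x8D00; // GL_DEPTH_ATTACHMENT
constexpr std::uint32_t kStencilAttachment = 0x8D20; // GL_STENCIL_ATTACHMENT

// Hypothetical helper: collects the attachments whose previous contents
// are not needed at the start of a render pass.
std::vector<std::uint32_t> attachmentsToInvalidate(bool color, bool depth, bool stencil)
{
    std::vector<std::uint32_t> list;
    if (color)   list.push_back(kColorAttachment0);
    if (depth)   list.push_back(kDepthAttachment);
    if (stencil) list.push_back(kStencilAttachment);
    return list;
}

// With a real context, the list would be passed on roughly like this:
//   auto a = attachmentsToInvalidate(true, true, false);
//   glInvalidateFramebuffer(GL_FRAMEBUFFER, (GLsizei)a.size(), a.data());
```

Note that these enums apply to user-created FBOs; for the default framebuffer the names are GL_COLOR, GL_DEPTH and GL_STENCIL instead.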
Hi Esenthel,
Yes, using glInvalidateFramebuffer at the beginning of each render pass will have the same effect as glClear, which is basically to avoid loading back framebuffer contents that you are not going to use anyway. glClear would have made the code a bit cleaner and avoided the multiple API calls needed to set up the framebuffer for each render pass. Performance-wise, using glClear or glInvalidateFramebuffer is the same.
For the depth texture read you are right; I hadn't realized you were not writing into it at the same time. The issue I see happens only on the transparent objects (the leaves in your example): in both the main scene and the shadow, they show the lines.
This happens on a Firefly board, which has a Mali-T760 MP4 similar to the one in the Note 4 but with newer drivers.
I believe the issue is caused by out-of-spec behavior in your API calls. Specifically, the 6th render pass (the last one rendering to an off-screen buffer) binds a texture to read from it in a shader but also uses it as COLOR_ATTACHMENT0 of the framebuffer you are currently writing to (all color masks set to true). This is out of spec and may be the cause of the issue.
Since you are doing deferred shading, I suggest having a look at the Pixel Local Storage extension on the devices which support it. That will allow you to implement your algorithm more efficiently. You can find various documents about how to use it.
Cheers,
DDD
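One way to catch this kind of read/write overlap on the application side is a small debug check before each draw. The following is only a sketch under assumptions: `RenderState` and `hasFeedbackLoop` are hypothetical names, not part of any GL API, and the check is conservative, since it flags a texture that is merely bound to a texture unit even if the current shader never samples it.

```cpp
#include <array>
#include <cstdint>

// Hypothetical minimal state tracker; texture id 0 means "nothing bound".
struct RenderState {
    std::array<std::uint32_t, 16> sampledTextures{}; // texture ids bound to texture units
    std::uint32_t colorAttachment0 = 0;               // texture id attached to the current FBO
};

// Returns true when the texture used as COLOR_ATTACHMENT0 is also bound for sampling.
bool hasFeedbackLoop(const RenderState& s)
{
    if (s.colorAttachment0 == 0) return false;
    for (std::uint32_t tex : s.sampledTextures)
        if (tex == s.colorAttachment0) return true;
    return false;
}
```

When the check fires, rendering into a separate temporary target and sampling from the original avoids the overlap.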
Thank you very much for this helpful information.
I was reading and writing to the same pixel, so I assumed it wouldn't be a problem, as it worked fine on DX9/desktop GL and many mobile devices.
However, I've disabled reading from and writing to the same color Render Target; I now output the result to another temporary Render Target.
Could you let me know if you still see the black lines over there? https://www.dropbox.com/s/k2im25s16ku7m5l/Application%203.7z?dl=0
Thank you,
Greg
Esenthel said: What happened to anisotropic filtering in GLES3? Did it disappear?
It's never been part of the OpenGL ES specification; it's only available via extensions on the platforms which support it.
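In practice that means checking the extension string for GL_EXT_texture_filter_anisotropic before using the enum (GL_TEXTURE_MAX_ANISOTROPY_EXT, 0x84FE, declared in the extension header rather than the core one). A minimal sketch of a whole-token check follows; the helper name `hasExtension` is hypothetical, and the GL calls are shown only in comments because they need a live context.

```cpp
#include <cstring>

// Returns true if `name` appears as a whole, space-delimited token in `extensions`.
// A plain strstr() is not enough, because one extension name can be a prefix of another.
bool hasExtension(const char* extensions, const char* name)
{
    if (!extensions || !name || !*name) return false;
    const std::size_t len = std::strlen(name);
    for (const char* p = std::strstr(extensions, name); p; p = std::strstr(p + 1, name)) {
        const bool startOk = (p == extensions) || (p[-1] == ' ');
        const bool endOk   = (p[len] == ' ') || (p[len] == '\0');
        if (startOk && endOk) return true;
    }
    return false;
}

// With a real context, usage would look roughly like this:
//   const char* ext = (const char*)glGetString(GL_EXTENSIONS);
//   if (hasExtension(ext, "GL_EXT_texture_filter_anisotropic"))
//       glTexParameterf(GL_TEXTURE_2D, GL_TEXTURE_MAX_ANISOTROPY_EXT, maxAniso);
```

If the extension is absent, the parameter should simply not be set, which avoids the error reported by the debugger.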