Shadows with alpha test (discard) not working on Galaxy Note 4 SM-N910U, ARM Mali-T760 MP6

Is this the right place to report problems with the GPU driver?

I'm an engine developer. I just added GLES3 support to my engine, and I've noticed that on an Android Galaxy Note 4 SM-N910U (ARM Mali-T760 MP6) the alpha-tested shadows aren't working correctly.
I suspect an OpenGL driver fault, as the same code works fine on iOS, Windows, Mac, Linux, etc.

Shadows that don't use alpha test (discard) work correctly, but those that use discard don't display at all.

Shader for alpha-tested materials is below:

Vertex Shader:
#version 300 es
#ifdef GL_ES
#define LP lowp
#define MP mediump
#define HP highp
precision HP float;
precision HP int;
#else
#define LP
#define MP
#define HP
#endif
#if __VERSION__>=300
#define attribute in
#define varying out
#endif
varying vec4 GL_Tex0;
varying vec4 GL_Tex1;
varying vec4 GL_Tex2;
struct VS_PS{
vec3 _pos9;
vec3 _nrm2;
vec2 _tex3;
};
VS_PS _O1;
vec3 _TMP225;
vec4 _m0228[3];
vec4 _TMP344;
attribute vec4 ATTR0;
attribute vec4 ATTR3;
vec3 _TMP348;
vec3 _TMP349;
vec3 _TMP350;
vec3 _TMP351;
vec4 _TMP352;
vec4 _TMP353;
vec4 _TMP354;
vec4 _TMP355;
uniform vec4 ProjMatrix[4];
uniform vec4 ViewMatrix[180];
void main()
{
vec4 _O_vtx;
_m0228[0]=ViewMatrix[(3*gl_InstanceID+0)];
_m0228[1]=ViewMatrix[(3*gl_InstanceID+1)];
_m0228[2]=ViewMatrix[(3*gl_InstanceID+2)];
_TMP348.x=_m0228[0].x;
_TMP348.y=_m0228[1].x;
_TMP348.z=_m0228[2].x;
_TMP349.x=_m0228[0].y;
_TMP349.y=_m0228[1].y;
_TMP349.z=_m0228[2].y;
_TMP350.x=_m0228[0].z;
_TMP350.y=_m0228[1].z;
_TMP350.z=_m0228[2].z;
_TMP351.x=_m0228[0].w;
_TMP351.y=_m0228[1].w;
_TMP351.z=_m0228[2].w;
_TMP225=ATTR0.x*_TMP348+ATTR0.y*_TMP349+ATTR0.z*_TMP350+_TMP351;
_O1._pos9=_TMP225;
_O1._tex3=ATTR3.xy;
_TMP352.x=ProjMatrix[0].x;
_TMP352.y=ProjMatrix[1].x;
_TMP352.z=ProjMatrix[2].x;
_TMP352.w=ProjMatrix[3].x;
_TMP353.x=ProjMatrix[0].y;
_TMP353.y=ProjMatrix[1].y;
_TMP353.z=ProjMatrix[2].y;
_TMP353.w=ProjMatrix[3].y;
_TMP354.x=ProjMatrix[0].z;
_TMP354.y=ProjMatrix[1].z;
_TMP354.z=ProjMatrix[2].z;
_TMP354.w=ProjMatrix[3].z;
_TMP355.x=ProjMatrix[0].w;
_TMP355.y=ProjMatrix[1].w;
_TMP355.z=ProjMatrix[2].w;
_TMP355.w=ProjMatrix[3].w;
_TMP344=_TMP225.x*_TMP352+_TMP225.y*_TMP353+_TMP225.z*_TMP354+_TMP355;
_O_vtx=_TMP344;
GL_Tex1.xyz=_O1._nrm2;
gl_Position=_TMP344;
GL_Tex2.xy=ATTR3.xy;
GL_Tex0.xyz=_TMP225;
}


Pixel Shader:
#version 300 es
#extension GL_EXT_shader_texture_lod:enable
#extension GL_EXT_shadow_samplers:enable
#ifdef GL_ES
#define LP lowp
#define MP mediump
#define HP highp
precision HP float;
precision HP int;
precision HP sampler2D;
#if __VERSION__<300
#define gl_InstanceID 0
#endif
#else
#define LP
#define MP
#define HP
#endif
#if __VERSION__>=300
#define texture2D texture
#define varying in
#else
#endif
varying vec4 GL_Tex2;
struct MaterialClass{
vec4 _color;
vec4 _ambient_specular;
vec4 _sss_glow_rough_bump;
vec4 _texscale_detscale_detpower_reflect;
};
float _c0079;
uniform MaterialClass Material;
uniform sampler2D Col;
void main()
{
_c0079=texture2D(Col,GL_Tex2.xy).w+(false?float(Material._color.w)*5.00000000E-001-1.00000000E+000:float((Material._color.w-1.00000000E+000)));
if(_c0079<0.00000000E+000){
discard;
}
}
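
For reference, the condition in the ternary above is the constant false, so the second branch is always taken and the test reduces to: discard when texture alpha + (Material._color.w - 1) < 0. Below is a minimal C sketch of the same arithmetic, which may help when checking on the CPU which texels the shadow pass is expected to drop. The function name and parameters are hypothetical and not part of the engine.

```c
#include <stdbool.h>

/* Hypothetical CPU-side mirror of the pixel shader's alpha test.
   tex_alpha      : alpha sampled from the Col texture (the .w component)
   material_alpha : Material._color.w */
bool shadow_texel_discarded(float tex_alpha, float material_alpha)
{
    /* Same expression as _c0079 in the shader, with the constant-false
       ternary folded to its second branch. */
    float c = tex_alpha + (material_alpha - 1.0f);
    return c < 0.0f;
}
```

With a material alpha of 0.5, a texel whose alpha is 0.25 is discarded and a texel whose alpha is 0.5 or higher is kept. With a material alpha of 1.0, nothing is discarded, because the sum is never negative for non-negative texture alpha.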

Expected result from Windows:

What I'm getting on Galaxy Note 4:

The tree model is composed of 2 materials: the trunk, which has no "discard", and the leaves, which use the "discard" shader. Both are shadow shaders that don't output any color; their only purpose is to write to the depth buffer.

Here is the link to the APK, which you can test yourself:

www.dropbox.com/.../Application 3D.7z

  • Hi Esenthel,

    Yes, using glInvalidateFramebuffer at the beginning of each render pass has the same effect as glClear: it avoids loading back framebuffer contents that you are not going to use anyway. glClear would have made the code a bit cleaner and avoided the multiple API calls needed to set up the framebuffer for each render pass. Performance-wise, glClear and glInvalidateFramebuffer are the same.

    For the depth texture read you are right, I hadn't realized you were not writing into it at the same time. The issue I see happens only on the transparent objects (the leaves in your example): both the main scene and the shadow show the lines.

    This happens on a Firefly board, which has a Mali-T760 MP4 similar to the one in the Note 4 but with newer drivers.

    I believe the issue is caused by out-of-spec behavior in your API calls. Specifically, the 6th render pass (the last one rendering to an off-screen buffer) binds a texture to read from it in a shader but also uses it as COLOR_ATTACHMENT0 of the framebuffer you are currently writing to (all color masks set to true). This is out of spec and may be the cause of the issue.

    Since you are doing deferred shading, I suggest having a look at the Pixel Local Storage extension on devices that support it. It will let you implement your algorithm more efficiently. Various documents on how to use it are available.

    Cheers,
    DDD
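
The feedback loop described in the reply above (a texture sampled by a shader while it is also attached as COLOR_ATTACHMENT0 of the framebuffer being written) could be caught by a debug check on the engine side before each draw call. The following is only a sketch under assumptions: the DrawState struct, its fields, and the array limits are hypothetical stand-ins for whatever state tracking the engine already does, and no GL calls are made. The check is deliberately conservative: it ignores color write masks and mip levels and flags any overlap.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_TEXTURE_UNITS     16 /* hypothetical limit for this sketch */
#define MAX_COLOR_ATTACHMENTS 4  /* hypothetical limit for this sketch */

/* Hypothetical engine-side copy of the GL state relevant to one draw call.
   Texture name 0 means "nothing bound / nothing attached". */
typedef struct {
    unsigned sampled_textures[MAX_TEXTURE_UNITS];      /* bound for sampling */
    unsigned color_attachments[MAX_COLOR_ATTACHMENTS]; /* attached to draw FBO */
} DrawState;

/* Returns true if any texture bound for sampling is also a color
   attachment of the framebuffer currently being drawn to. */
bool has_feedback_loop(const DrawState *s)
{
    assert(s != NULL);
    for (size_t a = 0; a < MAX_COLOR_ATTACHMENTS; ++a) {
        unsigned target = s->color_attachments[a];
        if (target == 0)
            continue; /* empty attachment slot */
        for (size_t u = 0; u < MAX_TEXTURE_UNITS; ++u) {
            if (s->sampled_textures[u] == target)
                return true;
        }
    }
    return false;
}
```

Calling such a check in debug builds right before issuing a draw would flag a pass like the 6th one mentioned above, where the same texture name appears on both sides.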
