Hello OpenGL gurus,
I have written an OpenGL ES 3.1 app for mobile devices and I am battling with problems on ARM Mali GPUs. I can reproduce the problem easily using a Samsung Galaxy S7 phone equipped with a Mali T880 GPU. The program appears to run correctly on Adreno and PowerVR GPUs.
One frame is composed of several render passes. The render passes communicate with the help of a Shader Storage Buffer Object and atomic counters. The whole thing looks like this:
    Pass1_Initialize_SSBO_and_Atomic();
    glMemoryBarrier(GL_ALL_BARRIER_BITS);
    Pass2_Fill_SSBO_With_initial_Data();
    glMemoryBarrier(GL_ALL_BARRIER_BITS);
    for(i=0;i<N;i++)
      {
      Pass3_Render_Object(i);
      glMemoryBarrier(GL_ALL_BARRIER_BITS);
      }
    Pass4_Compose_Everything();
Now, the problem is that on Mali the screen keeps flashing. I have made many recordings and watched them frame-by-frame. What happens is that about 95% of frames look correct, but every so often an arbitrary subset of Objects disappears, and reappears in the next frame. Sometimes (very rarely) I can also see in Android's debug facility (logcat) the following:
    E/OpenGLRenderer: Error:glFinish::execution failed
    E/OpenGLRenderer: GL error: Out of memory!
I've seen that a few times before, and so far this meant that some shader runs couldn't finish (due to an infinite loop), but I have no proof that this is the same issue.
The problem is, I have no idea what can be causing the disappearing Objects. What I've tried so far is to keep removing code to see if the bug is still there, in an attempt to come up with the shortest piece of code that reproduces the problem. This approach fails because the more code I remove, the harder the bug gets to reproduce. Initially it happens about twice per second; after several passes of removing various bits I can only reproduce it once per minute, and ultimately I cannot reproduce it anymore. At that point I have no idea if this is because I just removed the offending code, or because I passed some threshold and the bug is still there but is now very hard to reproduce.
The second thing I tried is to measure the bug by taking a look at the SSBO. I memory-map it to the CPU at certain moments between passes and make sure it really does contain what it should. Unfortunately, as soon as I add a
    glBindBuffer(GL_SHADER_STORAGE_BUFFER, mSSBO[0]);
    glMapBufferRange(GL_SHADER_STORAGE_BUFFER, 0, length, GL_MAP_READ_BIT);
    (...) // print the buffer
    glUnmapBuffer(GL_SHADER_STORAGE_BUFFER);
    glBindBuffer(GL_SHADER_STORAGE_BUFFER, 0);
pretty much anywhere in the application code, the bug simply disappears. In particular, when I remove the first glMemoryBarrier() and replace it with the above, the bug disappears completely (I recorded 10 minutes' worth of screen and watched it frame-by-frame, it's gone). This happens even if I don't inspect the buffer on the CPU at all and just map it and unmap it right away (which AFAIK should have the same effect as a memoryBarrier(GL_SHADER_STORAGE_BARRIER_BIT)??).
This has raised a suspicion that maybe glMemoryBarrier() on Mali is buggy, so I wrote a test program to check, and it showed that glMemoryBarrier() works just fine. I also have the Mali Graphics Debugger and I can connect it to the phone, and when I do the bug still shows. However, I have no idea what to look for in the Debugger's interface.
Would you have any advice on how to approach such an issue?
The code is GPL v2 and it is fully available to download; it is 45k lines of Java, XML and GLSL though. If anybody wants to take a look, here it is: http://distorted.org/redmine/project...e_example_code
I have also asked this question in the generic OpenGL forum here:
https://www.opengl.org/discussion_boards/showthread.php/200754-Flashes-on-ARM-Mali?p=1291723
and one seasoned member over there says it might be a 'full pipeline flush' that my program somehow triggers.
Any advice on how I can use the Mali Graphics Debugger (or any other tool)? I have it connected, the phone is rooted, I just don't know what to look for in MGD's interface...
EDIT: Sorry, I was wrong, it is actually even worse:
1) I run the 'bug reproducing' app, flashes happen regularly about once-twice per second.
2) I connect MGD, start tracing, flashes completely stop.
3) I press the 'disable tracing temporarily' button in MGD - flashes immediately start again.
4) I press the button again to restart tracing - flashes completely disappear...
Hi Utumno,
Apologies for taking so long to reply to this. I have been following your threads on both the OpenGL forum and Stack Overflow, and I'm glad to hear from the latest OpenGL forum post that you've managed to fix this, even if it isn't an ideal solution.
I have looked at your linked code on Distorted.org and noticed your commit for a Mali-specific test, but also that you removed it in a following commit. It would be great if you had a minimal reproducer that would allow me to investigate this issue further. The test that you used to check the behaviour across the different devices mentioned in your penultimate post on the OpenGL forum sounds perfect, and either the APK or the source code for this would be fantastic.
I hope to hear back from you soon,
Ryan
Hello Ryan!
Actually I spoke too soon in the OpenGL forum - the main 'bug reproducing' app (the one with 3 coloured cubes) is now 'fixed' (i.e. the issue is no longer reproducible in this app) but a few other example apps still exhibit it, albeit to a lesser degree. So I cannot claim to have a fix.
Do you need source code or just a precompiled APK? I could do both, APK would be easy, source - I am afraid that the minimal reproducer still stands at about 3-4k lines of code....
Hi Utumno!
In that case it's definitely worth investigating further.
Both the source and an APK would be best!
Many thanks,
Ok, I'll give it to you, hopefully sometime today-tomorrow
Hello Ryan,
I have prepared a (first version of) app which reproduces the flashing issues.
The app shows 3 textured cubes. When run on a Mali T880 phone, it will sometimes flash (i.e. one or more cubes will disappear for 1 frame). I am testing this on a Samsung Galaxy S7 running Android 8.1.0-based LineageOS 15.1, but I strongly suspect the same happens with the original Android 7.0.0-based Samsung OS. The flashes happen irregularly, sometimes many seconds can pass between two; if you keep watching for 30 seconds, you should definitely see some.
The app behaves correctly on an HTC Desire 12 (PowerVR GE8100) or LG Nexus 5X (Adreno 418).
APK:
1. On your phone, go to Settings->Security and allow installation from unknown sources
2. Fire up your browser, go to distorted.org/.../mali-debug.apk [Removed by website]
3. Install
SOURCE:
1. Fire up Android Studio
2. In the 'Welcome to Android Studio' popup, choose 'Check out existing project from Repository'
3. Choose 'git' , in the Repository URL - distorted.org/git/mali-debug.git
4. Clone, open, compile, install. It's configured for SDK API 27 but I guess anything >=24 should do.
The source is, at the moment, quite large and complicated - 3 files in the 'malidebug' directory (the app itself) and 30 files in the 'library' directory - which are just parts of the original 'distorted-library' graphics library I have copied over.
Total: 7000 lines. It should be possible to cut that down quite a bit - I'll try that next. But like I said in the OpenGL forum, curiously, the more code I remove the less reproducible the issue becomes (the original 'Triblur' app from the 'distorted-examples' repo keeps flashing about once-twice per second).
Before I start simplifying it, maybe we'll first see if you can reproduce the issue though?
I have been able to recreate the flashing on an S7 running the r12 driver, but I cannot recreate it on the r22 driver. The flickering you have been seeing was therefore a driver bug that has since been fixed. The newer driver version is available in the Android 8.0 update for the S7 and I would suggest updating to this version.
Additionally, if you cannot update, adding a glFinish before you bind the framebuffer in the setAsOutput method in the DistortedOutputSurface.java file also appears to stop the flickering, although this is not the best solution due to the impact on performance.
I hope this helps,
Great, thanks a lot Ryan!
One more question. If OpenGL vendor is ARM and driver version is <22, I'd like to insert a workaround. I know how to detect that the vendor is ARM ( glGetString(GL_VENDOR) ) but how do I detect the version of the driver?
On my Samsung Galaxy S7, glGetString(GL_VERSION) returns
OpenGL ES 3.2 v1.r12p1-03dev0.228ab63cced004f840e7dd47b762a1d0
and I guess I could parse out the 'r12' from that, but this feels a bit shaky. Would GL_VERSION always be structured identically? Is there a better way?
Parsing GL_VERSION is the only way to get the current driver version. The driver version should always be structured as rXXpX, although the rest of the string is not guaranteed.
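As a rough illustration of the parsing described above, here is a minimal Java sketch that pulls the rXXpX token out of a GL_VERSION string and decides whether to enable a workaround. The class and method names (MaliDriverVersion, parseRelease, needsFlashWorkaround) are hypothetical and not part of any library. The threshold of 22 is only the value discussed in this thread: the bug was seen on r12 and not on r22, and the exact release that fixed it is not known. Matching the vendor with contains("ARM") and treating an unparseable version as "no workaround" are also assumptions.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of any library.
final class MaliDriverVersion {
    // Matches the "rXXpX" token, e.g. "r12p1". Only this token is relied upon;
    // the rest of the GL_VERSION string is not assumed to have a fixed layout.
    private static final Pattern RELEASE = Pattern.compile("\\br(\\d{1,3})p(\\d{1,3})");

    private MaliDriverVersion() {}

    /** Returns the release number (the XX in rXXpX), or -1 if no such token is found. */
    static int parseRelease(String glVersion) {
        if (glVersion == null) return -1;
        Matcher m = RELEASE.matcher(glVersion);
        return m.find() ? Integer.parseInt(m.group(1)) : -1;
    }

    /**
     * True if the vendor looks like ARM and the driver release is below 22.
     * Assumption: an unknown release (-1) does not enable the workaround.
     */
    static boolean needsFlashWorkaround(String glVendor, String glVersion) {
        if (glVendor == null || !glVendor.contains("ARM")) return false;
        int release = parseRelease(glVersion);
        return release >= 0 && release < 22;
    }

    public static void main(String[] args) {
        // The GL_VERSION string reported by the Galaxy S7 earlier in this thread.
        String version = "OpenGL ES 3.2 v1.r12p1-03dev0.228ab63cced004f840e7dd47b762a1d0";
        System.out.println(parseRelease(version));                // prints 12
        System.out.println(needsFlashWorkaround("ARM", version)); // prints true
    }
}
```

In the app, the two strings would come from glGetString(GL_VENDOR) and glGetString(GL_VERSION), queried once after the GL context is created. If you would rather err on the side of correctness than performance, you could instead enable the workaround whenever the vendor is ARM and the release cannot be parsed.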