I have a question about the OpenGL ES GL_EXT_disjoint_timer_query extension. I am trying to get performance measurements out of my Android app, and GL_EXT_disjoint_timer_query does not seem to give me sensible numbers. I tried both GL_TIME_ELAPSED_EXT and GL_TIMESTAMP_EXT queries, and both give similar results: not 0 and not a constant, but tiny fluctuating numbers. I tried the Mali Debugger but could not find any GPU profiling info. I tried to install DS-5 Streamline, but it requires a rooted device, compiling gator, and an Eclipse setup, which is not possible for me at the moment.
What are the units of time returned by GL_EXT_disjoint_timer_query? The extension documentation suggests nanoseconds. To my understanding, glBeginQuery() takes effect at the top of the pipeline and glEndQuery() at the end of the pipeline. So how could GL_TIMESTAMP_EXT even work, since OpenGL ES does not indicate at which part of the pipeline the timestamp is taken? Is it possible to get access to some counter?
Any help would be greatly appreciated as this is becoming critical.
The device is a Samsung Galaxy Tab S2 (Mali-T760).
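For reference, here is a minimal sketch of how a raw query result could be interpreted, assuming the value really is in nanoseconds as the extension documentation suggests. The helper name `elapsed_ns_to_ms` and its `gpu_disjoint` parameter are hypothetical. The caller is assumed to have read the result from the query object and the GL_GPU_DISJOINT_EXT state separately, since the extension documents results taken across a disjoint event as unreliable.

```c
#include <stdint.h>

/* Hypothetical helper, not part of any GL API.
 * elapsed_ns:   raw GL_TIME_ELAPSED_EXT result, assumed to be nanoseconds.
 * gpu_disjoint: non-zero if GL_GPU_DISJOINT_EXT was reported as set.
 * Returns the time in milliseconds, or -1.0 if the sample should be dropped. */
double elapsed_ns_to_ms(uint64_t elapsed_ns, int gpu_disjoint)
{
    if (gpu_disjoint) {
        /* Timing was interrupted; the result is not trustworthy. */
        return -1.0;
    }
    return (double)elapsed_ns / 1.0e6; /* 1 ms = 1,000,000 ns */
}
```

With this, a raw result of 2500000 would be reported as 2.5 ms, and any sample taken while the disjoint flag was set would be discarded rather than logged.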
Thanks for the response and sorry for the long delay.
Can you point me to the proper setup for DS-5 Streamline? My device is an off-the-shelf one. I cannot root it, compile kernel modules, or change kernel images. Is it possible to launch the .apk through DS-5 Streamline without having to port my build process to Eclipse (not a trivial task)?
Out of curiosity, how many frames of latency is there? Do you treat the geometry for a full frame? I assume not since this problem is unbounded in memory. So how big is the intermediate buffer?
Can you point me to the proper setup for DS-5 Streamline?
DS-5 Community Edition – DS-5 Development Studio – ARM Developer
I cannot root it nor compile kernel modules or change kernel images.
Unfortunately the current versions of DS-5 Streamline require a kernel module to capture the data.
Out of curiosity, how many frames of latency is there?
It really depends on the application and operating system. If the application is hitting vsync, then the pipeline length from application to screen is normally 3 frames (triple buffering). The pipeline length from geometry processing to fragment processing is 0-1 frames, depending on how close to 100% load the GPU has to run to hit vsync at the current operating frequency.
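To put the frame counts above into wall-clock terms, here is a rough sketch. The function name `pipeline_latency_ms` is hypothetical, and the 60 Hz refresh rate in the usage note is an assumption for illustration, not a property of any particular device.

```c
/* Hypothetical helper: converts a pipeline depth in frames to milliseconds,
 * assuming the application hits vsync every frame.
 * frames:     pipeline length in frames (e.g. 3 for triple buffering).
 * refresh_hz: display refresh rate in Hz (assumed value, e.g. 60.0). */
double pipeline_latency_ms(int frames, double refresh_hz)
{
    return ((double)frames * 1000.0) / refresh_hz;
}
```

Under the assumed 60 Hz refresh rate, a 3-frame application-to-screen pipeline works out to 50 ms, and a 1-frame gap between geometry and fragment processing to roughly 16.7 ms.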
Do you treat the geometry for a full frame?
No, it's all per render pass (e.g. per application FBO or per default FBO).
So how big is the intermediate buffer?
No fixed size - Mali just uses system memory for all GPU resources.
HTH, Pete