GL_EXT_disjoint_timer_query for performance

I have a question about the OpenGL ES GL_EXT_disjoint_timer_query extension. I am trying to get performance measurements out of my Android app, and GL_EXT_disjoint_timer_query does not seem to give me proper numbers. I tried both GL_TIME_ELAPSED_EXT and GL_TIMESTAMP_EXT queries, but both give similar results: not 0, not a constant, but tiny fluctuating numbers. I tried Mali Debugger but could not find any GPU profiling info. I tried to install Streamline DS-5, but it requires a rooted device, compiling gator, and an Eclipse setup, which is not possible for me at the moment.

What are the units of time returned by GL_EXT_disjoint_timer_query? The extension documentation suggests nanoseconds. To my understanding, glBeginQuery() emits at the top of the pipeline and glEndQuery() at the end of the pipeline. So how could GL_TIMESTAMP_EXT even work, since OpenGL ES does not indicate at which part of the pipeline it emits the result? Is it possible to have access to some counter?

Any help would be greatly appreciated as this is becoming critical.

The device is a Samsung Galaxy Tab S2 (Mali T760).

  • What are the units of time returned by GL_EXT_disjoint_timer_query? Extension documentation suggests nanoseconds.

    Yes, it's nanoseconds.

    To my understanding, glBeginQuery() emits at the top of the pipeline and glEndQuery() at the end of the pipeline. So how could GL_TIMESTAMP_EXT even work, since OpenGL ES does not indicate at which part of the pipeline it emits the result?

    It's far worse than that: OpenGL doesn't actually guarantee that the hardware looks like the paper pipeline in the specification. Tile-based GPUs like Mali don't implement the pipeline as a single pipeline. They implement two separate, decoupled pipelines: one for vertex shading and one for fragment shading. See:

    The Mali GPU: An Abstract Machine, Part 1 - Frame Pipelining

    In general, this means that you can't use timer queries for timing single draw calls; they don't exist in isolation in any usable form. From a query point of view, all draw calls in a render pass complete when the last tile in fragment shading completes. Timer queries can be used with some success for timing single render passes, but be aware that the pipelining of render passes means there will be non-trivial error bars.

    Is it possible to have access to some counter?

    DS-5 Streamline is the only public tool we have for accessing hardware counters. Just to be clear, these are low-level event counters in the hardware, such as cache hits and misses, number of texture operations, etc., rather than timing information.

    Cheers,
    Pete
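
The reply above suggests timing whole render passes rather than single draw calls, and confirms that results are in nanoseconds. A minimal sketch of that idea in C follows. It is not taken from the thread: `PassTiming`, `interpret_timer_query`, the `*_pass_timer` helpers, and the `USE_GLES_TIMER_QUERY` build flag are hypothetical names chosen for illustration. It also assumes the driver exports the EXT entry points directly; on some platforms they may need to be fetched with `eglGetProcAddress` instead.

```c
#include <stdint.h>

#ifdef USE_GLES_TIMER_QUERY      /* hypothetical build flag, not part of any SDK */
#define GL_GLEXT_PROTOTYPES 1    /* assumption: EXT entry points are exported */
#include <GLES2/gl2.h>
#include <GLES2/gl2ext.h>
#endif

/* One timing sample for a whole render pass (hypothetical type). */
typedef struct {
    int    valid;   /* 0 if the sample is not ready or must be discarded */
    double millis;  /* elapsed GPU time in milliseconds when valid */
} PassTiming;

/* Turn raw query values into a sample. The extension reports nanoseconds,
 * so divide by 1e6 for milliseconds. A sample is rejected if the result is
 * not yet available or if the GPU reported a disjoint event. */
static PassTiming interpret_timer_query(int available, int disjoint,
                                        uint64_t elapsed_ns)
{
    PassTiming t = {0, 0.0};
    if (!available || disjoint) {
        return t;
    }
    t.valid = 1;
    t.millis = (double)elapsed_ns / 1.0e6;
    return t;
}

#ifdef USE_GLES_TIMER_QUERY
/* Call before the first draw call of the render pass. */
static GLuint begin_pass_timer(void)
{
    GLuint query = 0;
    glGenQueriesEXT(1, &query);
    glBeginQueryEXT(GL_TIME_ELAPSED_EXT, query);
    return query;
}

/* Call after the last draw call of the render pass. */
static void end_pass_timer(void)
{
    glEndQueryEXT(GL_TIME_ELAPSED_EXT);
}

/* Poll a frame or more later; never blocks waiting for the GPU. */
static PassTiming read_pass_timer(GLuint query)
{
    GLuint available = 0;
    glGetQueryObjectuivEXT(query, GL_QUERY_RESULT_AVAILABLE_EXT, &available);
    if (!available) {
        return interpret_timer_query(0, 0, 0);
    }
    GLint disjoint = 0;
    glGetIntegerv(GL_GPU_DISJOINT_EXT, &disjoint); /* reading clears the flag */
    GLuint64 ns = 0;
    glGetQueryObjectui64vEXT(query, GL_QUERY_RESULT_EXT, &ns);
    return interpret_timer_query(1, disjoint != 0, (uint64_t)ns);
}
#endif
```

Wrapping one render pass between `begin_pass_timer()` and `end_pass_timer()` and polling `read_pass_timer()` on a later frame keeps the CPU from stalling on the GPU. Because render passes are pipelined, as the reply notes, it is probably safer to average many samples than to trust any single value.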
