ARM Mali 400 performance analysis using the DS-5 Streamline

Dear ARM forum,

I am using DS-5 Streamline to analyze my application's performance on an ARM Mali-400.

I am seeing GPU vertex processor activity for 3 milliseconds, followed by an inactive period of 13 milliseconds, followed by 34 milliseconds of GPU pixel processor activity.

Questions:

1. I am trying to understand why there is such a long inactive period. How can I analyze this period for its performance impact?

2. Streamline provides many performance-measuring events, but there is very little documentation on what each event captures and how to use it for GPU performance analysis.

3. I want to measure GPU vertex processor performance: how many triangles it processes in one frame, and how much time that takes.

   I also want to measure GPU pixel processor performance: how many pixels it processes in one frame, and how much time that takes.

4. Is there a document that discusses how to use all of these events for performance analysis?

Thanks,

Ravinder Are

Parents
  • Hi Michael and Peter,

    Did you get a chance to look into the Streamline log I shared?

    In each frame I am seeing some VP activity, followed by some idle time, followed by some PP activity.

    My questions are:

    1. Why is the VP and PP activity not parallel? Why does one run after the other?

        Am I doing something wrong that makes parallelism impossible?

    2. Why is there idle time in my application? Why can't the PP start immediately?

        I am not using vsync, and I am using double buffering.

    3. I am running a Qt-based OpenGL ES 2.0 application with a simple vertex shader and a simple fragment shader.

    I need your support in analyzing this.

    Thanks,

    Ravinder Are

Children
  • As I said before, Streamline isn't going to help answer questions about idle time. It's a performance profiler - you can't profile "nothing running" - all of the counters are zero.

    The causes of (1) and (2) are probably the same thing. Serialization means idle time and things not overlapping.
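To put rough numbers on what serialization costs, here is a minimal arithmetic sketch using the timings quoted in the original post (3 ms VP, 13 ms idle, 34 ms PP). It assumes that, with full overlap, the steady-state frame time is bounded only by the slower of the two processors; real hardware and drivers add other limits, so treat the result as a best-case estimate rather than a prediction. The function names are illustrative only.

```python
# Timings taken from the original post, in milliseconds.
VP_MS = 3.0     # vertex processor active
IDLE_MS = 13.0  # gap with nothing running
PP_MS = 34.0    # pixel processor active


def serialized_frame_ms(vp, idle, pp):
    """Frame time when VP, the idle gap, and PP run strictly one after another."""
    return vp + idle + pp


def overlapped_frame_ms(vp, pp):
    """Best-case steady-state frame time if VP work for frame N+1 overlaps
    PP work for frame N (assumption: nothing else is the bottleneck)."""
    return max(vp, pp)


def fps(frame_ms):
    """Convert a frame time in milliseconds to frames per second."""
    return 1000.0 / frame_ms


serial_ms = serialized_frame_ms(VP_MS, IDLE_MS, PP_MS)
overlap_ms = overlapped_frame_ms(VP_MS, PP_MS)

print(f"serialized: {serial_ms:.0f} ms/frame, {fps(serial_ms):.1f} fps")
print(f"overlapped: {overlap_ms:.0f} ms/frame, {fps(overlap_ms):.1f} fps")
print(f"idle share of the serialized frame: {IDLE_MS / serial_ms:.0%}")
```

With these figures the idle gap accounts for about a quarter of the observed frame, which is why finding the cause of the serialization matters more than tuning either shader.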

    Usual suspects:

    • Window system fences for framebuffers not being released by the previous user of the buffer (normally the display controller or compositor).
    • Vsync and fewer than 3 framebuffers
    • Application CPU load is too high (e.g. CPU limited) - doesn't look like it in your case
    • Application calling sleep()
    • Application using glFinish, glReadPixels, or waiting on an OpenGL ES level synchronization primitive and draining the rendering pipeline.
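The effect of draining the pipeline every frame can be illustrated with a toy timeline model. This is only a sketch under stated assumptions: the CPU cost per frame (10 ms) is a made-up value, the GPU cost (37 ms) is simply the 3 ms + 34 ms from the original post, and the function `total_time_ms` is hypothetical rather than part of any Mali or Streamline API. It also lets the CPU queue frames without limit, which a real driver would not allow.

```python
def total_time_ms(frames, cpu_ms, gpu_ms, drain_each_frame):
    """Return the time at which the GPU finishes the last frame.

    drain_each_frame=True models a glFinish-style wait: the CPU cannot start
    building the next frame until the GPU has finished the current one.
    """
    cpu_free = 0.0  # time at which the CPU can start building the next frame
    gpu_free = 0.0  # time at which the GPU finishes its queued work
    for _ in range(frames):
        submit = cpu_free + cpu_ms      # CPU finishes building and submits
        start = max(submit, gpu_free)   # GPU needs both work and a free unit
        gpu_free = start + gpu_ms
        # A drain blocks the CPU until the GPU is done; otherwise the CPU
        # moves straight on to the next frame (unbounded queue, an assumption).
        cpu_free = gpu_free if drain_each_frame else submit
    return gpu_free


# Illustrative values: 10 ms of CPU work (assumed), 37 ms of GPU work per frame.
drained = total_time_ms(10, 10.0, 37.0, drain_each_frame=True)
pipelined = total_time_ms(10, 10.0, 37.0, drain_each_frame=False)

print(f"with a drain every frame: {drained:.0f} ms for 10 frames")
print(f"without draining:         {pipelined:.0f} ms for 10 frames")
```

In the drained case the GPU sits idle while the CPU builds each frame, which shows up in a timeline as the kind of gap described in the question.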

    Less likely suspects:

    • Kernel not processing interrupts quickly enough (e.g. another driver is disabling IRQs, and not re-enabling them for a long time).

    HTH,
    Pete

  • Thanks Peter, it's useful information.