I can't interpret GPU profiling results in DS-5 Streamline.

Note: This was originally posted on 8th December 2012 at http://forums.arm.com

Hi!
I now have a profiling environment for the Mali-400MP GPU with DS-5 Streamline,
but I don't have any documentation for the GPU counters.
I found these documents:

- using_arm_streamline.pdf
- mali_optimization_guid.pdf
- mali_gpu_developer_tools_overview.pdf... etc..

I couldn't find any specific explanation of the Mali GPU counters, except for the "GPU activity" counter (and I am still confused about that one).
So I am asking whether you can point me to a more detailed document about Mali GPU profiling.
I have some profiling results now, but I can't do anything with them.
Please help me make some progress.

Thank you
Daisy.
  • Note: This was originally posted on 10th December 2012 at http://forums.arm.com

    Hi Daisy,

    The "Mali GPU Application Optimization Guide" does contain a section on hardware counters (3.4.3. GPU Counters), which should provide some insight. Can you let me know which counters you're having trouble with, and I'll give an explanation? I'll also check for any more detailed documentation.

    Thanks,
    Chris
  • Note: This was originally posted on 8th January 2013 at http://forums.arm.com

    Hi Daisy:
    Thanks for your kind answer. It helps me a lot.
    It sounds like you did not make any modifications to the Mali driver.
    So what I need to do is download the latest Linaro Mali driver, use it as a reference, and port it to my platform.
    Thanks again for your answer.

    BR.
    Ching.
  • Note: This was originally posted on 17th January 2013 at http://forums.arm.com

    Hi Ching,

    We release the source code to the kernel-space components of the driver stack under the GPL license on malideveloper.com. The source code to the userspace libraries that expose APIs such as GLES and EGL is only provided to silicon licensees, so those libraries are only available as prebuilt binaries from your vendor (HardKernel / Samsung in this case). To collect Streamline profiling data, the userspace libraries must be built to support it, so you will still need to ask your vendor for copies that support Streamline profiling. I suggest approaching HardKernel via their forums, as they should be able to provide these to you.

    Thanks,
    Chris
  • Note: This was originally posted on 11th December 2012 at http://forums.arm.com

    Hi Chris,

    First of all, thank you for your quick response.

    The document I am referring to is "using_arm_streamline.pdf":
    http://infocenter.arm.com/help/topic/com.arm.doc.dui0482j/DUI0482J_using_arm_streamline.pdf

    The part I can't resolve is "5.2.3 Mali-specific events".
    I also checked the "Mali GPU Application Optimization Guide".

    For example, for the Geometry processor:
    "Active cycles"
    "Active cycles, vertex shader"
    "Active cycles, PLBU geometry processing" etc.

    and for the Fragment processor:
    "Active clock cycles"
    "Stall cycles PolygonListReader"
    "Pipeline bubbles cycle count" etc.

    I couldn't find these counters in the "Mali GPU Application Optimization Guide" or anywhere else.
    Where can I find detailed information about them?

    Please let me know.

    Thank you.
    Daisy.
  • Note: This was originally posted on 14th December 2012 at http://forums.arm.com

    Hi Daisy,

    There is a new version of the Mali GPU Application Optimization Guide currently being created that will contain a section on using DS-5 Streamline to measure Mali hardware counters. It will include a section explaining the various hardware counters, and how to use them to determine bottlenecks in your application.

    As for the ones you have pointed out, I provide the following explanations:

    Geometry Processor:

    • 1. Active cycles: This is the number of cycles per frame that the vertex processor was active.
    • 2. Active cycles, vertex shader: This is the number of cycles per frame that the vertex shader unit was active. This essentially measures the total cycles spent in your vertex shader, and should be roughly (number of vertices * vertex shader cycle count).
    • 3. Active cycles, PLBU geometry processing: This is the number of cycles per frame that the vertex processor PLBU (Polygon List Builder Unit) was active. This might be high if you are processing too many triangles, in which case you should consider lowering your triangle count.

    Generally, counter 2 is the most useful counter, as it gives you a metric to measure the total impact of vertex processing for a frame. This is directly impacted by the number of vertices you pass and the complexity of the shader.
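
The rule of thumb for counter 2 (number of vertices * vertex shader cycle count) can be sketched as a quick sanity check against a measured value. This is only an illustration: the function name and all numbers below are hypothetical, not values from Streamline or from real hardware.

```python
def estimated_vertex_shader_cycles(vertex_count: int, cycles_per_vertex: float) -> float:
    """Rough per-frame estimate: number of vertices * vertex shader cycle count."""
    return vertex_count * cycles_per_vertex


# Hypothetical numbers for one frame (assumptions, not measurements).
vertex_count = 50_000          # vertices submitted in the frame
cycles_per_vertex = 24         # assumed vertex shader cost per vertex
measured_cycles = 1_500_000    # assumed "Active cycles, vertex shader" reading

estimate = estimated_vertex_shader_cycles(vertex_count, cycles_per_vertex)
ratio = measured_cycles / estimate
print(f"estimated: {estimate:.0f} cycles, measured/estimated: {ratio:.2f}")
```

A measured value far above the estimate would suggest that the vertex count or the per-vertex shader cost is higher than assumed.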

    Fragment Processor:

    • 1. Active clock cycles: The number of clock cycles that were active between the start of rendering and the interrupt raised at the end of rendering. This can be a useful overall counter for the fragment processor, but it is more important to understand where the cycles are being spent, e.g. waiting for the texture cache or rasterizing a fragment that has already been rasterized once (overdraw).
    • 2. Stall cycles PolygonListReader: This is not generally useful in measuring performance.
    • 3. Pipeline bubbles cycle count: Number of unused cycles in the fragment shader while rendering is active. This can occur when using high numbers of very small triangles. In such cases, it is worth using a "Level Of Detail" system whereby you pass geometry that is always appropriate for the distance from the camera at which the object resides. For example, don't pass 100,000 polygon meshes when the object only occupies 100 pixels; it is better to use a lower polygon model or consider a billboard impostor.

    Here, some of the most useful counters are actually:

    "TextureCache Hit/Miss Ratio" which can be calculated by dividing "Texture Cache HitCount" by "Texture Cache Miss Count". A good app will have somewhere in theregion of 5-10:1, where a bad app will have lower than 5:1. In thesesituations, you should consider compessed and/or mip-mapped textures.

    "OverdrawFactor" which can be calculated by: ([Fragment Rasterized Count] * number offragment processors) / (Horizontal Resultion * Vertical Resolution). Typicallya particularly well written application will sit at 2.5 or below, and aparticularly overdraw heavy application will be over 5.

    Please let me know if you have any further questions.

    Chris
  • Note: This was originally posted on 17th December 2012 at http://forums.arm.com

    Hi Chris,

    Thank you for your thorough answer!
    It was exactly what I wanted to know.

    By the way, when will the new version of the Mali GPU Application Optimization Guide be available?
    I am planning to research the Mali GPU using a wide range of H/W counters,
    not only the counters you mentioned but also others.

    If you have any idea of the expected release date, kindly let me know.


    Thank you once again.
    Daisy.
  • Note: This was originally posted on 17th December 2012 at http://forums.arm.com

    Hi Daisy,

    Unfortunately I cannot give you an exact date at present, but I will let them know there is a demand. Feel free to ask any questions in the meantime.

    Thanks,
    Chris
  • Note: This was originally posted on 4th January 2013 at http://forums.arm.com

    Hi daisy5050:
    I'm trying to get some Mali-400 profiling information from Streamline,
    but my platform is Odroid-A and the Mali driver on it is a legacy version.
    I want to port a newer Mali driver to it, but I'm not sure which version of the Mali driver is sufficient.
    Could you give me your Mali driver version number?
    Or did you just download your version from the Linaro website?
    Thank you very much.

    Hi ARM:
    I'm not sure whether daisy5050 can see this message.
    Could you help me forward it? Thank you.

    BR.
    Ching.
  • Note: This was originally posted on 7th January 2013 at http://forums.arm.com


    Hi Ching.


    Unfortunately, I don't know which version of the Mali driver I used.
    Odroid-A is based on the Exynos4210.
    As far as I know, the Android Linux kernel source for the Exynos4210 comes from the AP vendor, such as Samsung,
    and the Mali driver inside the kernel is based on the Linaro one.
    But we don't know whether the driver provider (or AP vendor) changed the Mali driver from the Linaro version or not.
    Note also that there is another Mali driver, the user-space Mali driver, in the Android source.
    It must be used as a pair with the kernel-space Mali driver.

    You can ask ARM for more detailed information.


    Thanks,
    Daisy.