About 'Mali External Bus Stalls' chart in Streamline

Hi guys,

Could I get more details about the 'read stall cycles' and 'write stall cycles' series in the Mali External Bus Stalls chart of a Streamline capture?

In my games, write stall cycles are always significantly higher than read stall cycles.

Can someone tell me more about the Mali External Bus Stalls chart? Even if it's not a profiling issue, I'm curious :)

Thanks!

  • I'm using a Mali-G78 based device right now, and the counters described in the Streamline online documentation seem to be slightly different.

    Check you have the Mali-G78 template applied. The template menu is in the top right of the Timeline view.

    The default view is just alphabetical and doesn't include any derivations. We have an item on the backlog to apply the template automatically, but currently it's a manual step, sorry. The templated counters should match the online documentation - if they don't that's a bug so please let us know =)

    I would like to know how to check the load on the GPU when it reads a large amount of texture data from memory.

    In Streamline, with the Mali-G78 template applied, you can see texture bandwidth from L2 cache in the "Mali Core L2 Memory Reads" chart, and external memory in the "Mali Core External Memory Reads" chart. These specific counters are *per core* numbers, so scale by $MaliConstantsShaderCoreCount if you want a GPU-wide total.
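As a rough illustration of the per-core scaling mentioned above, here is a minimal Python sketch. The function name and the sample values are hypothetical; in Streamline the core count comes from $MaliConstantsShaderCoreCount, and the real per-core value is whatever the chart reports for your capture.

```python
def gpu_wide_total(per_core_value: float, shader_core_count: int) -> float:
    """Scale a per-core counter value to an approximate GPU-wide total."""
    if shader_core_count <= 0:
        raise ValueError("shader_core_count must be positive")
    return per_core_value * shader_core_count


# Hypothetical sample: bytes read per core in one sample window, and an
# assumed core count (the real count depends on the device).
per_core_read_bytes = 1.5e6
assumed_core_count = 24

total_bytes = gpu_wide_total(per_core_read_bytes, assumed_core_count)
print(f"GPU-wide reads: {total_bytes / 1e6:.1f} MB")
```

This treats the per-core number as an average across cores, which is the assumption behind a simple multiply.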

    The overdraw per pixel chart in Performance Advisor looks a bit odd. My app is a game that renders a lot of backgrounds and objects, so it is strange that most sections show a value below 1. Isn't it normal for overdraw to always exceed 1?

    Yes, this does seem odd, but without seeing the data it's hard to be sure what's happening. If you're able to export a Streamline capture and share it, feel free to get in touch at mobilestudio@arm.com and I'm happy to take a look. 

    Is it possible to change the threshold at which a frame drop occurs, or to reduce its width? For example, the frame drop period mentioned above is narrow, or the depth of the dropped frame is relatively shallow.

    I'm not entirely sure what the question is here, sorry. 

    You'll get a dropped frame if a single frame (when double buffered) or two consecutive frames (when triple buffered) take longer than the frame time of your target refresh rate. The only way to avoid that is to optimize and optimize some more. There will always be some variation between devices due to thermals and power management, which are hard to control, so generally do the best you can.
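The buffering rule above can be sketched as a simple counter. This is a simplified model for illustration only, not how any driver or compositor actually schedules frames; the function name, the frame times, and the 16.7 ms target are all assumptions.

```python
def count_drops(frame_times_ms, target_ms, consecutive_needed):
    """Count drop events under a simplified buffering model.

    consecutive_needed = 1 models double buffering (one slow frame drops),
    consecutive_needed = 2 models triple buffering (two slow frames in a row).
    """
    drops = 0
    slow_run = 0
    for frame_time in frame_times_ms:
        if frame_time > target_ms:
            slow_run += 1
            if slow_run >= consecutive_needed:
                drops += 1
        else:
            slow_run = 0  # a frame that hits the target resets the run
    return drops


# Hypothetical frame times in milliseconds against a 60 Hz target.
frames = [16.0, 20.0, 16.0, 20.0, 21.0, 16.0]
print("double buffered:", count_drops(frames, 16.7, 1))
print("triple buffered:", count_drops(frames, 16.7, 2))
```

In this model the isolated slow frame only counts in the double-buffered case, because the extra buffer absorbs a single late frame.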

    For reporting purposes (e.g. slow frame capture in Performance Advisor) you can set the threshold below which slow frames are captured. Note that Performance Advisor uses a sliding window to average FPS over multiple frames, as triple buffering makes a mess of CPU-side frame timing, so very short transient FPS drops may not get detected as slow frames.
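To show why a sliding window can hide a very short drop, here is a minimal Python sketch. It is not Performance Advisor's actual implementation; the window length of 6 frames, the 45 FPS threshold, and the function name are assumptions chosen for illustration.

```python
from collections import deque


def slow_frame_indices(frame_times_ms, window, fps_threshold):
    """Return frame indices where the windowed average FPS is below the threshold."""
    recent = deque(maxlen=window)  # keeps only the last `window` frame times
    slow = []
    for index, frame_time in enumerate(frame_times_ms):
        recent.append(frame_time)
        if len(recent) < window:
            continue  # wait until the window is full
        average_fps = 1000.0 * len(recent) / sum(recent)
        if average_fps < fps_threshold:
            slow.append(index)
    return slow


# One 40 ms spike in an otherwise steady 16.7 ms stream (hypothetical data).
transient = [16.7] * 10 + [40.0] + [16.7] * 10
# The same spike sustained for six frames.
sustained = [16.7] * 10 + [40.0] * 6

print("transient:", slow_frame_indices(transient, window=6, fps_threshold=45.0))
print("sustained:", slow_frame_indices(sustained, window=6, fps_threshold=45.0))
```

With these numbers the single spike is averaged away by its neighbours and never crosses the threshold, while the sustained slowdown is flagged from its second slow frame onward.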

    HTH,
    Pete
