Checking bandwidth boundness on Mali-T720

I'm trying to determine whether our application is bandwidth-bound (I believe it isn't), but I don't have the read/write beat counters suggested in the tutorial. However, I believe the counters I do have already "resolve" the beats to actual byte values ($MaliL2CacheExtReadsExternalReadBytes and $MaliL2CacheExtWritesExternalWriteBytes).

We seem to be using ~740MB/s of bandwidth, but I'm not sure which number to benchmark this against. Presumably it's the 5GB/s mentioned in the above tutorial (which seems to apply to this GPU), but I was wondering if you could shed some light on this.

  • Hi JPJ, 

    For Midgard the capture will automatically convert beats into bytes, so you are correct that this has already been scaled.

    A bandwidth of 740MB/s should be fine for a mobile system; in general GPU bandwidth measurements under 3GB/s should be fine even on an entry-level device. 

    The other thing you can check is the stall counters, which the Midgard GPUs expose in addition to the bandwidth counters. Stalls as a percentage of GPU Active cycles are a good indicator of whether the memory system is struggling. For normal content I'd expect stalls to be < 5% of the GPU Active cycles, although it is worth noting that stalls can happen for a number of reasons, not just high load (e.g. the GPU is clocked faster than the bus, so the GPU cannot use the bus every clock cycle).

    HTH, 
    Pete
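
For anyone wanting to run the two checks described in the reply on exported counter values, here is a minimal Python sketch. The function names and the sample numbers are hypothetical and not part of any Arm tool; the 3GB/s and 5% limits are only the rules of thumb quoted above, not hard thresholds. It assumes you already have the external read/write byte counts, the stall and GPU Active cycle counts, and the capture duration.

```python
def bandwidth_mb_per_s(read_bytes, write_bytes, seconds):
    """External memory bandwidth in MB/s (1 MB = 1e6 bytes)."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return (read_bytes + write_bytes) / seconds / 1e6


def stall_percent(stall_cycles, gpu_active_cycles):
    """Stall cycles as a percentage of GPU Active cycles."""
    if gpu_active_cycles <= 0:
        raise ValueError("gpu_active_cycles must be positive")
    return 100.0 * stall_cycles / gpu_active_cycles


def worth_investigating(bw_mb_s, stall_pct,
                        bw_limit_mb_s=3000.0, stall_limit_pct=5.0):
    """True if either rule-of-thumb limit from the reply is reached."""
    return bw_mb_s >= bw_limit_mb_s or stall_pct >= stall_limit_pct


if __name__ == "__main__":
    # Illustrative values only: a made-up read/write split that adds up to
    # the ~740MB/s from the question, and a made-up stall count.
    bw = bandwidth_mb_per_s(500_000_000, 240_000_000, 1.0)
    stalls = stall_percent(2_000_000, 100_000_000)
    print(f"bandwidth: {bw:.0f} MB/s, stalls: {stalls:.1f}% of active cycles")
    print("worth investigating:", worth_investigating(bw, stalls))
```

Staying under both limits does not prove the content is free of memory pressure, and, as noted above, stalls can also come from causes other than load, so treat the result as a first filter rather than a verdict.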
