Checking bandwidth boundness on Mali-T720

I'm trying to work out whether our application is bandwidth-bound (I believe it isn't), but I don't have the read/write beat counters suggested in the tutorial. However, I believe the counters I do have already "resolve" the beats to actual byte values ($MaliL2CacheExtReadsExternalReadBytes and $MaliL2CacheExtWritesExternalWriteBytes).

We seem to be using ~740MB/s of bandwidth but I'm not sure which number to benchmark this against. Presumably the 5GB/s mentioned in the above tutorial (which seems to apply to this GPU) but I was wondering if you could shed some light on this.

  • Hi JPJ, 

    For Midgard the capture will automatically convert beats into bytes, so you are correct that this has already been scaled. 

    A bandwidth of 740MB/s should be fine for a mobile system; in general GPU bandwidth measurements under 3GB/s should be fine even on an entry-level device. 

    Another thing you can check is the stall counters, which the Midgard GPUs expose in addition to the bandwidth counters. Stalls as a percentage of GPU Active cycles are a good indicator of whether the memory system is struggling. For normal content I'd expect to see stalls < 5% of the GPU Active cycles, although it is worth noting that stalls can happen for a number of reasons, not just high load (e.g. the GPU is clocked faster than the bus, so the GPU cannot use the bus every clock cycle).

    HTH, 
    Pete
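
The two checks in the reply above can be sketched in a few lines of Python. This is only an illustration, not part of any Arm tool: the function names and sample values are hypothetical, it assumes the read and write byte counters are summed over the same capture window, and it assumes the 3GB/s and 5% figures are used as rough guides rather than hard limits.

```python
def bandwidth_mb_per_s(read_bytes, write_bytes, window_s):
    """External read + write traffic in MB/s (1 MB = 1e6 bytes)."""
    if window_s <= 0:
        raise ValueError("window_s must be positive")
    return (read_bytes + write_bytes) / window_s / 1e6


def stall_percent(stall_cycles, gpu_active_cycles):
    """Stalls as a percentage of GPU Active cycles, or None if the
    GPU Active counter reports no data (zero)."""
    if gpu_active_cycles <= 0:
        return None
    return 100.0 * stall_cycles / gpu_active_cycles


# Guideline figures quoted in the reply above.
BANDWIDTH_GUIDE_MB_S = 3000.0
STALL_GUIDE_PERCENT = 5.0

# Hypothetical sample: byte counters chosen to total 740 MB over 1 s.
bw = bandwidth_mb_per_s(500_000_000, 240_000_000, 1.0)
print(f"bandwidth: {bw:.0f} MB/s, under guide: {bw < BANDWIDTH_GUIDE_MB_S}")

# Hypothetical cycle counts, for illustration only.
pct = stall_percent(2_000_000, 100_000_000)
print(f"stalls: {pct:.1f}% of active, under guide: {pct < STALL_GUIDE_PERCENT}")
```

Returning None when the GPU Active counter is zero keeps a missing counter from looking like a 0% stall rate.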

  • Cheers Pete!
    Sadly, the GPU Active Cycles counter has no data, i.e. it is always 0, on this device (Samsung J330F). I do get some stall-related counters (image below) but I'm not sure if I can use them against anything else. This was measured over 1s btw.

  • In the absence of GPU Active, a proxy stall rate might be (e.g.):

    ($MaliL2CacheExtReadsExternalBusStallsAR / max($MaliJobManagerCyclesJS0Cycles, $MaliJobManagerCyclesJS1Cycles)) * 100

    2.6M stalls over 1 second doesn't look like a problem though. Assuming the GPU runs at 500MHz, that's a stall rate of about 0.5% (2.6M / 500M cycles), so it looks happy.

    Cheers, 
    Pete
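
As a rough sketch of the proxy formula and the arithmetic above: the function below is hypothetical and simply mirrors the expression in the post. The fallback to an assumed clock frequency is an illustration for the case where no job slot cycle data is available; it treats the GPU as busy for the whole window, so if the GPU was idle for part of that time the real stall rate would be higher than the estimate.

```python
def proxy_stall_percent(ar_stalls, js0_cycles=0, js1_cycles=0,
                        assumed_hz=None, window_s=1.0):
    """Approximate stall rate (%) when GPU Active is unavailable.

    Uses max(JS0, JS1) job slot cycles as the denominator; if both are
    zero, falls back to assumed_hz * window_s (an estimate, not a counter).
    """
    active = max(js0_cycles, js1_cycles)
    if active <= 0:
        if assumed_hz is None:
            raise ValueError("no job slot cycles and no assumed clock")
        active = assumed_hz * window_s
    return 100.0 * ar_stalls / active


# Figures from the post: 2.6M AR stalls in 1 s, GPU assumed to run at 500MHz.
rate = proxy_stall_percent(2_600_000, assumed_hz=500_000_000, window_s=1.0)
print(f"proxy stall rate: {rate:.2f}%")
```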

  • Cheers Pete! Sadly, all the JobManager counters are also missing :/ for this particular device, but it sounds like it's unlikely to be a problem.