
[Performance Counter Question] What does EXEC_STARVE_ARITH mean?

Hi, support

We're profiling our application and found that EXEC_STARVE_ARITH is high (30% of EE core utilization).

The documented meaning is "The number of cycles where the processing unit is starved of work."

Does this mean that the EE core has no work to do 30% of the time?

If so, that seems strange, because there should be many warps running in parallel to hide this.

Thanks for your help.

  • This counter can be hard to interpret - which GPU are you using?

    Possible causes for a high value:

    • The content is bottlenecked on another unit (e.g. load/store, varying interpolation, texturing), so there simply isn't enough arithmetic workload to keep the arithmetic pipeline busy every cycle. 
    • The content is stalling in another unit because of, for example, descriptor or data cache misses. GPUs are good at hiding misses, but if many of them occur close together it may not be possible to hide the memory fetch latency completely.
    • The content is stalling on the instruction cache (check the EXEC_ICACHE_MISS counter).

    Kind regards, 

    Pete
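
As a rough illustration of the triage described in the reply above, the sketch below turns raw counter values into a starvation percentage and a short list of things to check next. Only EXEC_STARVE_ARITH and EXEC_ICACHE_MISS are counter names taken from this thread; the key `core_active_cycles`, the per-unit keys in `OTHER_UNITS`, and the 20% threshold are hypothetical placeholders, so substitute whatever your profiling tool actually exports for your GPU.

```python
def percent(part, whole):
    """Return part as a percentage of whole (0.0 if whole is zero)."""
    return 100.0 * part / whole if whole else 0.0


# Hypothetical per-unit busy-cycle keys; substitute the counters your tool exposes.
OTHER_UNITS = ("load_store_cycles", "varying_cycles", "texture_cycles")


def triage_arith_starvation(sample, threshold_pct=20.0):
    """Return (starve_pct, hints) for one sample of raw counter values.

    Only EXEC_STARVE_ARITH and EXEC_ICACHE_MISS are names from this thread.
    "core_active_cycles", OTHER_UNITS and threshold_pct are assumptions.
    """
    active = sample["core_active_cycles"]
    starve_pct = percent(sample["EXEC_STARVE_ARITH"], active)
    hints = []
    if starve_pct < threshold_pct:
        return starve_pct, hints  # starvation is low; nothing to chase

    # Another unit may be the real bottleneck: report the busiest one.
    busy = {unit: percent(sample.get(unit, 0), active) for unit in OTHER_UNITS}
    busiest = max(busy, key=busy.get)
    hints.append(
        f"busiest other unit: {busiest} ({busy[busiest]:.1f}% of active cycles)"
    )

    # Instruction cache misses can also stall the arithmetic pipeline.
    if sample.get("EXEC_ICACHE_MISS", 0) > 0:
        hints.append("instruction cache misses present: inspect EXEC_ICACHE_MISS")
    return starve_pct, hints


# Made-up numbers, chosen only to mirror the 30% figure in the question.
sample = {
    "core_active_cycles": 1000,
    "EXEC_STARVE_ARITH": 300,
    "EXEC_ICACHE_MISS": 0,
    "load_store_cycles": 900,
    "varying_cycles": 100,
    "texture_cycles": 200,
}
starve_pct, hints = triage_arith_starvation(sample)
print(f"{starve_pct:.1f}% of active cycles starved")
for hint in hints:
    print(" -", hint)
```

The sketch covers only the first and third causes in the list. Stalls from descriptor or data cache misses would need the relevant cache counters for your GPU, which are not named in this thread.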
