Mali GPU performance counters query

Dear Team

I came across the HWCPipe library from ARM for accessing Mali GPU performance counters. If I sample the performance counters every millisecond from a separate process that uses only the HWCPipe APIs, independent of any graphics application, do the values show the global, system-wide count for each counter? When multiple graphics processes are running, how can I tell which process a count belongs to at a specific time?

I know that if I use the HWCPipe APIs inside a graphics process, I can get per-process counters.

I would also like to understand how the gator daemon dumps the counters per process. Any pointers would be very helpful.

Thank you.

Best Regards,

Vikash

0 Vikash over 4 years ago in reply to Peter Harris

Thank you for your quick reply. Does this mean the counter reflects global state and may include usage by multiple processes? Is there any way to get per-process performance counters using HWCPipe?

0 Peter Harris over 4 years ago in reply to Vikash

Yes, it's global. There is no means of filtering the counters to a single process.

0 Vikash over 4 years ago in reply to Peter Harris

So my only option is to use the gator daemon to get per-process counter information and ARM DS to visualize it. Could you give some pointers on how the gator daemon does this?

0 Peter Harris over 4 years ago in reply to Vikash

It doesn't - the counter data is the same. FTrace events (scheduling) do contain process information and can be used to modulate the counter data.
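
One way scheduling events could be used to modulate the global counter data is to split each sample's counter delta across processes in proportion to how long each process had work on the GPU during that sample window. The following is a minimal Python sketch of that idea, not a description of how the gator daemon actually does it. `CounterSample`, `JobInterval`, and `attribute` are hypothetical names, and the job intervals are assumed to have been parsed from trace events already.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class CounterSample:
    start_us: int  # start of the sample window (microseconds)
    end_us: int    # end of the sample window (microseconds)
    value: int     # global counter delta over the window


@dataclass
class JobInterval:
    start_us: int  # when the job started on the GPU
    end_us: int    # when the job finished on the GPU
    pid: int       # owning process, taken from the trace event (assumed)


def overlap(a_start, a_end, b_start, b_end):
    """Length of the intersection of two time intervals, or 0 if disjoint."""
    return max(0, min(a_end, b_end) - max(a_start, b_start))


def attribute(samples, jobs):
    """Split each global counter delta across PIDs by GPU-busy time."""
    totals = defaultdict(float)
    for s in samples:
        busy = defaultdict(int)
        for j in jobs:
            busy[j.pid] += overlap(s.start_us, s.end_us, j.start_us, j.end_us)
        total_busy = sum(busy.values())
        if total_busy == 0:
            # No job was scheduled in this window: leave it unattributed.
            totals[None] += s.value
            continue
        for pid, t in busy.items():
            if t:
                totals[pid] += s.value * t / total_busy
    return dict(totals)


samples = [CounterSample(0, 1000, 100), CounterSample(1000, 2000, 50)]
jobs = [JobInterval(0, 500, 11), JobInterval(500, 1500, 22)]
per_process = attribute(samples, jobs)
```

Because the hardware counters are global, this is only an estimate: when jobs from several processes share a sample window, the proportional split is a heuristic rather than a measurement.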

0 Vikash over 4 years ago in reply to Peter Harris

Thank you, Peter. Where can I find which ftrace events the Mali GPU driver provides? Are there events for GPU task scheduling and context switching?

+1 Peter Harris over 4 years ago in reply to Vikash

Source code for Mali kernel drivers can be found here:

* https://developer.arm.com/tools-and-software/graphics-and-gaming/mali-drivers

HTH,
Pete

0 Vikash over 4 years ago in reply to Peter Harris
Hi Pete,

Thank you for the pointers. I have one more question about the derived counter below, which is documented for the Bifrost family.

What does SUM refer to in the formula? Does it mean we have to read the counter over a certain duration, do the math, and report the bytes? Or how does it work?

5.3.2 L2.EXTERNAL_READ_BYTES (Derived)

Availability: All

With knowledge of the bus width used in the GPU the beat counter can be converted into a raw bandwidth counter.

L2.EXTERNAL_READ_BYTES = SUM(L2.EXTERNAL_READ_BEATS * L2.AXI_WIDTH_BYTES)

Note: Most implementations of a Bifrost GPU use a 128-bit (16 byte) AXI interface, but a 64-bit (8 byte) interface is also possible to reduce the area used by a design. This information can be obtained from your chipset manufacturer.

Thank you.

Best Regards,

Vikash

+1 Peter Harris over 4 years ago in reply to Vikash

You may have multiple parallel L2 cache slices, depending on the number of shader cores in the GPU design. Each slice reports counters separately, so if you have multiple slices you need to add them together to get the total bandwidth for a given sample period.
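
As a worked illustration of that summation, the sketch below adds up the read beats reported by each L2 slice for one sample period and converts the total to bytes. The function names and the example beat counts are made up for illustration. The 16-byte default is an assumption based on the common 128-bit AXI width mentioned in the counter documentation, and should be confirmed with your chipset manufacturer.

```python
def external_read_bytes(beats_per_slice, axi_width_bytes=16):
    """Total external read bytes for one sample period.

    beats_per_slice: one L2.EXTERNAL_READ_BEATS value per L2 cache slice.
    axi_width_bytes: bus width in bytes (16 for 128-bit AXI, 8 for 64-bit).
    """
    return sum(beats * axi_width_bytes for beats in beats_per_slice)


def read_bandwidth_bytes_per_s(total_bytes, sample_period_s):
    """Convert the byte count for one sample period into bytes per second."""
    return total_bytes / sample_period_s


# Hypothetical sample: two L2 slices over a 1 ms sample period.
total = external_read_bytes([1000, 1500])  # (1000 + 1500) * 16 = 40000 bytes
bandwidth = read_bandwidth_bytes_per_s(total, 0.001)
```

So SUM is taken across the slices within a single sample period, not across time.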

0 Vikash over 3 years ago in reply to Peter Harris

Hi Peter Harris,

Could you please provide more details here? What does this mean and how does it work? I am checking whether such a thing is possible using HWCPipe.

Which event is responsible for scheduling? I can see that the following events are available:

drm:drm_vblank_event_delivered
drm:drm_vblank_event_queued
drm:drm_vblank_event
mali:mali_jit_trim
mali:mali_jit_trim_from_region
mali:mali_jit_report_gpu_mem
mali:sysgraph_gpu
mali:sysgraph
mali:mali_jit_report_pressure
mali:mali_jit_report
mali:mali_jit_free
mali:mali_jit_alloc
mali:mali_mmu_page_fault_grow
mali:mali_total_alloc_pages_change
mali:mali_page_fault_insert_pages
mali:mali_pm_status
mali:mali_job_slots_event
power:gpu_frequency
gpu_mem:gpu_mem_total

Best Regards,

Vikash

0 Peter Harris over 3 years ago in reply to Vikash

I'd expect it to be mali:mali_job_slots_event, but not 100% sure.
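
To check what that event actually reports, tracepoints can usually be switched on through tracefs. The sketch below is an illustration only. It assumes tracefs is mounted at /sys/kernel/tracing (older kernels expose it at /sys/kernel/debug/tracing), that the caller has root privileges, and that the event exists under the name listed above. The helper names are hypothetical.

```python
from pathlib import Path

DEFAULT_TRACEFS = "/sys/kernel/tracing"  # assumed mount point


def event_enable_path(event, tracefs_root=DEFAULT_TRACEFS):
    """Map 'system:name' to its tracefs enable file."""
    system, name = event.split(":", 1)
    return Path(tracefs_root) / "events" / system / name / "enable"


def set_event_enabled(event, enabled, tracefs_root=DEFAULT_TRACEFS):
    """Write 1 or 0 to the event's enable file (needs root on a real device)."""
    event_enable_path(event, tracefs_root).write_text("1" if enabled else "0")
```

Calling `set_event_enabled("mali:mali_job_slots_event", True)` and then reading the `trace_pipe` file under the same tracefs root should show whether the event carries the process information needed.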