<?xml version="1.0" encoding="UTF-8" ?>
<?xml-stylesheet type="text/xsl" href="https://community.arm.com/utility/feedstylesheets/rss.xsl" media="screen"?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:slash="http://purl.org/rss/1.0/modules/slash/" xmlns:wfw="http://wellformedweb.org/CommentAPI/"><channel><title>Confusion about the cycles in streamline</title><link>https://community.arm.com/developer/tools-software/graphics/f/discussions/47980/confusion-about-the-cycles-in-streamline</link><description> In Streamline, multiple counters measure cycles, e.g. MaliGPUCyclesGPUActive, MaliGPUCyclesNonFragmentQueueActive, MaliGPUCyclesFragmentQueueActive, and MaliCoreTextureCyclesTexturingActive.
 In my understanding, these cycles describe the time consumption on the</description><dc:language>en-US</dc:language><generator>Telligent Community 10</generator><item><title>RE: Confusion about the cycles in streamline</title><link>https://community.arm.com/thread/168461?ContentTypeID=1</link><pubDate>Thu, 05 Nov 2020 08:16:16 GMT</pubDate><guid isPermaLink="false">dd9e70c8-6d3c-4c71-b136-2456382a7b5c:f9b6278d-98f9-4b3e-b49d-9ee9697f6137</guid><dc:creator>Peter Harris</dc:creator><description>&lt;p&gt;Hi Shawn,&lt;/p&gt;
&lt;p&gt;The GPU has many parallel queues and pipelines. Many of the counters show &amp;quot;something was running&amp;quot; in a particular queue; others show the actual utilization of a particular block in the hardware.&lt;/p&gt;
&lt;p&gt;A queue can contain multiple parallel units, and a parallel unit can be used concurrently by multiple queues, so in general the counters *don&amp;#39;t* sum together in any meaningful way. I&amp;#39;d start by reading through one of our counter guides; the diagrams help explain the hierarchy a little more. For example, for Mali-G77:&lt;/p&gt;
&lt;p&gt;&lt;a href="https://developer.arm.com/ip-products/graphics-and-multimedia/mali-gpus/mali-performance-counters/mali-g77-counters"&gt;https://developer.arm.com/ip-products/graphics-and-multimedia/mali-gpus/mali-performance-counters/mali-g77-counters&lt;/a&gt;&lt;/p&gt;
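&lt;p&gt;To illustrate why queue counters don&amp;#39;t sum, here is a minimal Python sketch. The cycle counts are invented rather than taken from a real capture, the function names are hypothetical, and it assumes the GPU counts as active whenever at least one queue is active.&lt;/p&gt;

```python
# Illustrative only: the cycle counts below are invented, not Streamline output.
# Assumption: the GPU counts as active whenever at least one queue is active,
# so each queue's active cycles fall inside the GPU active cycles.

def utilization(active_cycles, total_cycles):
    """Return the fraction of total_cycles during which a counter was active."""
    return active_cycles / total_cycles

def min_overlap(queue_a_cycles, queue_b_cycles, gpu_active_cycles):
    """Return the fewest cycles in which both queues must have run together."""
    return max(0, queue_a_cycles + queue_b_cycles - gpu_active_cycles)

gpu_active = 1000    # hypothetical MaliGPUCyclesGPUActive
non_fragment = 600   # hypothetical MaliGPUCyclesNonFragmentQueueActive
fragment = 900       # hypothetical MaliGPUCyclesFragmentQueueActive

print("queue sum:", non_fragment + fragment)
print("minimum overlap:", min_overlap(non_fragment, fragment, gpu_active))
print("fragment queue utilization:", utilization(fragment, gpu_active))
```

&lt;p&gt;With these made-up numbers the two queue counters add up to 1500 cycles even though the GPU was only active for 1000, so the queues must have been running concurrently for at least 500 of those cycles.&lt;/p&gt;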
&lt;blockquote&gt;
&lt;p&gt;If my understanding is correct, how is the cost of specific units such as texture units calculated? For example, there are two texture units working at the same time, and each consumes 1 cycle.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Shader core counters are shown &amp;quot;per core&amp;quot;, so they are frequency normalized for the target GPU. You don&amp;#39;t need to worry about the shader core count; you can just compare them with the GPU frequency. Similarly, if a shader core has e.g. two arithmetic pipelines, counters are only shown for pipeline zero. The workload across the parallel units will be ~equal, and showing only unit zero also means that the data is implicitly normalized and can be compared directly with the frequency.&lt;br /&gt;&lt;br /&gt;HTH,&lt;br /&gt;Pete&lt;/p&gt;&lt;div style="clear:both;"&gt;&lt;/div&gt;</description></item></channel></rss>