Hi,
It's slightly unclear to me what the reported L/S cycles refer to. Since malioc does not take memory latency etc. into account, do those cycles relate only to the number of instructions issued to fetch attribute data and store the pre-interpolated varying results?
e.g.
                            A       LS      T       Bound
Total instruction cycles:   20.60   35.00   0.00    LS
Shortest path cycles:       16.60   29.00   0.00    LS
Longest path cycles:        N/A     N/A     N/A     N/A
Cheers
"Architectural throughput" is just the processing cost of "doing" the instruction. Most of the time the GPU can hide misses and fetch latency - we have other things to run in parallel - so that's all ignored for the purposes of this metric.
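To illustrate how the per-unit numbers are read, here is a minimal sketch using the figures quoted in the question. It assumes the "Bound" column simply names the unit with the highest cycle count; the function name `bound_unit` is made up for this example and is not part of malioc.

```python
def bound_unit(cycles):
    """Return the name of the unit with the highest cycle count.

    Assumption: this mirrors how the "Bound" column is derived,
    i.e. the busiest unit limits throughput when latency is ignored.
    """
    return max(cycles, key=cycles.get)


# Per-unit cycle counts copied from the report in the question.
total = {"A": 20.60, "LS": 35.00, "T": 0.00}
shortest = {"A": 16.60, "LS": 29.00, "T": 0.00}

print(bound_unit(total))     # LS
print(bound_unit(shortest))  # LS
```

Both rows come out load/store bound, which matches the "LS" shown in the report.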
For the interpolator costing, the cycle cost here is per fragment, so primitive size doesn't matter for this metric (but it would for determining total draw call cost - you need to scale the per-fragment cost by your screen coverage).
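As a rough illustration of that scaling, here is a minimal sketch. The resolution, coverage fraction, and per-fragment cycle count below are made-up example values, and `estimate_draw_cycles` is a hypothetical helper, not something malioc provides. It also ignores overdraw, early-ZS kills, and latency, so treat the result as a ballpark only.

```python
def estimate_draw_cycles(cycles_per_fragment, covered_fragments):
    """Scale a per-fragment cycle cost by the number of fragments shaded."""
    return cycles_per_fragment * covered_fragments


# Hypothetical inputs: a 1920x1080 target with the draw covering 25% of it.
screen_pixels = 1920 * 1080
covered = int(screen_pixels * 0.25)

# Hypothetical per-fragment cost of 2.0 cycles on the bounding unit.
total_cycles = estimate_draw_cycles(2.0, covered)
print(total_cycles)  # 1036800.0
```

The same per-fragment number therefore gives very different draw call costs depending on how much of the screen the primitives end up covering.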
HTH, Pete
Thanks for the clarification, Pete! So, for the Midgard case, this metric combines vertex and fragment stage costs: a "fixed" cost for the 3 vertices (in the case of a triangle) and a variable, coverage-dependent cost for the fragment side, correct?