In December, we launched an early-access version of our new frame-based profiling tool, Frame Advisor, part of the Arm Performance Studio suite of tools, available for free. We asked you, the developer community, to help us by evaluating this tool and sharing your ideas with us. Since then, we have been listening closely to the feedback and continuing to develop Frame Advisor, introducing new features in regular releases every 8 weeks. You can read about the most recent changes here.
In this blog post, we cover our vision for Frame Advisor over the coming year. We detail a few ideas we have on the roadmap to enhance the kinds of analyses you can perform. As always, we welcome your input, and you can influence our roadmap by giving us feedback through our feedback form.
Today, Frame Advisor lets you capture 1 frame burst of up to 3 consecutive frames. You can then analyze these frames using 3 methods:
We prioritized these features because we wanted Frame Advisor to address our biggest profiling challenge: meaningfully presenting per-draw performance data for tile-based GPUs. This is difficult, or at least different from profiling desktop and console GPUs, because tile-based GPUs process all the geometry for a render pass first, and then perform fragment shading tile by tile. Work related to a single draw call is therefore split between these 2 phases, and its fragment work is interleaved with that of other draw calls depending on where the geometry ends up on screen. So, it is not useful to look at how long a draw call took to complete, because the GPU does not process draws sequentially.
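To make the interleaving concrete, here is a minimal sketch of the two phases. It is a toy model, not how Mali hardware or Frame Advisor is implemented: the 16-pixel tile size, the bounding-box binning, and the names `bin_draws` and `fragment_order` are all assumptions made for illustration.

```python
from collections import defaultdict

TILE_SIZE = 16  # assumed tile size in pixels, for illustration only


def bin_draws(draws, tile_size=TILE_SIZE):
    """Geometry phase: record which draws touch each tile.

    Each draw is (name, x0, y0, x1, y1), a screen-space bounding box with
    exclusive upper bounds. Real hardware bins primitives, not whole draws.
    """
    tiles = defaultdict(list)
    for name, x0, y0, x1, y1 in draws:
        for ty in range(y0 // tile_size, (y1 - 1) // tile_size + 1):
            for tx in range(x0 // tile_size, (x1 - 1) // tile_size + 1):
                tiles[(tx, ty)].append(name)
    return tiles


def fragment_order(tiles):
    """Fragment phase: shade tile by tile, visiting each tile's draws in order."""
    return [(tile, name) for tile in sorted(tiles) for name in tiles[tile]]


draws = [
    ("background", 0, 0, 32, 16),  # covers two tiles
    ("character", 8, 0, 24, 16),   # overlaps both of those tiles
]
order = fragment_order(bin_draws(draws))
for tile, name in order:
    print(tile, name)
```

In this toy model, the fragment work for "background" and "character" alternates across tiles, so neither draw has a single contiguous span of GPU time that could be measured.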
Instead, Frame Advisor proposes an alternative methodology. The tool captures the API calls made by the application and uses our knowledge of how the GPU works to evaluate the efficiency of the workloads sent to the GPU. Using this analysis, the tool can help you to identify specific inefficiencies in the application, such as inefficient use of render passes, inefficient draw culling, or poor geometry index and vertex buffer encoding.
Frame Advisor is designed to help you easily identify inefficiencies in a single frame of content, so that you can optimize using best practices for tile-based GPUs. You can see a demo of Frame Advisor in this video, or read this learning path tutorial to understand how to use the tool to analyze game content.
The 1.1 release, which is part of Arm Performance Studio 2024.0, integrates improvements based on the first round of feedback from developers.
The main highlight of this release is the simplified Vertex Shader Efficiency and Vertex Memory Efficiency metrics, which consolidate a set of underlying geometry efficiency metrics into 2 overarching metrics. This simplifies the review workflow to just the absolute size of your models and these 2 efficiency metrics. You only need to dive into the detailed metrics if you see a problem, and the detailed metrics each signpost a specific problem to solve.
Look out for our next release, which includes the first major new view, the Mesh view. This view provides an interactive visualization of the selected object’s mesh, which is immediately useful for identifying which object the current draw call is rendering. When used alongside the Framebuffers view, you will be able to tell immediately if the object is too complex for its size and position on-screen.
The near-term roadmap includes a set of updates for the Framebuffers view to support visualization of HDR floating-point image formats and to provide more user control over depth visualization. This will include support for the additional formats, and interactive control over the range of values that is mapped to the visible on-screen image for presentation.
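As a rough illustration of what a range-mapping control does, the sketch below linearly maps values from a user-chosen range into the displayable 0 to 1 range, clamping anything outside it. The function name `map_to_display` and the linear mapping are assumptions for illustration, and do not describe how Frame Advisor will implement the feature.

```python
def map_to_display(value, range_min, range_max):
    """Linearly map value from [range_min, range_max] to [0.0, 1.0], clamping."""
    if range_max <= range_min:
        raise ValueError("range_max must be greater than range_min")
    t = (value - range_min) / (range_max - range_min)
    return min(1.0, max(0.0, t))


# Hypothetical HDR pixel values, well outside the 0..1 display range.
hdr_pixels = [0.0, 0.5, 2.0, 8.0, 16.0]

# Mapping the full range keeps highlights visible but crushes dark detail.
full = [map_to_display(p, 0.0, 16.0) for p in hdr_pixels]

# Narrowing the range reveals dark detail and saturates the highlights.
narrow = [map_to_display(p, 0.0, 2.0) for p in hdr_pixels]

print(full)
print(narrow)
```

The same idea applies to depth values, where the interesting detail often occupies a narrow slice of the stored range.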
The current Render Graph only analyzes render passes, but applications are making increasing use of compute and transfer workloads in their rendering. Later in the year we will look at adding more workloads into the Render Graph view, ensuring that the whole frame can be represented by the workload analysis.
The next major new analysis items on our short-term development roadmap are a set of new or updated views related to shader analysis.
Shaders are critically important for graphics performance and, if you are already familiar with other Arm profiling tools, you may have tried Mali Offline Compiler. It is a command-line tool that performs static analysis on a single shader program, giving feedback on key performance indicators.
Example extract from a Mali Offline Compiler report:
    Main shader
    ===========

    Work registers: 64 (100% used at 50% occupancy)
    Uniform registers: 10 (7.5% used)
    Stack spilling: 32 bytes
    16-bit arithmetic: 75%

                                    A      LS       V       T    Bound
    Total instruction cycles:    4.70   64.60    0.03    0.00       LS
    Shortest path cycles:        0.47   19.00    0.03    0.00       LS
    Longest path cycles:          N/A     N/A     N/A     N/A      N/A

    A = Arithmetic, LS = Load/Store, V = Varying, T = Texture
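In this extract, the Bound column names the unit with the highest cycle count in each row, which is the likely bottleneck for that path. The short sketch below reproduces that reading from the numbers in the report. The helper name `bound_unit` is hypothetical and is not part of Mali Offline Compiler.

```python
def bound_unit(cycles):
    """Return the name of the unit with the highest cycle count."""
    return max(cycles, key=cycles.get)


# Cycle counts copied from the report extract above.
total = {"A": 4.70, "LS": 64.60, "V": 0.03, "T": 0.00}
shortest = {"A": 0.47, "LS": 19.00, "V": 0.03, "T": 0.00}

print(bound_unit(total))
print(bound_unit(shortest))
```

Both rows are dominated by the load/store unit, which is consistent with the stack spilling reported for this shader.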
We plan to integrate support for Mali Offline Compiler metrics into Frame Advisor. You will be able to view shader source code and metrics for the shader programs used for each draw call, directly from the graphical user interface.
In addition, we will use the vertex attribute stream feedback from the compiler to add awareness of the split position/non-position vertex shaders that are common in tile-based rendering. This will result in an even more useful bandwidth analysis for the Vertex Memory Efficiency metric.
We also plan to reintroduce the Shader Map capture mode, which exists in the older Graphics Analyzer tool. This allows us to color pixels in the Framebuffers view based on the shader program that is used.
If you have already tried Frame Advisor, you will know that the Analysis screen tries to present a lot of data, and this is only going to grow as we add more types of analyses. In the current tool the layout of the Analysis screen is configurable, using a docking system to allow you to drag, drop, and resize views where you want them. Where the views are best placed depends on what kind of analysis you are doing, and of course, personal preference. There is no one-size-fits-all approach.
To aid usability, we plan to support pre-defined Analysis screen layouts for different types of analysis and allow you to create custom layouts that you can save and reuse.
Would it not be great if Frame Advisor could tell you where the problems are? Even better, if you could set your own budget for acceptable cost thresholds, and the tool automatically alerts you when that budget is broken?
As the name implies, the end-goal of Frame Advisor is that the tool can provide specific advice and feedback, directly highlighting efficiency issues, best practice violations, and user-specified budget violations. This will make it easier for you to consume the output of the tool, reducing the time needed to optimize your application.
The current functionality in the tool performs a lot of underlying analysis, and the visualizations and metrics are designed to make it easy to identify issues. However, the review process is still relatively manual, and we believe we can do more to highlight the problems and signpost the solutions.
Frame Advisor is a new tool, and very much still in development. This blog is not exhaustive but provides a very quick overview of the features we are planning over the next 12 months. Beyond the ideas presented here, we have ideas for texture analysis, render state analysis, API linting, and user content budget linting, to name a few. It is fair to say that we are not short on ideas.
This comes with the caveat that, like all roadmaps, this may change as we get more feedback from developers. Help us prioritize: let us know what problems you are hitting and where tools can help you solve them efficiently.
If you are at GDC this week, please visit us at the Arm booth (#S1357) where we will be running lightning talks and demos on all the Arm Performance Studio tools. See Frame Advisor in action and tell us what you think. Please bring your questions, suggestions and feedback on the tool. You can try it out today by downloading Arm Performance Studio.