There are three primary technologies provided by ARM DS-5 that all Fast Model users should be aware of to get the maximum benefit from the combination of DS-5 and Fast Models.
Fast Model users are aware that DS-5 can be used for debugging software, but performance analysis and trace are less familiar topics for those building and using custom Fast Model systems.
Previously, I covered how to connect DS-5 to a custom Fast Model system for software debugging and how to use Streamline with Fast Models. This article covers how to use DS-5 software trace when connected to Fast Model targets.
Understanding all three technologies will give the best user experience when developing software on Fast Models and will also carry over to using DS-5 on hardware targets.
Trace is about collecting information about software execution. Instruction trace collects the instructions which were executed by the target CPU, and data trace records memory accesses and register values to help fill out the picture of what happened when instructions were executed. Most information about DS-5 trace assumes a connection to a physical board and a probe like DSTREAM or DSTREAM-ST. A typical setup can collect trace data non-intrusively from a running target using resources built into the hardware such as an ETM (embedded trace macrocell) or PTM (program trace macrocell). Generated trace data is buffered and streamed out via the JTAG connection.
Of course, much of this does not apply to models because they do not model the CoreSight hardware, but it helps to understand the concept of the trace buffer. In hardware, there is a limit on the size of the buffer which collects trace, and the buffer can be “uploaded” using different strategies and viewed in DS-5. The implementation of trace in an SoC is a fairly complex task as each SoC determines the best way to implement trace from a variety of building blocks. The graphic below shows a typical trace implementation.
Much of the debug and trace services layer (DTSL) present in DS-5 can also be used with Fast Models. Software trace was explained for fixed virtual platforms some time ago, but some things have changed.
To demonstrate the process, the same quad-core Cortex-R8 system from the debugging article will be used. It runs bare metal software which has thread support to schedule the running tasks across the four cores.
The diagram from the Fast Models System Canvas is shown below. In addition to the CPU there is memory and a PL011 UART attached to it for input and output.
No special instrumentation is needed to enable trace with models, but some understanding of how it works is useful. DS-5 trace uses a Fast Model plugin to gather trace information. The plugin is provided with DS-5 and located at: [DS-5 install dir]/sw/models/bin/MTS.so
For trace to work, this plugin must be loaded with the --plugin option or via the scx_load_plugin() function for Fast Models. Recall from the article on connecting DS-5 to Fast Models that there are two choices: connect to a running simulation, known as Browse, or have DS-5 start the simulation, known as Launch. When DS-5 launches the simulation it automatically appends the arguments to load the model trace source plugin, MTS.so. If the simulation is started manually, using the Browse method, the command line option must be added or scx_load_plugin() must be called from the source code of the system.
--plugin $DS5_HOME/sw/models/bin/MTS.so
If the plugin is not loaded, an error dialog opens when DS-5 tries to connect to the running simulation, indicating that trace capture could not be started.
In summary, trace can be collected as long as the plugin is loaded.
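For the Browse method, the plugin argument can be assembled in a small launch script. The following is a minimal sketch, not part of DS-5: the model executable name `./my_model` is a hypothetical placeholder, and `DS5_HOME` is assumed to point at the DS-5 install directory. Only the `--plugin` option and the plugin path come from the text above.

```python
import os


def build_model_command(model_exe, ds5_home, extra_args=()):
    """Return the argument list for starting a model with MTS.so loaded.

    model_exe and extra_args are hypothetical placeholders; only the
    --plugin option and the plugin location come from the article.
    """
    plugin = os.path.join(ds5_home, "sw", "models", "bin", "MTS.so")
    return [model_exe, "--plugin", plugin, *extra_args]


if __name__ == "__main__":
    # DS5_HOME is assumed to be set; the fallback path is a placeholder.
    ds5_home = os.environ.get("DS5_HOME", "/path/to/DS-5")
    print(" ".join(build_model_command("./my_model", ds5_home)))
```

The returned list can be passed to `subprocess.Popen` to start the simulation before connecting from DS-5.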
The steps for using software trace with Fast Models are not difficult. Currently, only instruction trace is supported. There is no support for data trace. To enable trace, go to the DS-5 Debug Configuration, which was explained in the debugging article. There is an Edit button at the bottom next to the DTSL Options. Click it to get to the trace options.
Set the Trace capture method to Fast Models Trace and set the other options as needed. Since this is a model, I recommend setting the trace capture buffer to the maximum size of 128 MB.
The best way to use trace is to set breakpoints and then look back over the previous buffer to see what happened. The most recent instructions are at the bottom of the trace window and the oldest are at the top.
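The look-back idea can be pictured with a small fixed-size buffer. This is a conceptual sketch only, not how DS-5 or MTS.so store trace: the class, function, and addresses are all hypothetical. It shows that once the buffer is full the oldest records are dropped, so stopping at a breakpoint leaves the most recent history, oldest first and newest last.

```python
from collections import deque


class TraceBuffer:
    """Fixed-capacity buffer that keeps only the most recent records."""

    def __init__(self, capacity):
        # deque(maxlen=...) silently drops the oldest entry when full.
        self._records = deque(maxlen=capacity)

    def record(self, address):
        self._records.append(address)

    def history(self):
        """Oldest record first, most recent last, as in the trace view."""
        return list(self._records)


def run_until(addresses, stop_address, buffer):
    """Record executed addresses until stop_address would execute."""
    for address in addresses:
        if address == stop_address:
            break
        buffer.record(address)
    return buffer.history()
```

For example, with a capacity of 3 and a stop at `0x114`, running over the addresses `0x100` through `0x114` in steps of 4 leaves only the last three executed addresses in the history.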
The trace view has two main windows: the timeline of function calls and the instruction or function view.
The timeline looks like this:
The instruction view looks like this:
There is a green circle which toggles between instructions (I) and functions (F).
The icons on the top right of the trace window provide features to adjust the zoom, clear the trace, and change the view. It will take some experimentation to get used to them.
One thing which was not obvious to me is the size of the trace. With the quad-core Cortex-R8 system and trace set to 128 MB, it always seemed to capture 10,000 instructions per core when the buffer was full. It was not clear where the 10,000 was coming from. It turns out the down triangle on the top right opens a menu with an item called “Set Trace Page Size”. This can be used to increase the number of displayed instructions, but a larger page uses more memory, so adjust it with care.
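One way to picture the page size is as a per-core cap on the records kept for display. The sketch below assumes that reading of the behavior described above; it is not documented DS-5 internals, and the function name and record layout are hypothetical.

```python
from collections import defaultdict, deque


def page_per_core(records, page_size):
    """Keep the newest page_size records for each core.

    records is an iterable of (core_id, address) tuples in execution
    order. This record layout is an assumption for illustration.
    """
    pages = defaultdict(lambda: deque(maxlen=page_size))
    for core_id, address in records:
        pages[core_id].append(address)
    return {core: list(page) for core, page in pages.items()}
```

Memory use in this sketch grows with the page size multiplied by the number of cores, which is the same trade-off to keep in mind when raising the page size in the trace view.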
The trace view shows the executed instructions correlated to the source code (when available). Since Fast Models have no notion of time other than the instruction count, the trace just shows a count that increments as each instruction is executed.
If you get confused, it helps to clear the trace and use breakpoints to restart and stop it.
When single stepping, the trace is one instruction behind the current location, so it’s easy to correlate what is happening.
The data can be exported to a text file using the Export button.
There are various export options, including saving an instruction history.
In addition to debugging software, DS-5 can collect instruction trace from custom Fast Model systems. The DS-5 trace feature works very much like it does with a hardware target and DSTREAM or DSTREAM-ST. Sometimes an instruction trace is what is needed to figure out a path through the software or how it ended up in an unexpected place. DS-5 with Fast Models is easy to set up, so give it a try and see what can be learned about your software execution.
I recently looked at the examples in Fast Models and found some CoreSight-related components in FVP_Base_Cortex-A57x1, such as v8EmbeddedCrossTrigger_matrix, dap, etc. If Fast Models don't model CoreSight, are there other uses for these components?