Using the CoreSight ELA-500 Embedded Logic Analyzer with Arm DS-5

Introduction

The Arm CoreSight ELA-500 Embedded Logic Analyzer provides low-level signal visibility into Arm IP and third-party IP. When used with a processor, it provides visibility of loads, stores, speculative fetches, cache activity and transaction life cycle, none of which are available through instruction tracing.

CoreSight ELA-500 enables swift hardware-assisted debug of otherwise hard-to-trace issues, including data corruption, deadlocks and livelocks. As well as accelerating debug cycles during complex IP bring-up, it provides extra assistance for post-deployment debug.

CoreSight ELA-500 offers on-chip visibility of both Arm and proprietary IP blocks. Trigger conditions can be programmed over standard debug interfaces either directly by an on-chip processor or an external debugger.

This guide is intended to demonstrate how the CoreSight ELA-500 can be used with Arm DS-5 Development Studio to debug a real-world deadlock scenario on a Cortex-A72 + CoreSight ELA-500 based system, caused by a bus transaction hang.

The Problem

One of the most common deadlock scenarios occurs when a processor initiates a memory transaction to a location in the system where no bus slave exists, or where the bus slave has limitations such as being unable to handle burst transactions. This type of incomplete transaction can ultimately lead to the processor locking up (deadlock).

In a perfect world, systems would be designed so that the entire physical memory map is fully populated, meaning that every memory transaction, to any address, receives either a valid transaction result or a bus fault. For certain designs, however, this may not be the case. The aggressive speculation and prefetching performed by Arm processors mean that these memory map “holes” are more likely to be exposed by incorrect software, even if the “holes” are never explicitly referenced by that software.

Software can prevent this by configuring the MMU translation tables to accurately describe the physical memory map. Software should configure any memory map “holes” as Invalid. Configuring the MMU this way prevents the processor from making any physical bus transactions to those locations, and ultimately prevents this type of deadlock scenario.
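
As a rough illustration of the bookkeeping involved, the sketch below takes a list of populated physical regions and works out the gaps that would need to be described as Invalid. The region list and the size of the address space are hypothetical values chosen for the example, and the sketch ignores translation granule sizes and virtual-to-physical mapping, which real translation table code has to handle.

```python
def find_holes(populated, address_space_end):
    """Return the [start, end) gaps not covered by any populated region."""
    holes = []
    cursor = 0
    for start, end in sorted(populated):
        if start > cursor:
            holes.append((cursor, start))
        cursor = max(cursor, end)
    if cursor < address_space_end:
        holes.append((cursor, address_space_end))
    return holes


# Hypothetical physical memory map: a boot ROM and a small internal SRAM.
POPULATED = [(0x00000000, 0x00010000), (0x01000000, 0x01002000)]

for start, end in find_holes(POPULATED, 0x02000000):
    print(f"mark Invalid: 0x{start:08x} - 0x{end - 1:08x}")
```

Every range printed here is a candidate for an Invalid translation table entry, so that neither explicit nor speculative accesses can reach it.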

Debugging these types of deadlock scenarios is difficult with traditional methods, such as external debug and instruction/data trace. A processor core which has locked up due to an incomplete transaction will likely not be able to enter halt mode debug. Effectively, the external debugger is unable to break the processor and inspect its internal state. Trace capture may still be available, but it will not provide any record of the speculative or prefetched transaction which may be responsible for the deadlock.

The Solution

The CoreSight ELA-500 can be used effectively in this scenario to trace the external bus transactions made by the processor (both explicitly and speculatively). This guide intends to showcase the use case scripting capabilities of DS-5 and demonstrate the example CoreSight ELA-500 use case script shipped with Arm DS-5 Development Studio.

NOTE: The scripts required to program the ELA-500 were added to Arm DS-5 in version 5.25. Please ensure this or a later version of DS-5 is installed. 

About the CoreSight ELA-500

The ELA-500 can be implemented with up to 12 Signal Groups, each containing 64, 128, or 256 signals. Which signals are connected to each Signal Group depends on the system and on the IP that the ELA is connected to. The specific signal interfaces are documented in the relevant IP documentation (low-level signal description documents like this are typically not publicly available and are made available only to licensees of the Arm IP). Arm IP connected to an ELA is supplied with a JSON file which documents and annotates the signal group connections for that particular IP in a machine-readable format. The JSON file can be interpreted by DS-5 to allow seamless debugging of a piece of IP using DS-5 and the ELA.

Signals typically consist of debug signals (status or output), and qualifiers (trigger). Qualifier signals may be required to determine that the debug signal is valid. Debug signals are valid when the qualifier signal(s) are asserted.
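
The relationship between a qualifier and a debug signal can be sketched as follows. Each captured Signal Group sample is treated as one wide integer; a debug field is only trusted when the qualifier bit is set. The ARVALID bit position (83) is the one used later in this guide, while the address field position and width, and the sample values, are hypothetical and chosen only for illustration.

```python
ARVALID_BIT = 83               # qualifier bit position used in this guide's example
ADDR_LSB, ADDR_WIDTH = 0, 44   # hypothetical position of the address debug field


def bit(sample, position):
    """Return a single bit of a wide Signal Group sample."""
    return (sample >> position) & 1


def field(sample, lsb, width):
    """Extract a multi-bit field from a wide Signal Group sample."""
    return (sample >> lsb) & ((1 << width) - 1)


def qualified_addresses(samples):
    """Keep the address field only from samples whose qualifier is asserted."""
    return [field(s, ADDR_LSB, ADDR_WIDTH) for s in samples if bit(s, ARVALID_BIT)]


# Hypothetical samples: the second has the qualifier low, so its address is ignored.
SAMPLES = [
    (1 << ARVALID_BIT) | 0x01001FC0,
    0x01002000,
    (1 << ARVALID_BIT) | 0x01002040,
]
print([hex(a) for a in qualified_addresses(SAMPLES)])
```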

The System

For the purposes of this demonstration, the Cortex-A72 + ELA-500 system utilizes the LAK-500A. The LAK-500A is an Integration Kit for the ELA-500 and the Cortex-A72, supplied as an add-on to the ELA-500. The LAK-500A exposes a number of pre-defined debug observation ports to the Cortex-A72 (Signal Groups), and provides the corresponding JSON signal mapping file.

As part of the LAK-500A, one of the debug observation ports to the Cortex-A72 exposes the physical read address signal bus, “ARADDR”, and an address valid signal, “ARVALID”.

NOTE: These signal names have been obfuscated for this blog post. 

These signals are required to determine the read addresses issued by the core, prior to the “lock-up”. Post analysis of these read transactions will help identify which transaction may have caused the fault.

Importing the CoreSight ELA-500 DTSL Use case scripts

  1. Open Arm DS-5 Development Studio
  2. Select ‘File’ → ‘Import’, then select ‘Existing Projects into Workspace’

Import Eclipse Projects

  1. Select the following default directory for the ‘DTSLELA-500’ archive file and click ‘Finish’:

    C:\Program Files\DS-5 v5.xx.x\examples\DTSL_examples.zip
  2. Navigate to the ‘DS-5 Debug’ perspective
  3. Open the ‘Scripts’ window
  4. Right click ‘Use case’, and select ‘Add use case script directory’

Adding a use case to the project

  1. Navigate to the DS-5 Workspace location where the ‘DTSLELA-500’ project was extracted above.

Configuring the CoreSight ELA-500 DTSL Use case scripts

Configuration of the ELA-500 can be achieved either by writing a use case script or by using a configuration GUI. An application-specific use case script allows a user to script a debug recipe for a specific debug scenario with the ELA-500. An example of this can be found by navigating to the following use case script:

Scripts window → Use case → DTSLELA-500 → ela_example.py → Configure ELA

For this demonstration, we will use the GUI ELA-500 Configuration Utility to configure the ELA-500 for our specific debug scenario. DS-5 must be connected to the target SoC prior to ELA configuration.

  1. Connect to the target
  2. Open the GUI ELA-500 configuration utility

    The ELA-500 configuration utility can be found by navigating to the following:

    Scripts window → Use case → DTSLELA-500 → ela_lowlevel.py → Configure ELA

    Right click ‘Configure ELA’ and select ‘Configure’

Configure ELA

  1. The ELA will need to be configured to start tracing once enabled. This can be achieved by selecting ‘Enable trace’ in the ‘Pre-trigger action’. This will in effect program PTACTION.TRACE so that trace becomes active when the ELA-500 is enabled. When trace is active, trace capture can be controlled to capture on each ELA clock cycle, a trigger Signal Comparison match or a trigger Counter Comparison match.

Configure ELA

  1. Click ‘Apply’
  2. We now need to configure our initial trigger “Trigger State 0”. Navigate to the ‘Trigger State 0’ tab in the ELA-500 configuration utility.

    ELA-500 configuration utility

  3. First, we need to select the Signal Group which includes the qualifier signal(s) we wish to trigger on. In our example (Cortex-A72 + ELA-500 + LAK-500A), the “ARVALID” signal resides in Signal Group 0. This will be documented in the IP’s corresponding JSON file or documentation.

    The ELA-500 uses a one-hot encoding for the Signal Group in the Signal Select registers. In this case, Signal Group 0 is selected by programming 0x1 in the ‘Select Signal Group’ field. This in effect programs SIGSEL0 = 0x1 (Trigger State 0 will be associated with the trigger signals in Signal Group 0).

    We also need to program the Signal Comparison condition. In this case, we want to trigger when the “ARVALID” signal is asserted (active high), so we program ‘Signal Comparison (COMP)’ to “Equal”.

    Finally, we need to program the Next state. This is the ELA state we will enter when we meet the trigger condition. In our case we want to capture on each “ARVALID” assertion, so the ELA should remain in Trigger State 0. Therefore, we program the ‘Next state’ field to 0x1 (one-hot for Trigger State 0).

  4. Trigger State 0’s Signal Compare and Signal Mask values for Signal Group 0 need to be programmed to monitor the ‘ARVALID’ signal. The bit position of the ‘ARVALID’ signal is documented in the IP’s corresponding JSON file or documentation.

    You will need to scroll down to find the entries for the Signal Mask and Signal Compare fields. In our example, ARVALID is mapped to bit 83, which is bit 19 of the [95:64] word, so we need to enter the value 0x00080000 in the [95:64] field for both the ‘Signal Mask’ and ‘Signal Compare’.

  5. Click ‘Apply’ and then ‘OK’.

Enter Signal Mask and Signal Compare fields
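
The arithmetic behind the values entered above can be checked with a short sketch. It computes the one-hot select value for a Signal Group or Trigger State index, and locates a signal bit within the 32-bit words of the Signal Mask and Signal Compare fields. The helper names are made up for this example and are not part of the DS-5 scripts.

```python
def one_hot(index):
    """One-hot encoding: only bit `index` is set."""
    return 1 << index


def mask_word(bit_position, word_bits=32):
    """Return the word label and the value that sets one signal bit."""
    word_index = bit_position // word_bits
    value = 1 << (bit_position % word_bits)
    low = word_index * word_bits
    return f"[{low + word_bits - 1}:{low}]", value


print(f"Signal Group 0 select: 0x{one_hot(0):x}")

label, value = mask_word(83)   # ARVALID is bit 83 in this guide's example
print(f"{label} 0x{value:08x}")
```

For bit 83 this gives the [95:64] word with the value 0x00080000, matching the Signal Mask and Signal Compare values used in step 4.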

Running the DS-5 ELA use case scripts

  1. Program the ELA configuration registers. This can be achieved by navigating to the following:

    Scripts window → Use case → DTSLELA-500 → ela_lowlevel.py → Configure ELA

    Right click ‘Configure ELA’ and select ‘Run…’

  2. Run the ELA. This can be achieved by navigating to the following:

    Scripts window → Use case → DTSLELA-500 → ela_control.py → Run ELA-500

    Right click ‘Run ELA-500’ and select ‘Run…’
  3. Run the target. The target will run and the ELA will be monitoring the input Signal Group for the trigger conditions. 

Capturing the ELA trace data

  1. In our particular debug scenario, the core is unable to enter halt mode debug.
  2. We can stop the ELA by navigating to:

    Scripts window → Use case → DTSLELA-500 → ela_control.py → Stop ELA-500

  3. Right click ‘Stop ELA-500’ and select ‘Run…’
  4. We can then dump the ELA trace by navigating to:

    Scripts window → Use case → DTSLELA-500 → ela_example.py → Decode trace data

    NOTE: The 'Decode trace data' script requires the corresponding JSON file to be named 'example_ela_connection.json'. This file is located in the 'DTSLELA-500' directory.

  5. Right click ‘Decode trace data’ and select ‘Configure’

    Decode trace data

    Ensure that Signal group 0 is selected for ‘State 0’ and click ‘OK’.

  6. Right click ‘Decode trace data’ and select ‘Run…’

Analyzing the ELA trace capture

As a result of the ELA-500 recipe programmed above, the ELA will have traced each read transaction and stored it in a circular buffer. This circular buffer holds the most recent X read transactions (where X depends on the size of the ELA-500 SRAM and the number of signals). These read transactions will have been generated by both explicit reads and speculative reads. Post-hang analysis of the read transactions can identify rogue accesses to potential holes in the memory map.
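
The circular buffer behaviour can be modelled in a few lines. This is only a software analogy, assuming a buffer depth of four entries and made-up addresses; the real depth depends on the ELA-500 SRAM size and the Signal Group width.

```python
from collections import deque


def capture(addresses, depth):
    """Model a circular trace buffer: only the newest `depth` entries survive."""
    buffer = deque(maxlen=depth)
    for address in addresses:
        buffer.append(address)   # the oldest entry is dropped once the buffer is full
    return list(buffer)


# Six hypothetical 64-byte reads captured into a four-entry buffer.
reads = range(0x1000, 0x1180, 0x40)
print([hex(a) for a in capture(reads, depth=4)])
```

Because the oldest entries are overwritten, the buffer contents at the time of the hang are the transactions issued immediately before it, which are the ones of interest here.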

The trace capture shows several accesses outside the bounds of the memory copy routine that was explicitly called. The last address explicitly read by the core was 0x01001fc0. The processor prefetcher continued to read memory from 0x01002000, 0x01002040 and 0x01002080. These memory accesses are to addresses which reside outside of the internal SRAM. These addresses should have been configured in the translation tables as Invalid. This would have prevented the prefetcher from prefetching from this region of memory.

  Address read valid             = 0x1
  Shareability                   = Inner Shareable
  Execution state                = AARCH64
  Cache Attr                     = Write-back, read/write allocate
  Access size                    = 64 bytes
  Read address                   = 0x01001fc0
  
  Address read valid             = 0x1
  Shareability                   = Inner Shareable
  Execution state                = AARCH64
  Cache Attr                     = Write-back, read/write allocate
  Access size                    = 64 bytes
  Read address                   = 0x01002000

  Address read valid             = 0x1
  Shareability                   = Inner Shareable
  Execution state                = AARCH64
  Cache Attr                     = Write-back, read/write allocate
  Access size                    = 64 bytes
  Read address                   = 0x01002040
  
  Address read valid             = 0x1
  Shareability                   = Inner Shareable
  Execution state                = AARCH64
  Cache Attr                     = Write-back, read/write allocate
  Access size                    = 64 bytes
  Read address                   = 0x01002080
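
For longer captures, this kind of check can be automated. The sketch below pulls the 'Read address' values out of decoded text in the format shown above and flags any that fall outside the internal SRAM. The SRAM end address (0x01002000) is inferred from the accesses described in this guide, and the base address is an assumption; adjust both for the real memory map.

```python
import re

# Decoded output reduced to the lines of interest (same format as above).
DECODED = """\
  Read address                   = 0x01001fc0
  Read address                   = 0x01002000
  Read address                   = 0x01002040
  Read address                   = 0x01002080
"""

# Assumed internal SRAM bounds: [SRAM_BASE, SRAM_END)
SRAM_BASE, SRAM_END = 0x01000000, 0x01002000


def read_addresses(text):
    """Extract every 'Read address' value from decoded trace text."""
    pattern = r"Read address\s*=\s*(0x[0-9a-fA-F]+)"
    return [int(value, 16) for value in re.findall(pattern, text)]


def outside(addresses, base, end):
    """Return the addresses that fall outside [base, end)."""
    return [a for a in addresses if not base <= a < end]


for address in outside(read_addresses(DECODED), SRAM_BASE, SRAM_END):
    print(f"access outside SRAM: 0x{address:08x}")
```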

Further Reading

CoreSight ELA-500 Embedded Logic Analyzer

Anonymous