Taking the fear out of silicon debug

Chinese Version 中文版:了却芯片设计的恐慌

The modern SoC is a feat of engineering that continually squeezes greater performance from defined power and area constraints. However, the arch-nemesis of reliability is complexity.

“Debugging is twice as hard as writing the code in the first place. Therefore, if you write the code as cleverly as possible, you are, by definition, not smart enough to debug it.”

          - Brian W. Kernighan and P. J. Plauger in The Elements of Programming Style

As SoC complexity continues to grow exponentially, it is only wise to build some advanced debug capability into the SoC. We’re all familiar with the concept of “a stitch in time saves nine”, and this is particularly relevant for debugging: the later you find a bug, the more tedious, time-consuming and expensive it becomes to resolve. Visibility is a precious resource to system designers, as it gives them an opportunity to spot bugs early and make subtle changes that can alter and optimize an SoC’s performance. On-chip visibility acts as a screening process to identify any snags.


Certain SoC bugs tend to manifest themselves as either data corruption or a system lock-up, and occur only when a series of contributing factors align to cause the fault. Factors may be as diverse as manufacturing tolerances being exceeded, bit errors being introduced, complex real-world software exercising new unvalidated spaces, or race conditions between multiple out-of-order transactions.


So if your design does get hit by a rare issue that is extremely difficult to reproduce and tricky to diagnose, it’s critical that you have some tools to deploy to help you get to the bottom of the problem as fast as possible. Almost by definition, any bug found in silicon is not going to be found by a simple test case you can run on a simulator or emulator of parts of your design.

Diagnosing the problem


The complexity of multi-core processors and cache coherent interconnects means that much of what was previously visible through CoreSight Embedded Trace Macrocells (ETM), essentially the programmer’s view, is now hidden inside the IP blocks.


With this in mind, ARM has developed a new weapon to add to the CoreSight on-chip debug and trace armoury, the CoreSight™ ELA-500 Embedded Logic Analyzer, in order to provide a more accurate diagnosis of system bugs. As the name suggests, this is a logic analyzer-like IP block that you embed in your SoC. It monitors up to 12 groups of 128 signals, generates triggers from assertion-like conditions, and uses a small embedded SRAM to collect a recent trace history of selected signals.

Example debug setup with ELA


Step 1 is to find out what state your system has got itself into and what illegal or suspicious condition has occurred. The trace aspect of debug is similar to a detective using CCTV cameras when solving a crime. To help with this, the ELA-500 provides a way to set up complex multi-state conditional triggers such as:

  • Trace next 6 write requests plus cache attributes to address 0x12345678
  • Load request from core 0 to address A will advance to trigger_state_1 which
    will then trigger debug mode after core 1 read from address A
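
To make the second example concrete, here is a minimal Python sketch of that two-step sequence as a software model. It is an illustration only: the `Event` tuple, the `run_trigger` function and the string event kinds are assumptions invented for this example, and the real ELA-500 is configured by a debug tool over the debug APB bus rather than by anything resembling this code.

```python
from collections import namedtuple

# Hypothetical bus event: issuing core, kind of access ("read" or "write"), address.
# A load request is modelled here simply as a "read".
Event = namedtuple("Event", ["core", "kind", "addr"])

def run_trigger(events, addr_a):
    """Return the index of the event that fires the trigger, or None.

    State 0 waits for a read (load) from core 0 to addr_a.
    State 1 waits for a read from core 1 to the same address.
    """
    state = 0
    for i, ev in enumerate(events):
        hit = ev.kind == "read" and ev.addr == addr_a
        if state == 0 and hit and ev.core == 0:
            state = 1      # advance to trigger_state_1
        elif state == 1 and hit and ev.core == 1:
            return i       # point at which debug mode would be requested
    return None
```

Called with a list of events in which core 0 reads address A and core 1 reads it some time later, `run_trigger` returns the position of the core 1 read. If core 1 reads A first, nothing fires, which is the point of using a sequence rather than a single condition.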


The ELA-500 provides a number of tools to discover any suspicious conditions:

  • A state machine with 4 trigger states programmable in any sequence including loops
  • Each trigger state can select one of the 12 signal groups as input for trigger conditions
  • Each trigger condition can mask and match any combination of the 128 signals in the
    selected group, using one of the comparisons =, !=, >, >=, <, <=
  • Each trigger state has a 32-bit counter input to count events, count clock cycles or act
    as a watchdog timer

  Figure 1: Trigger setup in the ELA-500
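
The mask-and-match comparison and the per-state counter listed above can be sketched in a few lines of Python. This is a behavioural model under stated assumptions, not the hardware implementation: the names `condition_met` and `TriggerState` are invented for this example, and applying the mask to both the sample and the match value is an assumption about how the comparison behaves.

```python
import operator

# The six comparison types listed above ("=" is written "==" here).
COMPARATORS = {
    "==": operator.eq, "!=": operator.ne,
    ">": operator.gt, ">=": operator.ge,
    "<": operator.lt, "<=": operator.le,
}

GROUP_WIDTH = 128                      # signals per group
GROUP_MASK = (1 << GROUP_WIDTH) - 1

def condition_met(signals, mask, match, op="=="):
    """Mask a 128-bit sample, then compare it with the masked match value."""
    masked = signals & mask & GROUP_MASK
    return COMPARATORS[op](masked, match & mask & GROUP_MASK)

class TriggerState:
    """One trigger state: a condition plus a 32-bit event counter."""

    def __init__(self, mask, match, op="==", count_target=1):
        self.mask, self.match, self.op = mask, match, op
        self.count_target = count_target
        self.counter = 0

    def sample(self, signals):
        """Return True once the condition has been seen count_target times."""
        if condition_met(signals, self.mask, self.match, self.op):
            self.counter = (self.counter + 1) & 0xFFFFFFFF
        return self.counter >= self.count_target
```

A state built with `count_target=6` and a mask selecting the write-request and address signals would correspond loosely to the "next 6 write requests" example, with the caveat that the actual signal positions depend on how the SoC designer wires the group.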


Step 2 is to start looking at what happened around this suspect state or condition. This can be done by storing selected signal states to the ELA-500 dedicated SRAM, configurable between four and over one billion trace data entries, or by triggering another action outside the ELA-500.
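
A "recent trace history" of fixed depth behaves much like a circular buffer. The Python sketch below models that idea only; the class name `TraceBuffer`, its methods and the wrap-around behaviour are assumptions made for illustration, and the real SRAM contents are read back by the debug tool.

```python
from collections import deque

class TraceBuffer:
    """Fixed-depth history buffer: the oldest entry is dropped when it is full."""

    def __init__(self, depth):
        if depth < 1:
            raise ValueError("depth must be at least 1")
        self.entries = deque(maxlen=depth)

    def capture(self, cycle, signals):
        """Store one sample of the selected signals with its cycle number."""
        self.entries.append((cycle, signals))

    def dump(self):
        """Return the captured history, oldest entry first."""
        return list(self.entries)
```

With a depth of four and ten captured samples, `dump()` returns only the last four, so the depth you configure decides how far back from the trigger point you can look.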


Up to 8 programmable output actions can be triggered for each trigger state, such as: stop clocks, enter debug state, start or stop signal trace, trigger another logic analyzer or ETM, or assert a CPU interrupt.
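
One way to picture per-state output actions is as a bit mask attached to each trigger state. The following Python sketch assumes exactly that; the `Action` names, their bit positions and the `state_actions` table are hypothetical and do not reflect the real register layout.

```python
from enum import IntFlag

class Action(IntFlag):
    # Bit positions are invented for this sketch.
    STOP_CLOCKS = 1 << 0
    ENTER_DEBUG = 1 << 1
    TRACE_START = 1 << 2
    TRACE_STOP = 1 << 3
    CROSS_TRIGGER = 1 << 4
    CPU_INTERRUPT = 1 << 5

# One action mask per trigger state (four states in total).
state_actions = {
    0: Action.TRACE_START,
    1: Action(0),
    2: Action.TRACE_STOP | Action.CROSS_TRIGGER,
    3: Action.STOP_CLOCKS | Action.ENTER_DEBUG,
}

def actions_for(state):
    """List the names of the actions raised by a trigger state."""
    return [a.name for a in Action if a in state_actions[state]]
```

In this hypothetical setup, reaching state 0 starts tracing, state 2 stops it and signals another analyzer, and state 3 freezes the clocks and enters debug state.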


The information gleaned is likely to suggest new trigger conditions that reveal what other unexpected conditions or states are occurring, so steps 1 and 2 are repeated to establish the chain of events leading to the error condition.


For really extreme cases, even further visibility may be required around the trigger condition, and this is available only through a scan chain dump. Here, step 3 is to program a stop clock action on the ELA-500 and then use a scan chain dump, together with information on the SoC’s scan chains, to obtain the exact state of any and all registers within the SoC that are on a scan chain. The ELA-500 provides the precision on which scan chain dumps to analyse, so less of this time-consuming exercise needs to be done.
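
Turning a raw scan dump into register values is mostly bookkeeping. The Python sketch below shows the idea under simplifying assumptions: the dump is a string of '0' and '1' characters, the chain map is a list of hypothetical register names and widths in shift-out order, and each register is shifted out most significant bit first. Real scan chain formats are design-specific.

```python
def decode_scan_dump(bits, chain_map):
    """Split a scan-out bit string into named register values.

    bits      -- string of '0'/'1' characters, first character shifted out first
    chain_map -- list of (register_name, width) in shift-out order
    """
    expected = sum(width for _, width in chain_map)
    if len(bits) != expected:
        raise ValueError(f"expected {expected} bits, got {len(bits)}")
    values, pos = {}, 0
    for name, width in chain_map:
        values[name] = int(bits[pos:pos + width], 2)  # MSB first (assumption)
        pos += width
    return values
```

The length check matters in practice: a dump that is one bit short or long usually means the wrong chain description is being used, and every decoded value after the mismatch would be meaningless.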


Where to deploy an Embedded Logic Analyzer


The ELA-500 can monitor any signal you connect to its inputs. SoC designers will benefit from connecting up signals from ARM IP and from proprietary or third-party IP. A typical design might contain multiple ELA-500s deployed to monitor signals in different domains of the SoC, as shown in Figure 2, with one per main processor cluster, one for the Cache Coherent Interconnect and one for other signals selected by the SoC designer.

  Figure 2: Example deployment of the ELA-500 in a system


Figure 2 shows the clock stop requests (in red) running to the Clock Controller from each ELA and the connectivity (in black) of trigger in/out to the CoreSight Cross Trigger Interfaces (CTI) and the Cross Trigger Matrix (CTM). The debug APB bus is used both to set up trigger conditions and to read back the contents of the ELA’s SRAM, as controlled by the debugging tool, such as the ARM® DS-5™ debug tool.

Connecting the ELA-500 to the Cortex-A72 processor


For connection to ARM IP, a Logic Analyzer IP Kit (LAK-500A) is provided with a pre-selected set of signals for that IP. The first of these is available for the recently released Cortex®-A72 processor, and ensures that the ELA-500 can sample signals at the maximum operating frequency of the Cortex-A72 without any impact on the operation of the processor.


The LAK-500A Logic Analyzer IP Kit includes the following:

  • Documented debug signal list and organization into 12 signal groups of 128 debug signals
  • A port puncher script that takes the debug signal list and adds connections to the top-level
    ports of the Cortex-A72 processor. The script also has an option to add a register slice to
    the debug signals to ensure timing closure
  • A logic equivalence checking (LEC) script to ensure nothing but the debug ports changed in
    the Cortex-A72 processor

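
If you assemble signal groups for your own IP, a quick sanity check on group count and width can catch mistakes before integration. The Python sketch below is one hypothetical way to do it; the function name `check_signal_groups`, the dictionary format and the signal names in the usage note are invented for this example and are not part of the LAK-500A deliverables.

```python
GROUPS = 12          # signal groups per ELA-500
GROUP_WIDTH = 128    # signals per group

def check_signal_groups(groups):
    """Validate a {group_index: [(signal_name, width), ...]} mapping.

    Returns the number of unused bits in each group.
    """
    spare = {}
    for index, signals in groups.items():
        if not 0 <= index < GROUPS:
            raise ValueError(f"group {index} out of range")
        used = sum(width for _, width in signals)
        if used > GROUP_WIDTH:
            raise ValueError(f"group {index} uses {used} bits")
        spare[index] = GROUP_WIDTH - used
    return spare
```

For example, a group holding a 44-bit address and a 4-bit request type would report 80 spare bits, which tells you how much room is left for further signals in that group.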

The observation interface signals provide debug visibility of: each core-to-L2 interface, power-management interfaces, and the L2 memory system power-management interface. The core-to-L2 interface provides visibility of the physical addresses of L1 misses to the L2, and the following transaction details:

  • Memory type: normal, device, or strongly ordered
  • Read or write
  • Fetches
  • DSB or DMB
  • AArch32 or AArch64
  • L1 set index
  • Byte transfer size
  • Last data received
  • Memory attributes: not shareable, inner shareable, or outer and inner shareable
  • Whether access is from privileged mode
  • Read type: read clean, read unique, icache, data cache, or TLB invalidate
  • Write type: eviction, device, unique, or streaming
  • Eviction has double bit ECC error
  • Signals that determine proper operation of the Load/Store L2 interface
  • Core snoops, including cache maintenance: Instruction Cache Maintenance Operation
    (ICMO) and TLB Maintenance Operation (TMO)
  • L2 pre-fetch


Future support is planned for new ARM Cortex-A and Mali™ processors as well as the CoreLink™ CCI Cache Coherent Interconnects, where transactions in flight and snoop traffic can be observed.


CoreSight ELA-500 can find corner-case bugs


The CoreSight ELA-500 provides visibility into the states leading to lock-ups and data corruption. It provides visibility of CPU loads, stores, speculative fetches, cache activity and transaction lifecycle, properties that are not visible with existing ETM trace of instructions. This offers greater scope for finding corner-case bugs that could potentially spell disaster if discovered too late.


The ELA-500 can monitor error states and hazard conditions across the SoC, giving the visibility needed to debug lock-ups without resorting to complex scan chain dump analysis, as well as cases with invalid accesses to device memory. The ELA can spot data corruption early, whereas conventional timeouts occur too late and the causal events are often lost or overwritten. I go into even more detail on some of the use cases for the CoreSight ELA-500 in a video interview with silicon debug expert Mark LaVine.


All this ensures you have the fastest debug route available should your SoC suffer a catastrophic failure found only when the silicon comes back and full software is running on the device.

A full specification of the CoreSight ELA-500 can be found on the ARM Infocenter.

You can find more information on the CoreSight ELA-500 webpage.
