How to debug: CoreSight basics (Part 2)

July 6, 2015

I'm writing a series of blogs that give a technical introduction to ARM CoreSight debug and trace technology. If you missed the first part, you can find it here: How to debug: CoreSight basics (Part 1)

Please let me know if you have any comments or questions. I'll be happy to address them in the comments section below.

Processor Trace Architectures

The ETM and PTM trace units are trace sources that monitor ARM processors. Each ETM or PTM trace unit is associated with particular processor lines, and each implementation conforms to a particular ETM or PTM architecture. Each architecture consists of a generic programmers' model and a trace protocol.

ETMv1, ETMv2

The earliest ETM architectures, representing internal processor pipeline status on a cycle-by-cycle basis. No longer in common use.

ETMv3

A major revision of the earlier protocols, implementing a byte-based packet protocol. The first ETM protocol to support CoreSight. Supports instruction-by-instruction execution trace and data transfer trace, depending on the processor.

PFTv1

Derived from ETMv3, providing only trace of branch execution and exceptions. Supported by Cortex-A9, Cortex-A12 and Cortex-A15.

ETMv4

A major revision of the earlier protocols, supporting advanced processor architectures. Includes the instruction execution trace style of PFTv1, and optionally ETMv3 style data trace capabilities. Supported by Cortex-R7, Cortex-A53 and Cortex-A57.

Within a CoreSight system, any processor trace units supporting ETMv3, PFTv1 or ETMv4 architectures can operate in combination.

Most processor trace units provide a single ATB output bus (either 8-bit for the Cortex-M variants, or 32-bit). This carries instruction trace and, if supported, data trace. Some R-class processor trace units are unusual in providing a 32-bit ATB interface for instruction trace and a 64-bit ATB interface for data trace. This reflects the high cost of implementing data trace for a high-performance processor, and also the need within some real-time application segments to support high-quality data trace capture.

Debug access and DAP topology

Traditional SoC debug used a JTAG interface to connect to a TAP controller in the processor. Where multiple processors are present, the JTAG scan chain would cascade the TAP controller of each processor, possibly through multiple clock and power domains.

Access to system memory would be achieved by halting the processor and downloading instructions while halted to cause the processor to perform the necessary memory accesses.

The DAP introduced by the CoreSight architecture moves the primary point of connection away from the individual processor, and implements a bridge between the external protocol and the various on-chip protocols. This provides a flexible and scalable solution in which the bridge point can remain powered and responsive irrespective of the activity of individual processors.

Figure 1 shows the components that are visible in the debug memory-mapped space, together with their discovery registers. These registers provide identification and address offset details. Remember that DAP accesses are multiplexed with accesses from the main system interconnect.

Figure 1

Debug Port

Every DAP requires a Debug Port (DP). This is the master device, and implements the external interface. Debug ports supporting both JTAG and optimized 2-pin Serial Wire interface can be licensed from ARM.

The debug port provides:

always-on connection for the debugger
debug fault and status reporting
power and reset request interface

Debug port accesses from the external debugger are performed as 32-bit (word) read or write transactions, targeting either DP registers or Access Port (AP) registers. Multiple debug ports (usually in multiple packages) can be addressed from a single external debug agent using:

daisy chained JTAG scan chain
star topology JTAG scan chain
multi-drop serial wire

Access Port

Each DAP contains between 1 and 256 Access Ports (APs). The APs are controlled by the DP in response to external commands. Most APs implement a master port which interfaces to an on-chip standard bus interface. Memory APs exist for memory-mapped interfaces such as APB, AHB and AXI interconnects. A JTAG-AP can be used to interface the DAP to a traditional JTAG TAP controller. Customized access ports can also provide a simple interface to dedicated chip-level debug logic.

Memory APs provide the following features:

Target address register
Read or write to target address
Bus error reporting
Transaction in progress status
Address incrementor (to accelerate block read/write operations)
Access control mechanisms
Information about connected debug components
Accesses that appear as either a system master or an external debug agent
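
To make a few of these features concrete, here is a minimal sketch of a Memory AP as a toy software model. It only illustrates the target address register, the read/write data access, the address incrementor and bus error reporting. The class name, method names and the dictionary-backed memory are hypothetical simplifications for illustration, not a real debugger API or the hardware register layout.

```python
class ToyMemAP:
    """Toy model of a Memory AP (hypothetical, for illustration only)."""

    def __init__(self, memory):
        self.memory = memory         # dict: word-aligned address -> 32-bit value
        self.tar = 0                 # target address register
        self.auto_increment = False  # stands in for the address incrementor control
        self.bus_error = False       # sticky bus error report

    def _advance(self):
        # Step the target address to the next word after a successful access.
        if self.auto_increment:
            self.tar = (self.tar + 4) & 0xFFFFFFFF

    def read_data(self):
        """Read the word at the current target address."""
        if self.tar not in self.memory:
            self.bus_error = True    # unmapped address: report an error
            return 0
        value = self.memory[self.tar]
        self._advance()
        return value

    def write_data(self, value):
        """Write a word to the current target address."""
        if self.tar not in self.memory:
            self.bus_error = True
            return
        self.memory[self.tar] = value & 0xFFFFFFFF
        self._advance()


def read_block(ap, address, count):
    """Read `count` consecutive words, setting the target address only once."""
    ap.tar = address
    ap.auto_increment = True
    return [ap.read_data() for _ in range(count)]
```

With the incrementor enabled, a block read needs one target address write followed by back-to-back data reads, which is why the feature accelerates block transfers. Real implementations may restrict how far the address can auto-increment before it must be rewritten; the toy model ignores that.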

DAP Address Space

Reaching any individual memory-mapped address in system memory might require several accesses to enable the correct path, and requires more than just the target address in the on-chip memory map:

DP Identifier: The debug agent might support concurrent access to more than one DAP.
AP Select: The target AP must be selected by writing to a register in the DP.
TAR Select: The target address must be set by writing to a register in the AP. Each AP can have a unique view of some or all of the memory mapped components in the target system.
Data Access: Once all the addresses necessary for a DAP access to the system are set, a request to the AP can initiate the on-chip access as either a read or a write.
Read Data retrieval: Although the on-chip access will now proceed, the debugger must perform another access to the DAP in order to retrieve the data value. This need not result in a second on-chip access.
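
The sequence above can be sketched as a toy model. This is only an illustration of the ordering of the steps under simplifying assumptions: the classes, method names and the `on_chip_reads` counter are hypothetical, and each AP is given its own dictionary to stand in for its view of the memory map.

```python
class ToyAP:
    """One access port with its own view of the memory map (hypothetical)."""

    def __init__(self, memory):
        self.memory = memory  # dict: address -> 32-bit value
        self.tar = 0          # target address register


class ToyDAP:
    """Toy DAP: one debug port in front of several access ports (hypothetical)."""

    def __init__(self, aps):
        self.aps = aps          # dict: AP index -> ToyAP
        self.selected = 0       # AP currently selected through the DP
        self.read_buffer = 0    # holds the result of the last on-chip read
        self.on_chip_reads = 0  # counts real bus transactions

    def select_ap(self, index):
        # AP Select: written to a register in the DP.
        if not 0 <= index <= 255:
            raise ValueError("a DAP holds at most 256 APs")
        self.selected = index

    def write_tar(self, address):
        # TAR Select: written to a register in the selected AP.
        self.aps[self.selected].tar = address

    def request_read(self):
        # Data Access: this is the step that starts the on-chip transaction.
        ap = self.aps[self.selected]
        self.read_buffer = ap.memory[ap.tar]
        self.on_chip_reads += 1

    def fetch_read_data(self):
        # Read Data retrieval: a second DAP access, no new on-chip access.
        return self.read_buffer


def debugger_read(daps, dp_id, ap_index, address):
    """Walk through every addressing step for a single word read."""
    dap = daps[dp_id]        # DP Identifier: choose which DAP to talk to
    dap.select_ap(ap_index)
    dap.write_tar(address)
    dap.request_read()
    return dap.fetch_read_data()
```

In this sketch two APs can return different values for the same address, reflecting that each AP can have its own view of the target system, and fetching the read data does not increase the count of on-chip reads.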

When an access fails for some reason, the debugger is able to identify the failure. Usually the debugger can re-try the access and recover from simple errors on the interface.

Debug Memory Map Views

Both externally hosted debug agents and on-chip debug agents (for example a debug monitor) require access to debug components. Within CoreSight, these debug components are provided on a dedicated bus, the debug APB. This ensures a clear separation between system memory space and debug memory space. An exception is the Cortex-M processors where a shared AHB interconnect supports both system memory and debug access as an area-reduction trade-off.

An on-chip agent must first navigate the system memory bus before being multiplexed with the DAP initiated transactions on the Debug APB. This provides two memory mapped views, one from the external debugger and one from the on-chip agent. Both views share access to the debug components using the same address offsets within the mapped regions. The system view of the debug APB will typically have a non-zero base address whilst the external debugger view uses a base address of zero.

The upper address bit (PADDRDBG31) is only accessible from the external debugger and serves as an access control mechanism.
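
As a rough sketch of the two views, the snippet below computes where the same component appears for each agent and models the upper address bit as something only the external debugger can drive. The function names and the example base address in the usage note are hypothetical, and treating bit 31 as forced low for on-chip accesses is an assumption made for this illustration.

```python
PADDRDBG31 = 1 << 31  # upper address bit on the debug APB


def system_view_address(system_base, component_offset):
    """Address an on-chip agent uses: non-zero base plus the shared offset."""
    return system_base + component_offset


def debugger_view_address(component_offset):
    """Address the external debugger uses: the view starts at base zero."""
    return component_offset


def address_on_debug_apb(requested, from_external_debugger):
    """Address that reaches the debug components (assumed behaviour)."""
    if from_external_debugger:
        # The external debugger is allowed to drive bit 31 itself.
        return requested & 0xFFFFFFFF
    # Assumption: on-chip agents can never set bit 31, so it is cleared.
    return requested & ~PADDRDBG31 & 0xFFFFFFFF


def is_external_access(bus_address):
    """A component can use bit 31 as a simple access-control input."""
    return bool(bus_address & PADDRDBG31)
```

For example, with a hypothetical system base of 0x22000000, a component at offset 0x1000 appears at 0x22001000 to an on-chip agent and at 0x1000 to the external debugger, yet both accesses arrive at the same component.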

Debug Memory Discovery and ROM Table Entries

Every CoreSight component with an APB memory map occupies one or more 4kB blocks of memory. Within this block, CoreSight defines the content of some discovery registers. See the TRM for each individual component on ARM Infocenter for specific details. The discovery pointer structure is shown in Figure 1 above, and some examples of the individual registers are shown in Table 2 below.

| Name | Offset | Example Values | Description |
|---|---|---|---|
| DEVTYPE | 0xFCC | 0x00000016: Processor Performance monitor; 0x00000013: Processor Trace unit | Only used by CoreSight debug. Can classify unknown 'new' components. |
| PID4 | 0xFD0 | 0x04: 4kB component, ARM | Size of address block, and part of designer ID |
| PID3, PID2, PID1, PID0 | 0xFE0-0xFEC | 0x004BB906: ARM CTI rev4; 0x003BB912: ARM TPIU rev 3 | Unique part identifier consisting of: designer (via JEP106 code); 3 digit part allocated by designer; part revision; part ECO identifier; part modified |
| CID3, CID2, CID1, CID0 | 0xFF0-0xFFC | 0xB105900D: CoreSight Debug; 0xB105100D: CoreSight ROM Table | Component identifier, indicates if the CoreSight layout is used. Other values might be used by ARM PrimeCells and other components. |

Example CoreSight discovery registers
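
The example values in the table can be unpacked with a few shifts and masks. The sketch below assumes the usual CoreSight field layout for these registers and takes the four PID bytes and four CID bytes already concatenated into 32-bit values, as the table shows them. The function names and dictionary keys are hypothetical; check the field positions against the architecture specification before relying on them.

```python
def decode_peripheral_id(pid, pid4):
    """Split PID3..PID0 (one 32-bit value) and PID4 into their fields."""
    return {
        "part_number": pid & 0xFFF,                    # 3 digit part number
        "jep106_id": ((pid >> 12) & 0xF) | (((pid >> 16) & 0x7) << 4),
        "uses_jep106": bool((pid >> 19) & 1),          # designer given as JEP106 code
        "revision": (pid >> 20) & 0xF,                 # part revision
        "customer_modified": (pid >> 24) & 0xF,        # part modified
        "eco_revision": (pid >> 28) & 0xF,             # part ECO identifier
        "jep106_continuation": pid4 & 0xF,             # rest of the designer ID
        "size_in_4kb_blocks": 1 << ((pid4 >> 4) & 0xF),
    }


def decode_component_id(cid):
    """Check the fixed preamble and return the component class, or None."""
    if cid & 0xFFFF0FFF != 0xB105000D:
        return None  # not laid out as a CoreSight style component ID
    return (cid >> 12) & 0xF


def decode_devtype(devtype):
    """Return (major type, sub type) from the low byte of DEVTYPE."""
    return devtype & 0xF, (devtype >> 4) & 0xF


# Component classes taken from the two example CID values in the table.
COMPONENT_CLASSES = {0x1: "ROM table", 0x9: "CoreSight debug component"}
```

Applied to the CTI example, `decode_peripheral_id(0x004BB906, 0x04)` gives part number 0x906 at revision 4, with JEP106 code 0x3B and continuation code 4, which together identify ARM as the designer.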

At least one ROM table component must be present as a slave to any AP which contains debug components. This will be the APB-AP, or AHB-AP in the case of a Cortex-M system. Each ROM table contains a list of address offsets which can be used to locate component base addresses. These components can themselves be ROM tables, but each physical component or ROM table must appear only once in the expanded list of pointers.

The AP contains a base address register which must point to the master ROM table for that bus. Typically, this will occupy the lowest 4kB block of the address space. The ROM table is a CoreSight component, and contains standardized identification registers. It also contains an identifier for the SoC as a whole, which debug agents can look up in a database of known devices. This lookup can provide information about SoC-specific features.

Typically the ROM table hierarchy will match the design hierarchy of modules containing debug APB. In this way, larger systems can be constructed from sub-systems and clusters. As a result, the debug APB is often sparsely populated.
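
Discovery then becomes a recursive walk over the ROM tables. The sketch below is one way a debug agent might do it, assuming the commonly documented 32-bit entry format (address offset in bits [31:12], a "present" flag in bit 0, and an all-zero entry marking the end of the table). The `read_word` callback, the function names and the `MAX_ENTRIES` limit are hypothetical stand-ins for whatever memory access the debugger actually provides.

```python
ROM_TABLE_CLASS = 0x1  # class field of CID 0xB105100D
MAX_ENTRIES = 960      # assumed limit: entries sit below the ID registers


def read_component_id(read_word, base):
    """Assemble CID3..CID0 from the four byte-wide registers at 0xFF0-0xFFC."""
    cid = 0
    for i in range(4):
        cid |= (read_word(base + 0xFF0 + 4 * i) & 0xFF) << (8 * i)
    return cid


def walk_rom_table(read_word, table_base, seen=None):
    """Return base addresses of all non-ROM-table components below table_base."""
    if seen is None:
        seen = {table_base}
    components = []
    for index in range(MAX_ENTRIES):
        entry = read_word(table_base + 4 * index)
        if entry == 0:
            break                      # end-of-table marker
        if not entry & 1:
            continue                   # entry exists but component is not present
        offset = entry & 0xFFFFF000    # two's complement offset from this table
        base = (table_base + offset) & 0xFFFFFFFF
        if base in seen:
            continue                   # each component should appear only once
        seen.add(base)
        cid = read_component_id(read_word, base)
        if (cid >> 12) & 0xF == ROM_TABLE_CLASS:
            components.extend(walk_rom_table(read_word, base, seen))
        else:
            components.append(base)
    return components
```

A debugger would start the walk at the address held in the AP's base address register, then read the identification registers of each returned component to work out what it is.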

If you enjoyed this blog, check out part 3 of this series by clicking on the link below.

How to debug: CoreSight basics part 3

deepud over 9 years ago

Very helpful. Thank you.
Vatsalya Thakur over 10 years ago

Quite resourceful. Please continue with your blogs on Coresight.
