Retargeting and Enabling Exceptions with an ELF Image

tmeduthie
March 2, 2018
11 minute read time.

In my last blog we built an executable image to print "hello world" to a terminal. The same tools are being used here. Since the files in this post are a little longer, and consist of a great number of symbolic constants, the full code is not always displayed. It is downloadable via a link at the end of this post. Now we will discuss semihosting and interrupts.

What Do You Mean It Has No Screen?!

Semihosting

Semihosting enables code running on a "target" system, the model, to interface with a debugger running on a "host" system, the computer, and use its I/O facilities. This gives us a way of interacting with a model or microcontroller that may not possess I/O functionality of its own. In our case, the printf() call in the code actually triggers a request to a connected debugger through the library function _sys_write. By running a fromelf command,

$ fromelf --text -c __image.axf --output=disasm.txt

we generate a disassembly of "__image.axf" in "disasm.txt". Within that we can find _sys_write which contains a HLT instruction. An attached debugger detects this halt as a semihosting operation and will handle it appropriately. It is possible to check if you're using semihosting by adding

__asm(".global __use_no_semihosting\n\t");

to main(). Linking the image will now throw an error for any functions that use semihosting.
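
To make the mechanism more concrete, here is a minimal host-side sketch of how a semihosting write request is laid out under the Arm semihosting specification: an operation number (SYS_WRITE is 0x05) plus a pointer to a three-field parameter block holding the file handle, the data pointer and the length. On a real AArch64 target the request is raised with HLT #0xF000, with the operation number in w0 and the block address in x1. In the sketch, a hypothetical stand-in function, fake_debugger_trap, plays the role of the debugger so the code can run on any machine. The names semihost_write, fake_debugger_trap and host_console are illustrative and are not part of the Arm C library.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SYS_WRITE 0x05u  /* semihosting operation number for "write" */

/* Captures what the pretend debugger "printed" (illustration only). */
static char host_console[128];
static size_t host_console_len;

/* Hypothetical stand-in for the debugger. On hardware this would be
 * HLT #0xF000 with the operation in w0 and the block address in x1. */
static uintptr_t fake_debugger_trap(uint32_t op, const uintptr_t *block) {
  if (op != SYS_WRITE) {
    return (uintptr_t)-1;               /* not modelled in this sketch */
  }
  const char *data = (const char *)block[1];
  size_t len = (size_t)block[2];
  size_t room = sizeof(host_console) - 1 - host_console_len;
  size_t n = len < room ? len : room;
  memcpy(host_console + host_console_len, data, n);
  host_console_len += n;
  host_console[host_console_len] = '\0';
  return (uintptr_t)(len - n);          /* SYS_WRITE returns bytes NOT written */
}

/* Builds the three-field SYS_WRITE parameter block and raises the request. */
static uintptr_t semihost_write(uintptr_t handle, const char *buf, size_t len) {
  assert(buf != NULL);
  uintptr_t block[3];
  block[0] = handle;                    /* file handle              */
  block[1] = (uintptr_t)buf;            /* pointer to the data      */
  block[2] = (uintptr_t)len;            /* number of bytes to write */
  return fake_debugger_trap(SYS_WRITE, block);
}
```

Calling semihost_write(1, "hello world\n", 12) in this model returns 0, meaning every byte was accepted, and leaves the text in host_console.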

Retargeting

Real embedded systems operate without sophisticated debuggers, but many library functions depend on semihosting. So we must change the relevant procedures, retarget them, to use the hardware of our target instead of the host system. For example, we could retarget printf() using the model's PL011 UART. To do this we must write a driver for the UART in "pl011_uart.c":

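#include <stdio.h>        // FILE, needed by fputc()
#include "pl011_uart.h"   // UART driver header (not shown)

struct pl011_uart {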
        volatile unsigned int UARTDR;        // +0x00
        volatile unsigned int UARTECR;       // +0x04
  const volatile unsigned int unused0[4];    // +0x08 to +0x14 reserved
  const volatile unsigned int UARTFR;        // +0x18 - RO
  const volatile unsigned int unused1;       // +0x1C reserved
        volatile unsigned int UARTILPR;      // +0x20
        volatile unsigned int UARTIBRD;      // +0x24
        volatile unsigned int UARTFBRD;      // +0x28
        volatile unsigned int UARTLCR_H;     // +0x2C
        volatile unsigned int UARTCR;        // +0x30
        volatile unsigned int UARTIFLS;      // +0x34
        volatile unsigned int UARTIMSC;      // +0x38
  const volatile unsigned int UARTRIS;       // +0x3C - RO
  const volatile unsigned int UARTMIS;       // +0x40 - RO
        volatile unsigned int UARTICR;       // +0x44 - WO
        volatile unsigned int UARTDMACR;     // +0x48
};

// Instance of the PL011 UART
struct pl011_uart* uart;


// ------------------------------------------------------------

void uartInit(void* addr) {
  uart = (struct pl011_uart*) addr;

  // Ensure UART is disabled
  uart->UARTCR  = 0x0;

  // Set UART 0 Registers
  uart->UARTECR   = 0x0;  // Clear the receive status (i.e. error) register
  uart->UARTLCR_H = 0x0 | PL011_LCR_WORD_LENGTH_8 | PL011_LCR_FIFO_DISABLE | PL011_LCR_ONE_STOP_BIT | PL011_LCR_PARITY_DISABLE | PL011_LCR_BREAK_DISABLE;

  uart->UARTIBRD = PL011_IBRD_DIV_38400;
  uart->UARTFBRD = PL011_FBRD_DIV_38400;

  uart->UARTIMSC = 0x0;                     // Mask out all UART interrupts
  uart->UARTICR  = PL011_ICR_CLR_ALL_IRQS;  // Clear interrupts

  uart->UARTCR  = 0x0 | PL011_CR_UART_ENABLE | PL011_CR_TX_ENABLE | PL011_CR_RX_ENABLE;

  return;
}

// ------------------------------------------------------------

int fputc(int c, FILE *f) {
  // Wait until FIFO or TX register has space
  while ((uart->UARTFR & PL011_FR_TXFF_FLAG) != 0x0) {}

  // Write packet into FIFO/tx register
  uart->UARTDR = c;

  // Model requires us to manually send a carriage return
  if ((char)c == '\n') {
    while ((uart->UARTFR & PL011_FR_TXFF_FLAG) != 0x0){}
    uart->UARTDR = '\r';
  }
  return c;  // fputc() returns the character written on success
}
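
The divisor constants PL011_IBRD_DIV_38400 and PL011_FBRD_DIV_38400 are among the symbolic constants not shown here. As a hedged illustration of where such values come from: the PL011 derives its baud rate from the UART reference clock, with divisor = UARTCLK / (16 × baud rate). The integer part goes to UARTIBRD, and the fractional part, scaled by 64 and rounded, goes to UARTFBRD. The sketch below performs that arithmetic in integer form. The helper name pl011_baud_divisors is hypothetical, and the 24 MHz clock mentioned afterwards is an assumption chosen for illustration, not a value taken from this post.

```c
#include <assert.h>
#include <stdint.h>

struct pl011_divisors {
  uint32_t ibrd;  /* integer part, for UARTIBRD           */
  uint32_t fbrd;  /* fractional part in 1/64ths, UARTFBRD */
};

/* Computes uartclk_hz / (16 * baud) in units of 1/64, rounded to
 * nearest, then splits the result into integer and fraction. */
static struct pl011_divisors pl011_baud_divisors(uint32_t uartclk_hz,
                                                 uint32_t baud) {
  assert(baud != 0);
  /* uartclk / (16 * baud) * 64 == uartclk * 4 / baud */
  uint64_t div_x64 = ((uint64_t)uartclk_hz * 4u + baud / 2u) / baud;
  struct pl011_divisors d;
  d.ibrd = (uint32_t)(div_x64 >> 6);     /* whole divisor steps   */
  d.fbrd = (uint32_t)(div_x64 & 0x3Fu);  /* remaining 1/64 steps  */
  return d;
}
```

With an assumed 24 MHz reference clock and 38400 baud, the divisor is 39.0625, so this helper yields 39 for UARTIBRD and 4 for UARTFBRD. The values in the downloadable source may differ if the model uses a different clock.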

We must also modify "hello_world.c" so that it initializes the UART before printing:

#include <stdio.h>
#include "pl011_uart.h"

int main (void) {
  uartInit((void*)(0x1C090000));
  printf("hello world\n");
  return 0;
}

By redefining fputc() to use the UART we have retargeted printf(). Rebuilding the image:

$ armclang -c -g --target=aarch64-arm-none-eabi startup.s
$ armclang -c -g --target=aarch64-arm-none-eabi hello_world.c
$ armclang -c -g --target=aarch64-arm-none-eabi pl011_uart.c
$ armlink --scatter=scatter.txt --entry=start64 hello_world.o startup.o pl011_uart.o

Disassembling the image will show no calls to _sys_write, although other semihosting functions such as _sys_exit will still be present. Now that we are no longer printing to the debug interface, we must use a Telnet client to interface with the UART. This should happen automatically if you attach a debugger. Alternatively, you can start a Telnet client manually, but it must connect to port 5000 instead of the default port 23. Timing this is tricky: you must start the client just before the model's server starts listening.

Telnet 1

Exceptional Interrupts

In order to add meaningful functionality to an embedded system we must enable asynchronous exceptions: IRQs, FIQs, and SErrors. We will not have time to explore all the relevant architectural features, but a guide and online course are available for readers who are unfamiliar with them. Asynchronous exceptions are taken when something external to the current flow of execution must be handled by the CPU. An example might be a power switch: the processor must stop what it is doing and branch to a handler which ensures the shutdown is done properly.

Exception Routing

A summary of the architecture's instructions and registers may be useful for this section. In addition we must keep in mind some rules exceptions obey, particularly:

  • An exception routed to a higher Exception level cannot be masked, apart from EL0 to EL1 which can be masked with PSTATE
  • An exception routed to a lower Exception level is always masked
  • An exception routed to the current Exception level can be masked by PSTATE

We want to enable exception routing to EL3, using the Secure Configuration Register, SCR_EL3. We also set the Vector Base Address Register, VBAR_EL3, to point to a vector table, which must be defined. Finally, we disable masking, i.e. ignoring, of exceptions at EL3 by PSTATE. Adding the following assembly code to "startup.s" accomplishes this.

  // Configure SCR_EL3
  // ------------------
  MOV      w1, #0              // Initial value of register is unknown
  ORR      w1, w1, #(1 << 3)   // Set EA bit (SError routed to EL3)
  ORR      w1, w1, #(1 << 2)   // Set FIQ bit (FIQs routed to EL3)
  ORR      w1, w1, #(1 << 1)   // Set IRQ bit (IRQs routed to EL3)
  MSR      SCR_EL3, x1


  // Install vector table
  // ---------------------
  .global vectors
  LDR  x0, =vectors
  MSR  VBAR_EL3, x0

  ISB

  // Clear interrupt masks
  // ----------------------
  MSR      DAIFClr, #0xF
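
As a quick cross-check of the bit arithmetic in this listing, the following host-side sketch builds the same SCR_EL3 value in C and decodes the immediate passed to DAIFClr. The macro and function names are illustrative only. The SCR_EL3 bit positions are those given in the assembly comments, and the DAIF immediate uses bit 3 for D, bit 2 for A, bit 1 for I and bit 0 for F.

```c
#include <assert.h>
#include <stdint.h>

/* SCR_EL3 routing bits, matching the shifts used in startup.s */
#define SCR_EL3_IRQ (UINT32_C(1) << 1)  /* route IRQs to EL3    */
#define SCR_EL3_FIQ (UINT32_C(1) << 2)  /* route FIQs to EL3    */
#define SCR_EL3_EA  (UINT32_C(1) << 3)  /* route SErrors to EL3 */

/* Immediate bits accepted by MSR DAIFSet / MSR DAIFClr */
#define DAIF_F (1u << 0)  /* FIQ mask    */
#define DAIF_I (1u << 1)  /* IRQ mask    */
#define DAIF_A (1u << 2)  /* SError mask */
#define DAIF_D (1u << 3)  /* Debug mask  */

/* Value the startup code writes to SCR_EL3. */
static uint32_t scr_el3_route_all_to_el3(void) {
  return SCR_EL3_EA | SCR_EL3_FIQ | SCR_EL3_IRQ;
}

/* Returns nonzero if a DAIFClr immediate unmasks FIQs. */
static int daifclr_unmasks_fiq(uint32_t imm) {
  return (imm & DAIF_F) != 0;
}
```

The three ORR instructions therefore leave 0xE in w1, and the immediate 0xF passed to DAIFClr clears all four mask bits, including the FIQ mask that matters for the rest of this post.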

Vector Tables

When any exception is taken the processor must handle the exception correctly. This entails detecting the type of exception, then branching to an appropriate handler. In "vectors.s":

   .section  VECTORS,"ax"
   .align 12

   .global vectors
 vectors:

   .global fiqHandler

 // ------------------------------------------------------------
 // Current EL with SP0
 // ------------------------------------------------------------

   .balign 128
 sync_current_el_sp0:
   B        .                    //        Synchronous

   .balign 128
 irq_current_el_sp0:
   B        .                    //        IRQ

   .balign 128
 fiq_current_el_sp0:
   B        fiqFirstLevelHandler //        FIQ

   .balign 128
 serror_current_el_sp0:
   B        .                    //        SError

   //
   // NB, CODE OMITTED!
   //

fiqFirstLevelHandler:
  STP      x29, x30, [sp, #-16]!
  STP      x18, x19, [sp, #-16]!
  STP      x16, x17, [sp, #-16]!
  STP      x14, x15, [sp, #-16]!
  STP      x12, x13, [sp, #-16]!
  STP      x10, x11, [sp, #-16]!
  STP      x8, x9, [sp, #-16]!
  STP      x6, x7, [sp, #-16]!
  STP      x4, x5, [sp, #-16]!
  STP      x2, x3, [sp, #-16]!
  STP      x0, x1, [sp, #-16]!

  BL       fiqHandler

  LDP      x0, x1, [sp], #16
  LDP      x2, x3, [sp], #16
  LDP      x4, x5, [sp], #16
  LDP      x6, x7, [sp], #16
  LDP      x8, x9, [sp], #16
  LDP      x10, x11, [sp], #16
  LDP      x12, x13, [sp], #16
  LDP      x14, x15, [sp], #16
  LDP      x16, x17, [sp], #16
  LDP      x18, x19, [sp], #16
  LDP      x29, x30, [sp], #16
  ERET

In this case only an FIQ entry and handler have been defined. In fact, fiqFirstLevelHandler does little more than call fiqHandler, a procedure we will define later in C. Here we have used the BL instruction, or branch with link, which branches to the given label and also saves the address of the next instruction, i.e. the current value of the program counter plus four bytes, to x30. Our C procedure will end with a return statement which compiles to a RET instruction, and RET branches to the address stored in x30 if no other register is specified. Around the call, the handler saves and restores x0 to x19, x29 and x30, which covers the registers that the procedure call is allowed to modify. The remaining entries branch to self, as their handlers have not been written.
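
The layout produced by the .balign 128 directives can be summarised numerically. In AArch64 the vector table has 16 entries of 0x80 bytes each: four groups (current EL with SP0, current EL with SPx, lower EL using AArch64, lower EL using AArch32), each containing four exception types in the order Synchronous, IRQ, FIQ, SError. The sketch below computes an entry's offset from the table base. The enum and function names are made up for this illustration.

```c
#include <assert.h>
#include <stdint.h>

/* The four groups of entries, in table order. */
enum vector_group {
  CURRENT_EL_SP0   = 0,
  CURRENT_EL_SPX   = 1,
  LOWER_EL_AARCH64 = 2,
  LOWER_EL_AARCH32 = 3
};

/* The four exception types within each group, in table order. */
enum vector_type {
  VEC_SYNC   = 0,
  VEC_IRQ    = 1,
  VEC_FIQ    = 2,
  VEC_SERROR = 3
};

/* Offset of an entry from the address held in VBAR_ELn:
 * each group spans 4 * 0x80 = 0x200 bytes, each entry 0x80 bytes. */
static uint32_t vector_offset(enum vector_group g, enum vector_type t) {
  return (uint32_t)g * 0x200u + (uint32_t)t * 0x80u;
}
```

For example, the fiq_current_el_sp0 entry shown in the listing sits at offset 0x100, and the FIQ entry for the current EL with SPx sits at offset 0x300.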

Interrupt Controller

Vector tables have a relatively small and fixed number of entries, as the number and type of exceptions are architecturally defined. But we may require a great number of different interrupts triggered by different sources. So an additional piece of hardware is needed to manage these. Arm's Generic Interrupt Controller, or GIC, does exactly this. A full discussion of the GIC and its features is beyond the scope of this post; however, a programmer's guide is available. In "gic.s",

  .global gicInit
  .type gicInit, "function"
gicInit:
  // Configure Distributor
  MOV      x0, #GICDbase  // Address of GIC

  // Set ARE bits and group enables in the Distributor
  ADD      x1, x0, #GICD_CTLRoffset
  MOV      x2,     #GICD_CTLR.ARE_NS
  ORR      x2, x2, #GICD_CTLR.ARE_S
  STR      w2, [x1]

  ORR      x2, x2, #GICD_CTLR.EnableG0
  ORR      x2, x2, #GICD_CTLR.EnableG1S
  ORR      x2, x2, #GICD_CTLR.EnableG1NS
  STR      w2, [x1]
  DSB      SY

  // Configure Redistributor
  // Clearing ProcessorSleep signals core is awake
  MOV      x0, #RDbase
  MOV      x1, #GICR_WAKERoffset
  ADD      x1, x1, x0
  STR      wzr, [x1]
  DSB      SY
1:   // We now have to wait for ChildrenAsleep to read 0
  LDR      w0, [x1]
  AND      w0, w0, #0x6
  CBNZ     w0, 1b

  // Configure CPU interface
  // We need to set the SRE bits for each EL to enable
  // access to the interrupt controller registers
  MOV      x0, #ICC_SRE_ELn.Enable
  ORR      x0, x0, #ICC_SRE_ELn.SRE
  MSR      ICC_SRE_EL3, x0
  ISB
  MSR      ICC_SRE_EL1, x0
  MRS      x1, SCR_EL3
  ORR      x1, x1, #1  // Set NS bit, to access Non-secure registers
  MSR      SCR_EL3, x1
  ISB
  MSR      ICC_SRE_EL2, x0
  ISB
  MSR      ICC_SRE_EL1, x0

  MOV      w0, #0xFF
  MSR      ICC_PMR_EL1, x0 // Set PMR to lowest priority

  MOV      w0, #3
  MSR      ICC_IGRPEN1_EL3, x0
  MSR      ICC_IGRPEN0_EL1, x0

//-------------------------------------------------------------
//Secure Physical Timer source defined
  MOV      x0, #SGIbase       // Address of Redistributor registers

  ADD      x1, x0, #GICR_IGROUPRoffset
  STR      wzr, [x1]          // Mark INTIDs 0..31 as Secure

  ADD      x1, x0, #GICR_IGRPMODRoffset
  STR      wzr, [x1]          // Mark INTIDs 0..31 as Secure Group 0

  ADD      x1, x0, #GICR_ISENABLERoffset
  MOV      w2, #(1 << 29)     // Enable INTID 29
  STR      w2, [x1]           // Enable interrupt source

  RET

// ------------------------------------------------------------

  .global readIAR0
  .type readIAR0, "function"
readIAR0:
  MRS       x0, ICC_IAR0_EL1  // Read ICC_IAR0_EL1 into x0
  RET

// ------------------------------------------------------------

  .global writeEOIR0
  .type writeEOIR0, "function"
writeEOIR0:
  MSR        ICC_EOIR0_EL1, x0 // Write x0 to ICC_EOIR0_EL1
  RET

It is worth noting that we have defined the functions readIAR0 and writeEOIR0 using the .global and .type assembler directives. The former makes the label visible to all files given to the linker, while the latter allows us to define a type for the label. As we will see later, we have done this so we can call these assembly functions from C code following the procedure call standard. The standard defines numerous things, including how values are passed and returned, with two key points being:

  • Arguments are passed in x0 to x7 in the same order as the function prototype
  • Values are returned in the registers x0 and x1

Using readIAR0 we read the value of the Interrupt Controller Interrupt Acknowledge Register 0, ICC_IAR0_EL1. The lower 24 bits of this register give the interrupt identifier, INTID. By calling readIAR0 in C we can get the INTID from the GIC and then handle different interrupts case by case. Later, in the C code, fiqHandler() is defined, and you'll see a call to writeEOIR0. The INTID is passed in x0 and then written to the Interrupt Controller End of Interrupt Register 0, ICC_EOIR0_EL1, which tells the GIC that handling of that interrupt is complete.
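
The acknowledge, dispatch, end-of-interrupt sequence can be sketched on a host machine by replacing the two assembly routines with stand-ins. In the sketch below, mock_readIAR0 and mock_writeEOIR0 are hypothetical substitutes for readIAR0 and writeEOIR0, and fiq_dispatch is an illustrative name rather than the handler used in this post. The dispatcher masks the acknowledged value down to its lower 24 bits before comparing it with INTID 29, the interrupt enabled in gicInit. INTIDs 1020 to 1023 are reserved by the GIC architecture as special values, with 1023 meaning that no interrupt is pending, so the sketch does not pass them to the end-of-interrupt register.

```c
#include <assert.h>
#include <stdint.h>

#define INTID_MASK        0x00FFFFFFu  /* INTID lives in bits [23:0] */
#define INTID_SECURE_TMR  29u          /* INTID enabled in gicInit   */
#define INTID_SPECIAL_MIN 1020u        /* 1020..1023 are reserved    */
#define INTID_SPECIAL_MAX 1023u

static uint32_t mock_iar_value;   /* what the pretend GIC reports      */
static uint32_t mock_last_eoi;    /* last INTID signalled as complete  */
static int      mock_eoi_count;   /* number of end-of-interrupt writes */
static volatile uint32_t flag;    /* set when the timer interrupt hits */

/* Hypothetical stand-ins for the assembly routines in gic.s */
static uint32_t mock_readIAR0(void) { return mock_iar_value; }
static void mock_writeEOIR0(uint32_t intid) {
  mock_last_eoi = intid;
  mock_eoi_count++;
}

/* Acknowledge, dispatch on INTID, then signal completion. */
static void fiq_dispatch(void) {
  uint32_t intid = mock_readIAR0() & INTID_MASK;

  if (intid >= INTID_SPECIAL_MIN && intid <= INTID_SPECIAL_MAX) {
    return;  /* special INTID: nothing was acknowledged, so no EOI */
  }
  if (intid == INTID_SECURE_TMR) {
    flag = 1;
  }
  mock_writeEOIR0(intid);
}
```

Setting mock_iar_value to 29 and calling fiq_dispatch() sets flag and records one end-of-interrupt write for INTID 29, while a value of 1023 leaves both untouched.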

Timer

At this point we have enabled the GIC, and defined a source of interrupts from a secure physical timer. The system provides a counter, and the processor has a timer that compares the count against a programmed value. We can tell this hardware to generate an interrupt request after a set number of system ticks. Finally, we must have some way of disabling the timer, so it does not continue to interrupt the processor after the ticks have elapsed. The timer is enabled and its behaviour defined in "timer.s":

  .section  AArch64_GenericTimer,"ax"
  .align 3


// ------------------------------------------------------------

  .global setTimerPeriod
  // void setTimerPeriod(uint32_t value)
  // Sets the value of the Secure EL1 Physical Timer Value Register (CNTPS_TVAL_EL1)
  // w0 - value - The value to be written into CNTPS_TVAL_EL1
  .type setTimerPeriod, "function"
setTimerPeriod:
  MSR     CNTPS_TVAL_EL1, x0
  ISB
  RET

// ------------------------------------------------------------

  .global enableTimer
  .type enableTimer, "function"
enableTimer:
  MOV    x0, #0x1            // Set Enable bit, and clear Mask bit
  MSR    CNTPS_CTL_EL1, x0
  ISB
  RET

// ------------------------------------------------------------

  .global disableTimer
  .type disableTimer, "function"
disableTimer:
  MSR    CNTPS_CTL_EL1, xzr // Clear the enable bit
  ISB
  RET
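
The behaviour these three routines rely on can be modelled in a few lines of host-side C. In the Arm generic timer, writing a TVAL register sets the compare value to the current count plus the written value, and the timer condition is met once the count reaches the compare value. In the CTL register, bit 0 is ENABLE, bit 1 is IMASK and bit 2 is the read-only ISTATUS. The struct and function names below are invented for this model, which deliberately ignores details such as counter wrap-around and the signed 32-bit view of TVAL.

```c
#include <assert.h>
#include <stdint.h>

#define CTL_ENABLE  (1u << 0)  /* timer enabled                   */
#define CTL_IMASK   (1u << 1)  /* interrupt masked at the timer   */
#define CTL_ISTATUS (1u << 2)  /* read-only: timer condition met  */

struct timer_model {
  uint64_t count;  /* system counter value        */
  uint64_t cval;   /* compare value               */
  uint32_t ctl;    /* writable ENABLE/IMASK bits  */
};

/* Like setTimerPeriod: compare value = current count + ticks. */
static void model_set_tval(struct timer_model *t, uint32_t ticks) {
  t->cval = t->count + ticks;
}

/* Like enableTimer: set ENABLE, clear IMASK. */
static void model_enable(struct timer_model *t) { t->ctl = CTL_ENABLE; }

/* Like disableTimer: clear ENABLE. */
static void model_disable(struct timer_model *t) { t->ctl = 0; }

/* ISTATUS is only reported by this model while the timer is enabled. */
static uint32_t model_read_ctl(const struct timer_model *t) {
  uint32_t v = t->ctl & (CTL_ENABLE | CTL_IMASK);
  if ((v & CTL_ENABLE) && t->count >= t->cval) {
    v |= CTL_ISTATUS;
  }
  return v;
}

static int model_irq_asserted(const struct timer_model *t) {
  uint32_t v = model_read_ctl(t);
  return (v & CTL_ISTATUS) && !(v & CTL_IMASK);
}

/* Advances the counter until the interrupt asserts; returns ticks taken. */
static uint64_t model_run_until_irq(struct timer_model *t, uint64_t max_ticks) {
  uint64_t n = 0;
  while (!model_irq_asserted(t) && n < max_ticks) {
    t->count++;
    n++;
  }
  return n;
}
```

In this model a period of 0x1000 asserts the interrupt after 4096 counter ticks, and it stays asserted until ENABLE is cleared, which illustrates why an interrupt handler needs to disable or reprogram the timer.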

Combining Everything

Given all the changes we have made, "hello_world.c" must be altered:

#include <stdio.h>
#include <stdint.h>
#include "pl011_uart.h"

extern void gicInit(void);
extern uint32_t readIAR0(void);
extern void writeEOIR0(uint32_t);

extern void setTimerPeriod(uint32_t);
extern void enableTimer(void);
extern void disableTimer(void);

volatile uint32_t flag;

int main (void) {
  uartInit((void*)(0x1C090000));
  gicInit();

  printf("hello world\n");

  flag = 0;
  setTimerPeriod(0x1000);  // Generate an interrupt in 0x1000 (4096) ticks
  enableTimer();

  // Wait for the interrupt to arrive
  while(flag==0){}

  printf("Got interrupt!\n");

  return 0;
}

void fiqHandler(void) {
  uint32_t intid;
  intid = readIAR0(); // Read the interrupt id

  if (intid == 29) {
    flag = 1;
    disableTimer();
  } else {
    printf("Should never reach here!\n");
  }

  writeEOIR0(intid);
  return;
}

Here we have defined fiqHandler() to produce the desired behaviour when the interrupt is triggered. We can now build and run the project:

$ armclang -c -g --target=aarch64-arm-none-eabi startup.s
$ armclang -c -g --target=aarch64-arm-none-eabi vectors.s
$ armclang -c -g --target=aarch64-arm-none-eabi gic.s
$ armclang -c -g --target=aarch64-arm-none-eabi timer.s
$ armclang -c -g --target=aarch64-arm-none-eabi hello_world.c
$ armclang -c -g --target=aarch64-arm-none-eabi pl011_uart.c
$ armlink --scatter=scatter.txt --entry=start64 startup.o vectors.o gic.o timer.o hello_world.o pl011_uart.o

Including the flag -C bp.refcounter.non_arch_start_at_default=1 enables the system counter on the model. Running the image now:

$ FVP_Base_AEMv8A -C bp.refcounter.non_arch_start_at_default=1 -a __image.axf

Telnet 2

retargeting_and_exceptions_blog_source.zip