In my last blog we built an executable image to print "hello world" to a terminal. The same tools are being used here. Since the files in this post are a little longer, and consist of a great number of symbolic constants, the full code is not always displayed. It is downloadable via a link at the end of this post. Now we will discuss semihosting and interrupts.
Semihosting enables code running on a "target" system, the model, to interface with a debugger running on a "host" system, the computer, and use its I/O facilities. This gives us a way of interacting with a model or microcontroller that may not possess I/O functionality of its own. In our case, the printf() call in the code actually triggers a request to a connected debugger through the library function _sys_write. By running a fromelf command,
$ fromelf --text -c __image.axf --output=disasm.txt
we generate a disassembly of "__image.axf" in "disasm.txt". Within that we can find _sys_write which contains a HLT instruction. An attached debugger detects this halt as a semihosting operation and will handle it appropriately. It is possible to check if you're using semihosting by adding
__asm(".global __use_no_semihosting\n\t");
to main(). Linking the image will now throw an error for any functions that use semihosting.
Real embedded systems operate without sophisticated debuggers, but many library functions depend on semihosting. So we must change the relevant procedures, retarget them, to use the hardware of our target instead of the host system. For example, we could retarget printf() using the model's PL011 UART. To do this we must write a driver for the UART in "pl011_uart.c":
#include <stdio.h>
#include "pl011_uart.h"   // Assumed to provide the PL011_* constants used below

struct pl011_uart
{
    volatile unsigned int UARTDR;             // +0x00
    volatile unsigned int UARTECR;            // +0x04
    const volatile unsigned int unused0[4];   // +0x08 to +0x14 reserved
    const volatile unsigned int UARTFR;       // +0x18 - RO
    const volatile unsigned int unused1;      // +0x1C reserved
    volatile unsigned int UARTILPR;           // +0x20
    volatile unsigned int UARTIBRD;           // +0x24
    volatile unsigned int UARTFBRD;           // +0x28
    volatile unsigned int UARTLCR_H;          // +0x2C
    volatile unsigned int UARTCR;             // +0x30
    volatile unsigned int UARTIFLS;           // +0x34
    volatile unsigned int UARTIMSC;           // +0x38
    const volatile unsigned int UARTRIS;      // +0x3C - RO
    const volatile unsigned int UARTMIS;      // +0x40 - RO
    volatile unsigned int UARTICR;            // +0x44 - WO
    volatile unsigned int UARTDMACR;          // +0x48
};

// Instance of the UART
struct pl011_uart* uart;

// ------------------------------------------------------------

void uartInit(void* addr)
{
    uart = (struct pl011_uart*) addr;

    // Ensure UART is disabled
    uart->UARTCR = 0x0;

    // Set UART 0 Registers
    uart->UARTECR = 0x0;   // Clear the receive status (i.e. error) register
    uart->UARTLCR_H = 0x0 | PL011_LCR_WORD_LENGTH_8 | PL011_LCR_FIFO_DISABLE | PL011_LCR_ONE_STOP_BIT | PL011_LCR_PARITY_DISABLE | PL011_LCR_BREAK_DISABLE;

    uart->UARTIBRD = PL011_IBRD_DIV_38400;
    uart->UARTFBRD = PL011_FBRD_DIV_38400;

    uart->UARTIMSC = 0x0;                      // Mask out all UART interrupts
    uart->UARTICR = PL011_ICR_CLR_ALL_IRQS;    // Clear interrupts

    uart->UARTCR = 0x0 | PL011_CR_UART_ENABLE | PL011_CR_TX_ENABLE | PL011_CR_RX_ENABLE;

    return;
}

// ------------------------------------------------------------

int fputc(int c, FILE *f)
{
    (void)f;   // The stream is ignored: all output goes to the UART

    // Wait until FIFO or TX register has space
    while ((uart->UARTFR & PL011_FR_TXFF_FLAG) != 0x0) {}

    // Write packet into FIFO/tx register
    uart->UARTDR = c;

    // Model requires us to manually send a carriage return
    if ((char)c == '\n')
    {
        while ((uart->UARTFR & PL011_FR_TXFF_FLAG) != 0x0) {}
        uart->UARTDR = '\r';
    }

    // fputc() is expected to return the character written on success
    return c;
}
We must not forget to modify "hello_world.c" as well:
#include <stdio.h>
#include "pl011_uart.h"

int main (void)
{
    uartInit((void*)(0x1C090000));

    printf("hello world\n");

    return 0;
}
By redefining fputc() to use the UART we have retargeted printf(). Rebuilding the image:
$ armclang -c -g --target=aarch64-arm-none-eabi startup.s
$ armclang -c -g --target=aarch64-arm-none-eabi hello_world.c
$ armclang -c -g --target=aarch64-arm-none-eabi pl011_uart.c
$ armlink --scatter=scatter.txt --entry=start64 hello_world.o startup.o pl011_uart.o
Disassembling the image will show no calls to _sys_write, although other semihosting functions such as _sys_exit will still be present. Now that we are no longer printing to the debug interface, we must use a Telnet client to interface with the UART. This should happen automatically if you attach a debugger. Alternatively, you can start a Telnet client manually, but it must connect to port 5000 instead of the default port 23. Timing this is tricky: you must start the client just before the model's server starts listening.
In order to add meaningful functionality to an embedded system we must enable asynchronous exceptions: IRQs, FIQs, and SErrors. We will not have time to explore all the relevant architectural features, but a guide and online course are available for readers who are unfamiliar with them. Asynchronous exceptions are taken when something external to the current flow of execution must be handled by the CPU. An example might be a power switch: the processor must stop what it's doing and branch to a handler which ensures the shutdown is done properly.
A summary of the architecture's instructions and registers may be useful for this section. In addition, we must keep in mind the rules that exceptions obey, particularly how routing interacts with masking by PSTATE.
We want to enable exception routing to EL3, using the Secure Configuration Register, SCR_EL3. We also set the Vector Base Address Register, VBAR_EL3, to point to a vector table, which must be defined. Finally, we disable masking, i.e. ignoring, of exceptions at EL3 by PSTATE. Adding the following assembly code to "startup.s" accomplishes this.
  // Configure SCR_EL3
  // ------------------
  MOV      w1, #0              // Initial value of register is unknown
  ORR      w1, w1, #(1 << 3)   // Set EA bit (SError routed to EL3)
  ORR      w1, w1, #(1 << 2)   // Set FIQ bit (FIQs routed to EL3)
  ORR      w1, w1, #(1 << 1)   // Set IRQ bit (IRQs routed to EL3)
  MSR      SCR_EL3, x1

  // Install vector table
  // ---------------------
  .global  vectors
  LDR      x0, =vectors
  MSR      VBAR_EL3, x0
  ISB

  // Clear interrupt masks
  // ----------------------
  MSR      DAIFClr, #0xF
When any exception is taken the processor must handle the exception correctly. This entails detecting the type of exception, then branching to an appropriate handler. In "vectors.s":
  .section  VECTORS,"ax"
  .align 12

  .global vectors
vectors:

  .global fiqHandler

// ------------------------------------------------------------
// Current EL with SP0
// ------------------------------------------------------------

  .balign 128
sync_current_el_sp0:
  B        .                      // Synchronous

  .balign 128
irq_current_el_sp0:
  B        .                      // IRQ

  .balign 128
fiq_current_el_sp0:
  B        fiqFirstLevelHandler   // FIQ

  .balign 128
serror_current_el_sp0:
  B        .                      // SError

//
// NB, CODE OMITTED!
//

fiqFirstLevelHandler:
  STP      x29, x30, [sp, #-16]!
  STP      x18, x19, [sp, #-16]!
  STP      x16, x17, [sp, #-16]!
  STP      x14, x15, [sp, #-16]!
  STP      x12, x13, [sp, #-16]!
  STP      x10, x11, [sp, #-16]!
  STP      x8, x9, [sp, #-16]!
  STP      x6, x7, [sp, #-16]!
  STP      x4, x5, [sp, #-16]!
  STP      x2, x3, [sp, #-16]!
  STP      x0, x1, [sp, #-16]!

  BL       fiqHandler

  LDP      x0, x1, [sp], #16
  LDP      x2, x3, [sp], #16
  LDP      x4, x5, [sp], #16
  LDP      x6, x7, [sp], #16
  LDP      x8, x9, [sp], #16
  LDP      x10, x11, [sp], #16
  LDP      x12, x13, [sp], #16
  LDP      x14, x15, [sp], #16
  LDP      x16, x17, [sp], #16
  LDP      x18, x19, [sp], #16
  LDP      x29, x30, [sp], #16
  ERET
In this case only an FIQ entry and handler have been defined. The handler, fiqFirstLevelHandler, saves registers x0 to x19, x29 and x30 on the stack, branches to fiqHandler, a procedure we will define later in C, and then restores the registers before returning from the exception with ERET. The registers are saved because the procedure call may modify them: the procedure call standard lets a callee overwrite x0 to x17, and BL itself overwrites x30. BL, or branch with link, branches to the given label and also saves the address of the next instruction, i.e. the current value of the program counter plus four bytes, to x30. Our C procedure will end with a return statement which compiles to a RET instruction, and RET branches to the address stored in x30 if no other register is specified. The remaining entries branch to self, as their handlers have not been written.
Vector tables have a relatively small and fixed number of entries, as the number and type of exceptions are architecturally defined. But we may require a great number of different interrupts triggered by different sources, so an additional piece of hardware is needed to manage these. Arm's Generic Interrupt Controller, or GIC, does exactly this. A full discussion of the GIC and its features is beyond the scope of this post; however, a programmer's guide is available. In "gic.s",
  .global gicInit
  .type gicInit, "function"
gicInit:
  // Configure Distributor
  MOV      x0, #GICDbase                 // Address of GIC

  // Set ARE bits and group enables in the Distributor
  ADD      x1, x0, #GICD_CTLRoffset
  MOV      x2, #GICD_CTLR.ARE_NS
  ORR      x2, x2, #GICD_CTLR.ARE_S
  STR      w2, [x1]

  ORR      x2, x2, #GICD_CTLR.EnableG0
  ORR      x2, x2, #GICD_CTLR.EnableG1S
  ORR      x2, x2, #GICD_CTLR.EnableG1NS
  STR      w2, [x1]
  DSB      SY

  // Configure Redistributor
  // Clearing ProcessorSleep signals core is awake
  MOV      x0, #RDbase
  MOV      x1, #GICR_WAKERoffset
  ADD      x1, x1, x0
  STR      wzr, [x1]
  DSB      SY
1:   // We now have to wait for ChildrenAsleep to read 0
  LDR      w0, [x1]
  AND      w0, w0, #0x6
  CBNZ     w0, 1b

  // Configure CPU interface
  // We need to set the SRE bits for each EL to enable
  // access to the interrupt controller registers
  MOV      x0, #ICC_SRE_ELn.Enable
  ORR      x0, x0, #ICC_SRE_ELn.SRE
  MSR      ICC_SRE_EL3, x0
  ISB
  MSR      ICC_SRE_EL1, x0               // Secure copy of ICC_SRE_EL1

  MRS      x1, SCR_EL3
  ORR      x1, x1, #1                    // Set NS bit, to access Non-secure registers
  MSR      SCR_EL3, x1
  ISB
  MSR      ICC_SRE_EL2, x0
  ISB
  MSR      ICC_SRE_EL1, x0               // Non-secure copy of ICC_SRE_EL1

  MOV      w0, #0xFF
  MSR      ICC_PMR_EL1, x0               // Set PMR to lowest priority

  MOV      w0, #3
  MSR      ICC_IGRPEN1_EL3, x0
  MSR      ICC_IGRPEN0_EL1, x0

  //-------------------------------------------------------------
  // Secure Physical Timer source defined
  MOV      x0, #SGIbase                  // Address of Redistributor registers

  ADD      x1, x0, #GICR_IGROUPRoffset
  STR      wzr, [x1]                     // Mark INTIDs 0..31 as Secure

  ADD      x1, x0, #GICR_IGRPMODRoffset
  STR      wzr, [x1]                     // Mark INTIDs 0..31 as Secure Group 0

  ADD      x1, x0, #GICR_ISENABLERoffset
  MOV      w2, #(1 << 29)                // Enable INTID 29
  STR      w2, [x1]                      // Enable interrupt source

  RET

// ------------------------------------------------------------

  .global readIAR0
  .type readIAR0, "function"
readIAR0:
  MRS      x0, ICC_IAR0_EL1              // Read ICC_IAR0_EL1 into x0
  RET

// ------------------------------------------------------------

  .global writeEOIR0
  .type writeEOIR0, "function"
writeEOIR0:
  MSR      ICC_EOIR0_EL1, x0             // Write x0 to ICC_EOIR0_EL1
  RET
It is worth noting that we have defined the functions readIAR0 and writeEOIR0 using the .global and .type assembler directives. The former makes the label visible to all files given to the linker, while the latter allows us to define a type for the label. As we will see later, we have done this so we can call these assembly functions from C code, following the procedure call standard. The standard defines numerous things, including how values are passed and returned, with two key points being that arguments are passed in x0 to x7, in the same order as the function prototype, and that values are returned in x0 and x1.
Using readIAR0 we read the value of the Interrupt Controller Interrupt Acknowledge Register 0, ICC_IAR0_EL1. The lower 24 bits of this register give the interrupt identifier, INTID. By calling readIAR0 in C we can get the INTID from the GIC and then handle different interrupts case by case. Later, in the C code, fiqHandler() is defined, and you'll see a call to writeEOIR0. The INTID is passed in x0 and then written to the Interrupt Controller End of Interrupt Register 0, ICC_EOIR0_EL1, which tells the GIC that handling of that interrupt is complete.
At this point we have enabled the GIC and defined the secure physical timer as a source of interrupts. The system provides a counter, which a comparator in the processor reads. We can tell the hardware to generate an interrupt request after a set number of system ticks. Finally, we must have some way of disabling the comparator, so it does not continue to interrupt the processor after the ticks have elapsed. We enable the timer and define its behaviour in "timer.s":
  .section  AArch64_GenericTimer,"ax"
  .align 3

// ------------------------------------------------------------

  .global setTimerPeriod
  // void setTimerPeriod(uint32_t value)
  // Sets the value of the Secure EL1 Physical Timer Value Register (CNTPS_TVAL_EL1)
  // w0 - value - The value to be written into CNTPS_TVAL_EL1
  .type setTimerPeriod, "function"
setTimerPeriod:
  MSR     CNTPS_TVAL_EL1, x0
  ISB
  RET

// ------------------------------------------------------------

  .global enableTimer
  .type enableTimer, "function"
enableTimer:
  MOV     x0, #0x1            // Set Enable bit, and clear Mask bit
  MSR     CNTPS_CTL_EL1, x0
  ISB
  RET

// ------------------------------------------------------------

  .global disableTimer
  .type disableTimer, "function"
disableTimer:
  MSR     CNTPS_CTL_EL1, xzr  // Clear the enable bit
  ISB
  RET
Given all the changes we have made, "hello_world.c" must be altered:
#include <stdio.h>
#include <stdint.h>
#include "pl011_uart.h"

extern void gicInit(void);
extern uint32_t readIAR0(void);
extern void writeEOIR0(uint32_t);

extern void setTimerPeriod(uint32_t);
extern void enableTimer(void);
extern void disableTimer(void);

// Written by the FIQ handler and polled by main(), hence volatile
volatile uint32_t flag;

int main(void)
{
    uartInit((void*)(0x1C090000));
    gicInit();

    printf("hello world\n");

    flag = 0;
    setTimerPeriod(0x1000);   // Generate an interrupt in 0x1000 (4096) ticks
    enableTimer();

    // Wait for the interrupt to arrive
    while (flag == 0) {}

    printf("Got interrupt!\n");

    return 0;
}

void fiqHandler(void)
{
    uint32_t intid;

    intid = readIAR0();   // Read the interrupt id

    if (intid == 29)
    {
        flag = 1;
        disableTimer();
    }
    else
    {
        printf("Should never reach here!\n");
    }

    writeEOIR0(intid);

    return;
}
Here we have defined fiqHandler() to produce the desired behaviour when the interrupt is triggered. We can now build and run the project:
$ armclang -c -g --target=aarch64-arm-none-eabi startup.s
$ armclang -c -g --target=aarch64-arm-none-eabi vectors.s
$ armclang -c -g --target=aarch64-arm-none-eabi gic.s
$ armclang -c -g --target=aarch64-arm-none-eabi timer.s
$ armclang -c -g --target=aarch64-arm-none-eabi hello_world.c
$ armclang -c -g --target=aarch64-arm-none-eabi pl011_uart.c
$ armlink --scatter=scatter.txt --entry=start64 startup.o vectors.o gic.o timer.o hello_world.o pl011_uart.o
Including the flag -C bp.refcounter.non_arch_start_at_default=1 enables the system counter on the model. Running the image now:
$ FVP_Base_AEMv8A -C bp.refcounter.non_arch_start_at_default=1 -a __image.axf