This blog is the first in a short series exploring the process of writing a program for an embedded system. Here we will use Arm's Compiler 6 toolchain to build an executable image for the AEMv8-A Base Platform model. Both are included in DS-5 Ultimate Edition, and the examples use its command-line tools. If you are following the commands, make sure you first navigate to a directory containing all the files used. For those who prefer a GUI, there are some thorough tutorials covering the components of DS-5.
Consider a simple C program:

#include <stdio.h>

int main()
{
    printf("Hello World\n");
    return 0;
}
Invoking armclang with the following flags compiles "hello_world.c" and generates an ELF object file, "hello_world.o":
$ armclang -c -g --target=aarch64-arm-none-eabi hello_world.c
Next we link the object using armlink. This will generate an ELF image file named "__image.axf".
$ armlink hello_world.o
Since we have not specified an entry point, it defaults to __main() in the Arm libraries. This library code sets up the image and eventually branches to main().
You'll find that the image we have generated does not run on the AEMv8-A Base Platform, because the default memory map provided by armlink is not the one used by the model. So we must provide the memory map which, for the AEMv8-A Base Platform, looks something like this:
Now we need to pass this to the linker by using a scatter file. In "scatter.txt", we write the following:
ROM_LOAD 0x00000000 0x00010000
{
    ROM_EXEC +0x0
    {
        * (+RO)
    }

    RAM_EXEC 0x04000000 0x10000
    {
        * (+RW, +ZI)
    }

    ARM_LIB_STACKHEAP 0x04010000 EMPTY 0x10000
    {
    }
}
These statements define regions of memory with a purpose, and you can find out more about them in the armlink documentation. Considering them sequentially:
ROM_LOAD 0x00000000 0x00010000 {...}
This defines a load region, an area of memory that holds the image at reset. The first number gives the starting address of the region, and the second gives its size in bytes.
ROM_EXEC +0x0 { * (+RO) }
An execution region is called a root region if its load-time address and its execute-time address are the same. ROM_EXEC qualifies as a root region because +0x0 places it at an offset of 0 from the start of the load region it belongs to.
RAM_EXEC 0x04000000 0x10000 { * (+RW, +ZI) }
RAM_EXEC contains any read-write (RW) or zero-initialised (ZI) data. Since this has been placed in SRAM, it is not a root region.
ARM_LIB_STACKHEAP 0x04010000 EMPTY 0x10000 {}
This specifies the placement of the heap, which starts at 0x04010000 and grows upwards, and the stack, which starts at the top of the region (whose last byte is at 0x0401FFFF) and grows downwards. The EMPTY attribute reserves 0x10000 bytes of uninitialised memory, starting at 0x04010000. The names ARM_LIB_STACKHEAP and EMPTY are significant to the linker. ROM_LOAD, ROM_EXEC, and RAM_EXEC, however, are arbitrary labels and could be renamed.
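To make the region arithmetic concrete, here is a minimal C sketch that models an execution region as a load address, an execution address, and a size. The type region_t and the helper functions are hypothetical names invented for this illustration; they are not part of armlink or the Arm libraries. The sketch only checks the properties described above: whether a region is a root region, where it ends, and whether two regions overlap.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of one execution region from the scatter file. */
typedef struct {
    uint64_t load_addr;  /* where the region's contents sit at reset */
    uint64_t exec_addr;  /* where the region lives while the program runs */
    uint64_t size;       /* maximum size of the region in bytes */
} region_t;

/* A root region executes from the same address it was loaded at. */
bool is_root_region(const region_t *r)
{
    assert(r != NULL);
    return r->load_addr == r->exec_addr;
}

/* First address past the end of the region (exclusive limit). */
uint64_t region_limit(const region_t *r)
{
    assert(r != NULL);
    return r->exec_addr + r->size;
}

/* Two regions overlap if each one starts before the other one ends. */
bool regions_overlap(const region_t *a, const region_t *b)
{
    return a->exec_addr < region_limit(b) && b->exec_addr < region_limit(a);
}
```

With the values from the scatter file, RAM_EXEC ends exactly where ARM_LIB_STACKHEAP begins (0x04010000), and the stack/heap region has an exclusive limit of 0x04020000. The load address of the RAM_EXEC contents depends on the size of the image, so any value used for it here would be a placeholder.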
Now that we have a valid scatter file, we can re-build the image:
$ armlink --scatter=scatter.txt hello_world.o
The image can now be run from the command line using Fast Models:
$ FVP_Base_AEMv8A __image.axf
The model will boot up multiple cores and this could lead to strange or inconsistent behaviours, such as multiple “hello world” prints. We did not define a reset handler to initialise the other cores. This can be avoided with the model flag -C cluster0.NUM_CORES=1.
At reset, the image's code and data will be in the ROM_LOAD load region. The library function __main() is responsible for copying the RW data to its execution address and zero-initialising the ZI data, while __rt_entry() sets up the stack and heap. In the documentation, this process is referred to as scatter loading.
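As an illustration of what scatter loading achieves, the following C sketch copies a block of initialised data from a "load" buffer to an "execution" buffer and zero-fills a second area. It is a simplified model under stated assumptions, not the Arm library implementation: the function name scatter_load_sketch and its parameters are hypothetical, and the real code works from region tables generated by the linker.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical model of scatter loading:
 *  - RW data is copied from where it was loaded (ROM) to where it runs (RAM).
 *  - ZI data has no stored contents, so its RAM area is simply cleared.
 */
void scatter_load_sketch(const unsigned char *rw_load, unsigned char *rw_exec,
                         size_t rw_size,
                         unsigned char *zi_exec, size_t zi_size)
{
    assert(rw_load != NULL && rw_exec != NULL && zi_exec != NULL);

    memcpy(rw_exec, rw_load, rw_size);  /* copy initialised variables */
    memset(zi_exec, 0, zi_size);        /* zero-initialise the rest */
}
```

In this model, the RW area of RAM_EXEC would play the role of rw_exec, and the ZI area following it would play the role of zi_exec.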
It is typical that an embedded system requires some low level initialization in order to function. Examples might include: configuring the MMU, installing vector tables, and adjusting system registers. Often this must occur before any other code has executed. So we must define and change the entry point for the system in a way which reflects the following execution flow:
By default the model enters the AArch64 execution state in the EL3 exception level, with a reset address of 0x0000_0000. Consider this "startup.s" code:
    .section  BOOT,"ax"          // Define an executable ELF section, BOOT
    .align 3                     // Align to 2^3 byte boundary

    .global start64
    .type start64, "function"
start64:

    // Which core am I
    // ----------------
    MRS      x0, MPIDR_EL1
    AND      x0, x0, #0xFFFF     // Mask off to leave Aff0 and Aff1
    CBZ      x0, boot            // If not *.*.0.0, then go to sleep
sleep:
    WFI
    B        sleep

boot:
    // Disable trapping of CPTR_EL3 accesses or use of Adv.SIMD/FPU
    // -------------------------------------------------------------
    MSR      CPTR_EL3, xzr       // Clear all trap bits

    // Branch to scatter loading and C library init code
    .global  __main
    B        __main
By checking which core the code is running on, we sidestep the issue of multicore boot by sending all but one core to sleep. Additionally, the status of the floating point unit (FPU) in the model is unknown, because the Architectural Feature Trap Register, CPTR_EL3, has no defined reset value. By setting CPTR_EL3 to zero, we disable trapping of SIMD, FPU, and a few other instructions.
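The core check performed by the AND and CBZ instructions can be expressed in C as follows. This is only a sketch of the bit manipulation: the function names and the mask macro are hypothetical, and on real hardware the MPIDR_EL1 value would come from the MRS instruction rather than a function argument.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Aff0 occupies bits [7:0] and Aff1 bits [15:8] of MPIDR_EL1. */
#define MPIDR_AFF0_AFF1_MASK UINT64_C(0xFFFF)

/* Extract affinity field 0 or 1 from an MPIDR_EL1 value. */
uint8_t mpidr_affinity(uint64_t mpidr, unsigned level)
{
    assert(level <= 1);  /* this sketch only handles Aff0 and Aff1 */
    return (uint8_t)((mpidr >> (8 * level)) & 0xFF);
}

/* Mirrors "AND x0, x0, #0xFFFF" followed by "CBZ x0, boot":
 * only the core whose Aff1 and Aff0 are both zero continues booting. */
bool is_boot_core(uint64_t mpidr)
{
    return (mpidr & MPIDR_AFF0_AFF1_MASK) == 0;
}
```

For example, an MPIDR_EL1 value of 0x80000000 (only the reserved bit 31 set) identifies the boot core, whereas 0x80000001 or 0x80000100 would be sent to sleep.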
Since we have added an assembly source file, we must also assemble it into an object file:
$ armclang -c -g --target=aarch64-arm-none-eabi startup.s
To link the resulting "startup.o" file, we must modify the ROM_EXEC region in our scatter file:
ROM_EXEC +0x0
{
    startup.o(BOOT, +FIRST)
    * (+RO)
}
Adding the line startup.o(BOOT, +FIRST) ensures that the BOOT section of our startup file is placed first in the ROM_EXEC region. We can now link the objects, specifying start64 as the entry point, which is the label execution branches to on reset.
$ armlink --scatter=scatter.txt --entry=start64 hello_world.o startup.o
Running the resulting image will now print a single "Hello World" to the console.