Building an ELF Image for an Armv8-A Fixed Virtual Platform

January 26, 2018

This blog post is the first of a short series that explores the process of writing a program for an embedded system. Here we will use Arm's Compiler 6 toolchain to build an executable image for the AEMv8 Base Platform Model. Both are included in DS-5 Ultimate edition, whose command line tool has been used in the examples. If you are following along, run the commands from a directory containing all the files used. For those who prefer using a GUI, there are some thorough tutorials covering the components of DS-5.

Hello World!

Consider a simple C program:

#include <stdio.h>

int main() {
  printf("Hello World\n");
  return 0;
}

Running armclang with the following flags compiles "hello_world.c" and generates an ELF object file, "hello_world.o":

$ armclang -c -g --target=aarch64-arm-none-eabi hello_world.c

Compile only, do not link: -c
Include debug information in the image: -g
Target the Armv8-A AArch64 ABI: --target=aarch64-arm-none-eabi

Next we link the object using armlink. This will generate an ELF image file named "__image.axf".

$ armlink hello_world.o

Since we have not specified an entry point, the entry point defaults to __main() in the Arm libraries. The library code then sets up the execution environment and eventually branches to main().

Default image execution flow

Scattered Memory...

Specifying the Memory Map

You'll find that the image we have generated does not run on the AEMv8-A Base Platform, because the default memory map provided by armlink is not the one used by the model. So we must provide the memory map which, for the AEMv8-A Base Platform, looks something like this:

AEMv8-A memory map

Now we need to pass this to the linker by using a scatter file. In "scatter.txt", we write the following:

ROM_LOAD 0x00000000 0x00010000
  {
    ROM_EXEC +0x0
    {
      * (+RO)
    }

    RAM_EXEC 0x04000000 0x10000
    {
      * (+RW, +ZI)
    }
    ARM_LIB_STACKHEAP 0x04010000 EMPTY 0x10000
    {}
  }

These statements define regions of memory with a purpose, and you can find out more about them in the armlink documentation. Considering them sequentially:

ROM_LOAD 0x00000000 0x00010000
  {...}

This defines a load region, the area of memory that holds the image at reset. The first number gives the starting address of the region, and the second gives its size in bytes.

ROM_EXEC +0x0
  {
    * (+RO)
  }

An execution region is described as a root region if its load-time and execute-time addresses are the same. ROM_EXEC qualifies as a root region because +0x0 places it at an offset of 0 from the start of the load region it belongs to.

RAM_EXEC 0x04000000 0x10000
    {
      * (+RW, +ZI)
    }

RAM_EXEC contains any read-write (RW) or zero-initialised (ZI) data. Since this has been placed in SRAM, its execution address differs from its load address in ROM_LOAD, so it is not a root region.

ARM_LIB_STACKHEAP 0x04010000 EMPTY 0x10000
    {}

This region specifies the placement of the heap, which starts at 0x04010000 and grows upwards, and the stack, which starts at the top of the region (its last byte is at 0x0401FFFF) and grows downwards. The EMPTY attribute reserves 0x10000 bytes of uninitialised memory, starting at 0x04010000. ARM_LIB_STACKHEAP and EMPTY are significant to the linker. However, ROM_LOAD, ROM_EXEC, and RAM_EXEC are arbitrary names and could be changed.
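The address arithmetic behind the scatter file can be checked on a host machine. The following is a minimal sketch in plain C, not part of the Arm toolchain: the type region_t and the helpers region_limit() and regions_overlap() are hypothetical names introduced here for illustration, and the base and size values are copied from "scatter.txt" above.

```c
#include <assert.h>
#include <stdint.h>

/* A half-open address range [base, base + size). Hypothetical helper type. */
typedef struct {
    uint64_t base;
    uint64_t size;
} region_t;

/* First address past the end of the region. */
static uint64_t region_limit(region_t r) {
    assert(r.size <= UINT64_MAX - r.base); /* guard against wrap-around */
    return r.base + r.size;
}

/* Non-zero if the two regions share at least one byte. */
static int regions_overlap(region_t a, region_t b) {
    return a.base < region_limit(b) && b.base < region_limit(a);
}

/* Base addresses and sizes as written in scatter.txt. */
static const region_t ROM_LOAD          = { 0x00000000u, 0x00010000u };
static const region_t RAM_EXEC          = { 0x04000000u, 0x00010000u };
static const region_t ARM_LIB_STACKHEAP = { 0x04010000u, 0x00010000u };
```

With these values, RAM_EXEC ends exactly where ARM_LIB_STACKHEAP begins, and the limit of ARM_LIB_STACKHEAP is 0x04020000, one byte past 0x0401FFFF. A check like regions_overlap() is a cheap way to catch a mistyped base address before handing the scatter file to the linker.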

Rebuild and Run

Now that we have a valid scatter file, we can re-build the image:

$ armlink --scatter=scatter.txt hello_world.o

The image can now be run from the command line using Fast Models:

$ FVP_Base_AEMv8A __image.axf

The model boots multiple cores, and because we did not define a reset handler to initialise the other cores, this can lead to strange or inconsistent behaviour, such as multiple "Hello World" prints. This can be avoided with the model flag -C cluster0.NUM_CORES=1.

Flow of image file data during execution

At reset, the image's code and data are in the ROM_LOAD region. The library function __main() is responsible for copying the RW data to its execution address and zeroing the ZI data, while __rt_entry() sets up the stack and heap. In the documentation, this process is referred to as scatter loading.
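Conceptually, the copy-and-zero step looks something like the sketch below. This is a host-side illustration only, not the Arm C library's implementation: region_desc_t and scatter_load_sketch() are hypothetical names, and the assumption that ZI data directly follows RW data in the execution region is made here for simplicity.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical description of one non-root execution region. */
typedef struct {
    const unsigned char *load_addr; /* RW initial values in the load region */
    unsigned char *exec_addr;       /* where the data must live at run time */
    size_t rw_size;                 /* bytes of initialised (RW) data */
    size_t zi_size;                 /* bytes of zero-initialised (ZI) data */
} region_desc_t;

/* Copy RW data to its execution address, then clear the ZI data after it. */
static void scatter_load_sketch(const region_desc_t *r) {
    assert(r != NULL && r->exec_addr != NULL);
    if (r->rw_size > 0) {
        assert(r->load_addr != NULL);
        memcpy(r->exec_addr, r->load_addr, r->rw_size);
    }
    memset(r->exec_addr + r->rw_size, 0, r->zi_size);
}
```

In the real image, the linker generates the region information and the library code consumes it before main() runs, which is why RW and ZI variables already hold their expected values when the application starts.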

Gotta Start Somewhere

Reset Handlers

An embedded system typically requires some low-level initialization before it can function. Examples include configuring the MMU, installing vector tables, and adjusting system registers. Often this must occur before any other code has executed, so we must define our own entry point for the system, giving the following execution flow:

Execution flow of program with reset handler

By default the model enters the AArch64 execution state in the EL3 exception level, with a reset address of 0x0000_0000. Consider this "startup.s" code:

  .section  BOOT,"ax" // Define an executable ELF section, BOOT
  .align 3                     // Align to 2^3 byte boundary

  .global start64
  .type start64, "function"
start64:


  // Which core am I
  // ----------------
  MRS      x0, MPIDR_EL1
  AND      x0, x0, #0xFFFF     // Mask off to leave Aff0 and Aff1
  CBZ      x0, boot            // If *.*.0.0, continue to boot; otherwise fall through to sleep
sleep:
  WFI
  B        sleep

boot:
  // Disable trapping of CPTR_EL3 accesses or use of Adv.SIMD/FPU
  // -------------------------------------------------------------
  MSR      CPTR_EL3, xzr       // Clear all trap bits

  // Branch to scatter loading and C library init code
  .global  __main
  B        __main

By checking which core the code is running on, we sidestep the issue of multicore boot by sending all but one core to sleep. Additionally, the state of the floating point unit (FPU) in the model is unknown, because the Architectural Feature Trap Register, CPTR_EL3, has no defined reset value. Setting CPTR_EL3 to zero disables trapping of SIMD, FPU, and a few other instructions.
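The core check in "startup.s" can be expressed in C to make the bit manipulation easier to follow. This is a host-side sketch, assuming the MPIDR_EL1 layout with Aff0 in bits [7:0] and Aff1 in bits [15:8]. The names is_primary_core() and decode_affinity() are hypothetical, and the register value is passed in as a plain integer rather than read with MRS.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Same mask as the AND instruction in startup.s: keeps Aff0 and Aff1. */
#define MPIDR_AFF0_AFF1_MASK 0xFFFFu

/* Non-zero when Aff1 and Aff0 are both zero, i.e. the core that boots. */
static int is_primary_core(uint64_t mpidr) {
    return (mpidr & MPIDR_AFF0_AFF1_MASK) == 0;
}

/* Split out the two affinity fields that the mask covers. */
static void decode_affinity(uint64_t mpidr, unsigned *aff0, unsigned *aff1) {
    assert(aff0 != NULL && aff1 != NULL);
    *aff0 = (unsigned)(mpidr & 0xFFu);         /* bits [7:0]  */
    *aff1 = (unsigned)((mpidr >> 8) & 0xFFu);  /* bits [15:8] */
}
```

Note that the mask ignores the higher affinity fields, so any core whose Aff1 and Aff0 are both zero passes the check regardless of Aff2 or Aff3. On a system where those fields are used, a wider mask would be needed to single out one core.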

Rebuilding the Image

Since we have added an assembly source file, we must assemble it for the same target:

$ armclang -c -g --target=aarch64-arm-none-eabi startup.s

To link the resulting "startup.o" file, we must modify the ROM_EXEC region in our scatter file:

ROM_EXEC +0x0
  {
    startup.o(BOOT, +FIRST)
    * (+RO)
  }

Adding the line startup.o(BOOT, +FIRST) ensures that the BOOT section of our startup file is placed first in the ROM_EXEC region. Now we are in a position to link the objects, while specifying an entry label for the linker, which is where execution begins on reset.

$ armlink --scatter=scatter.txt --entry=start64 hello_world.o startup.o

Running the resulting image will now print a single "Hello World" to the console.

Please find the source file below for you to download:

building_elf_blog_source.zip
