Decoding the Startup file for Arm Cortex-M4

Gopal Amlekar
January 5, 2015
7 minute read time.

Introduction

This is my attempt to understand the startup file for an Arm Cortex-M4 processor, specifically the STM32F4 (Cortex-M4) microcontroller. This document should give a feel for Arm assembly language and help in understanding how the Cortex-M4 processor starts. Familiarity with the Cortex-M4 architecture is required to follow it well.

More importantly, I am looking forward to expert comments and corrections, which will help me fill in the gaps in my knowledge.

I am not reproducing the startup code in full here, to avoid clutter. Please refer to the uploaded file. This file is part of the STMicroelectronics software pack used with Keil MDK-Arm, which means it uses the Arm assembler and not the GNU assembler.

Please ignore the line numbers appearing in the code snippets mentioned below. They do not correspond to the line numbers in the startup file.

Organization of the Startup code

The startup code has five parts:

  1. Declaration of the Stack area
  2. Declaration of the Heap area
  3. Vector table
  4. Reset handler code
  5. Other exception handler code

Stack Area

The assembly code is usually divided into different sections by the AREA directive. Let's first look at how the stack area is declared.

Stack_Size     EQU     0x00000400



This line declares a constant called Stack_Size with the value 0x00000400. EQU is an assembler directive similar to the #define preprocessor directive in C.

AREA     STACK, NOINIT, READWRITE, ALIGN=3



Next comes the declaration of the area for the stack, made with the assembler directive AREA. This directive starts a separate section in memory. STACK in this case is just the name of the section. The attributes of the section follow its name.

NOINIT indicates that the data in this section is uninitialized or initialized to zero. Such a section contains only space reservations and no initial values.

READWRITE, as the name implies, allows the section to be read from and written to.

ALIGN=3 aligns the start of this section to an 8-byte boundary (2^3 = 8).

Stack_Mem          SPACE          Stack_Size



This line reserves 0x0400 bytes in the stack area. SPACE is an assembler directive that reserves the specified number of bytes.

__initial_sp is the declaration of a label that is later used in the vector table. The label equates to the next address after the stack space in this area. Since the stack grows downwards, this serves as the initial stack pointer.
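
The relationship between Stack_Mem, Stack_Size and __initial_sp can be sketched in C as a host-side model. This is only an illustration, not part of the startup file: stack_mem, initial_sp() and push_word() are hypothetical names, and the alignment specifier assumes a C11 compiler.

```c
#include <assert.h>
#include <stdalign.h>
#include <stdint.h>
#include <string.h>

#define STACK_SIZE 0x00000400u   /* same value as Stack_Size */

/* Hypothetical stand-in for Stack_Mem: 0x400 bytes on an 8-byte boundary (ALIGN=3). */
static alignas(8) uint8_t stack_mem[STACK_SIZE];

/* Stand-in for __initial_sp: the first address after the reserved space. */
static uintptr_t initial_sp(void)
{
    return (uintptr_t)stack_mem + STACK_SIZE;
}

/* Model of a push on a descending stack: move SP down first, then store the word. */
static uintptr_t push_word(uintptr_t sp, uint32_t value)
{
    assert(sp >= (uintptr_t)stack_mem + sizeof value);   /* room left in the stack */
    sp -= sizeof value;
    memcpy((void *)sp, &value, sizeof value);
    return sp;
}
```

Calling push_word(initial_sp(), some_value) returns an address 4 bytes below initial_sp(), which is why the label placed after the reserved space is the correct starting value for the stack pointer.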

Vector Table

Ignore the heap section for now. Let's now look at the vector table.

The vector table is in a section called RESET. The section is declared by the line:

AREA           RESET,     DATA,     READONLY



RESET is the name of the area. DATA indicates that this section contains data and not instructions. This is true because the vector table contains only the initial stack pointer value and the addresses of the handlers.

READONLY, as the name indicates, marks this area as read-only, so the linker can place it in read-only memory such as flash.

This area is placed at the start of the code region in flash memory, which is at 0x08000000 for this particular device. (Refer to the memory map of the MCU in the datasheet.) This address is specified to the linker, either in a scatter file or through command-line linker options. When the device boots from flash, the STM32 aliases this flash region to address 0x00000000, so the vector table effectively sits at offset 0. Since the Vector Table Offset Register (VTOR) defaults to 0, the processor uses this vector table at startup.

The vector table contains:

  • Initial value of the Stack Pointer
  • Starting address of the reset handler i.e. the code which will be executed on reset
  • Starting addresses of all other exceptions and interrupts including the NMI handler, Hard fault handler and so on.

DCD          __initial_sp



This line stores the value of the label __initial_sp in the RESET area. DCD is an assembler directive that stores a word (32 bits) of data in memory.

DCD          Reset_Handler



Similarly, the next word stored is the address of Reset_Handler. This is a forward reference because the label Reset_Handler is declared further down in the code. (The assembler processes the file in two passes, which lets it resolve such forward references.)

After these come the labels for the starting addresses of the various handlers such as NMI_Handler, HardFault_Handler and so on. The entries up to SysTick_Handler are the Arm processor's own exceptions. After that the table continues with the external interrupts. Here 'external' means external to the Arm processor core and not to the STM32 MCU. These interrupts are connected to the various peripherals in the MCU such as the watchdog, DMA, RTC etc. The list continues up to FPU_IRQHandler (Floating Point Unit IRQ).
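
The layout of the table can be expressed as simple arithmetic: every entry is one 32-bit word, so the entry for exception number n sits at byte offset 4 * n from the start of the table. Below is a minimal sketch in C, assuming the standard Cortex-M exception numbering. The enum and function names are made up for this illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Exception numbers as defined for Cortex-M (only a few are listed here). */
enum {
    EXC_INITIAL_SP = 0,   /* slot 0 holds the initial MSP value, not a handler */
    EXC_RESET      = 1,
    EXC_NMI        = 2,
    EXC_HARDFAULT  = 3,
    EXC_SYSTICK    = 15,
    EXC_FIRST_IRQ  = 16   /* external interrupt 0 */
};

/* Byte offset of a vector table entry: one 32-bit word per exception. */
static uint32_t vector_offset(uint32_t exception_number)
{
    return exception_number * 4u;
}

/* Byte offset of the entry for external interrupt number irq_number. */
static uint32_t irq_vector_offset(uint32_t irq_number)
{
    assert(irq_number < 240u);   /* the Cortex-M4 supports at most 240 interrupts */
    return vector_offset(EXC_FIRST_IRQ + irq_number);
}
```

With this numbering the Reset_Handler entry is at offset 0x04, the SysTick_Handler entry at 0x3C, and the first external interrupt at 0x40.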

The vector table, and especially its first two entries, is essential for the core to start executing a program and to handle PUSH/POP instructions. This is because when the Cortex-M4 starts, it first copies the first entry of the vector table into the stack pointer (the Main Stack Pointer, or MSP). It then copies the next entry into the PC (Program Counter) and execution starts from this address. So we put the address of our reset handler there, which is the first code the core will execute.
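
The reset sequence described above can be modelled in a few lines of C. This is a host-side sketch only: core_regs and model_reset() are hypothetical names, and the addresses used with it are made-up examples. One detail not covered in the text is assumed here: bit 0 of a handler address in the table is the Thumb bit, which selects Thumb state and is not part of the address loaded into the PC.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of the registers the core loads at reset. */
typedef struct {
    uint32_t msp;   /* Main Stack Pointer */
    uint32_t pc;    /* Program Counter */
} core_regs;

/* Model of the reset sequence: word 0 goes to MSP, word 1 goes to PC. */
static core_regs model_reset(const uint32_t vector_table[2])
{
    core_regs regs;

    /* A real core faults on the first instruction if the Thumb bit is clear;
       this model simply asserts. */
    assert((vector_table[1] & 1u) != 0u);

    regs.msp = vector_table[0];
    regs.pc  = vector_table[1] & ~UINT32_C(1);   /* drop the Thumb bit */
    return regs;
}
```

For example, a table whose first two words are 0x20000400 and 0x08000189 would leave MSP at 0x20000400 and start execution at 0x08000188.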

Reset Handler

After the vector table, the actual code starts. It is contained in a CODE area.

AREA    |.text|, CODE, READONLY



This defines an area of memory containing code, marked as read-only. The name of the section is .text by convention but could be anything you wish. The vertical bars around the name are necessary because it does not start with an alphabetic character. This is a requirement of the AREA directive.

In this area the code first calls a function called SystemInit, which sets up the clocks of the MCU, and then calls the main() function by way of __main. Control is thus transferred to main().

IMPORT   SystemInit



This line refers to the function SystemInit, which is defined elsewhere in the project.

IMPORT __main



This line refers to __main in the C library, which eventually calls the main() function defined elsewhere in your project.

If you are using plain assembly, there is no __main, and you will need to place an ENTRY directive in the reset handler. This allows the linker and the debugger to locate the entry point of the program.

LDR     R0, =SystemInit



This is a pseudo-instruction that loads the address of the SystemInit function into R0. The following instruction, BLX R0, branches to that address and saves the return address in the link register.

Similarly, after control returns from SystemInit, the address of __main is loaded and branched to, which leads to the main() function.
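
The order of events in the reset handler can be summarised as a small C model. This is a sketch for illustration only: the functions below are host-side stand-ins that just record when they are called. c_library_entry is a hypothetical name standing in for __main.

```c
#include <assert.h>

enum { STEP_SYSTEM_INIT = 1, STEP_C_LIBRARY_ENTRY = 2 };

static int call_trace[4];   /* records the order of the calls */
static int call_count;

static void record(int step)
{
    assert(call_count < 4);
    call_trace[call_count++] = step;
}

/* Stand-in for SystemInit: on the target this configures the clocks. */
static void SystemInit(void)
{
    record(STEP_SYSTEM_INIT);
}

/* Hypothetical stand-in for __main: on the target this sets up the
   C library and then calls main(). */
static void c_library_entry(void)
{
    record(STEP_C_LIBRARY_ENTRY);
}

/* Model of Reset_Handler: SystemInit first, then the C library entry point. */
static void Reset_Handler(void)
{
    SystemInit();        /* LDR R0, =SystemInit followed by BLX R0; returns here */
    c_library_entry();   /* on the target, control is not expected to come back */
}
```

After Reset_Handler() runs in this model, call_trace holds STEP_SYSTEM_INIT followed by STEP_C_LIBRARY_ENTRY, mirroring the order of the two calls in the startup file.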

Exception Handlers

Once the code starts executing, exceptions may occur, and therefore you need exception handlers. For example, look at the NMI handler.

NMI_Handler     PROC
                EXPORT     NMI_Handler     [WEAK]
                B     .
                ALIGN
                ENDP



The first line, NMI_Handler, is the label for this small function. PROC is an assembler directive that marks the start of a procedure or function.

On the next line, EXPORT makes the label NMI_Handler available to other parts of the program. The attribute [WEAK] is added so that the handler can be redefined elsewhere in the project. This lets you have your own custom handler in your project, and even different handlers for different projects, while keeping the same startup file. This is somewhat similar to virtual functions in C++.

Of course, if you want the same handler in all your projects, you can modify this startup file to call your own function from here, or add your code right here.

By default each handler is just an endless loop, written as the instruction B . (a dot as the operand). This instruction branches to its own address, resulting in an infinite loop.

ENDP denotes the end of the procedure.

ALIGN is an assembler directive that aligns the current location to the next word boundary. NOP instructions (or zero data) are inserted as padding if the current location is not already on the boundary. It can also be used to align to other boundaries, and even to pad with specified data instead of NOPs or zeros.
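
The effect of ALIGN on the current location is a round-up to the next multiple of the boundary. Below is a minimal sketch of that arithmetic in C, assuming the boundary is a power of two. The function names are hypothetical and this is not how the assembler itself is implemented.

```c
#include <assert.h>
#include <stdint.h>

/* Round location up to the next multiple of boundary (a power of two).
   A location that is already aligned is returned unchanged. */
static uint32_t align_up(uint32_t location, uint32_t boundary)
{
    assert(boundary != 0u && (boundary & (boundary - 1u)) == 0u);
    return (location + boundary - 1u) & ~(boundary - 1u);
}

/* Number of padding bytes the assembler would have to insert. */
static uint32_t padding_needed(uint32_t location, uint32_t boundary)
{
    return align_up(location, boundary) - location;
}
```

A location of 0x102 aligned to a word boundary becomes 0x104 with 2 bytes of padding, while 0x104 stays where it is and needs no padding.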

The same handler pattern is used for all the processor exceptions.

For the external interrupt handlers, the startup file defines only one procedure, Default_Handler (the same endless loop). All the external interrupt handler labels are defined at the same address as Default_Handler. This means that for any interrupt raised by the MCU peripherals, the code will execute Default_Handler. Again, all of these are exported as weak, so you can redefine them in your project.

Note that even Reset_Handler is exported as weak, so you can have your own reset handler if you wish.
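
The combination of a shared Default_Handler and weak exports has a close counterpart in C. The sketch below assumes GCC or Clang on an ELF target, since the weak and alias attributes are compiler extensions. Unlike the real handler, this Default_Handler returns and counts its calls so that the behaviour can be observed on a host machine. The two interrupt handler names are used here only as examples.

```c
#include <assert.h>   /* for the check in the usage note */

static int default_calls;   /* only here so the sketch can be observed */

/* Shared fallback. The real one is an endless loop ("B ."); this model returns. */
void Default_Handler(void)
{
    default_calls++;
}

/* Weak aliases: each name resolves to Default_Handler unless the project
   provides an ordinary (strong) definition with the same name. */
void WWDG_IRQHandler(void)     __attribute__((weak, alias("Default_Handler")));
void RTC_WKUP_IRQHandler(void) __attribute__((weak, alias("Default_Handler")));
```

With no overrides present, calling WWDG_IRQHandler() and RTC_WKUP_IRQHandler() both end up in Default_Handler, so assert(default_calls == 2) holds afterwards. Defining a non-weak WWDG_IRQHandler in another source file would replace the alias at link time without any change to this file.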

Heap Area

The heap section is defined in the same way as the stack area. The two labels __heap_base and __heap_limit mark the start and the end of the heap area respectively. If the Arm MicroLib is used, the labels for the initial stack pointer and for the start and end of the heap are simply exported. Otherwise they need to be handled differently. I have yet to explore this in depth, so I will add more details later.

Miscellaneous

Two more directives in the startup file are worth mentioning.

PRESERVE8

This directive specifies that the code in this file preserves 8-byte alignment of the stack, which the linker can then check. This alignment is a requirement of the Procedure Call Standard for the Arm Architecture (AAPCS).

THUMB

This selects the Thumb instruction set, which is the only one available on Cortex-M processors, since they do not support the Arm instruction set.

I hope this information will be useful in understanding a bit of the processor and startup code.

Any comments and especially corrections are welcome.

Learn more about Cortex-M

startup_stm32f40xx.s.zip

Top Comments

  • Jens Bauer
    Jens Bauer over 10 years ago +1
    Nice article. I can't write so much comments, because I am using the GNU assembler myself, and it differs a bit from the one you're using (I believe that's Arm's own assembler). But one thing I will...
  • Jens Bauer
    Jens Bauer over 10 years ago +1
    I think I wrote the above comment a bit too quickly. The loop would look like this: 1:      wfi         b       1b -That will branch back to the wfi, so when we actually get an interrupt, we'll go back...
  • Yashwant Rao
    Yashwant Rao over 7 years ago

    Hello Mr. Gopal

    Thank you for writing such a good tutorial,  the topic is really worth to write and read as well.

    __initial_sp                                            // declaration of a label

    DCD          __initial_sp   // stores the value of label __initial_sp in the RESET area

    The above two statement  is  clear to me.

    My question is that what is the value of label and where it is assigned??

  • Jens Bauer
    Jens Bauer over 8 years ago

    I agree that it's not easy to find information on this.

    If you have STM32CUBE, you can look in how it's done there.

    (For instance, see STM32CUBE/STM32F4xx/CMSIS/Device/Source/Templates/iar/*.s)

    For IAR, ST use ...

    SECTION .intvec:CODE:NOROOT(2)

    ... to tell the assembler where to put the vector table.

    The very first 32-bit word of this vector table is the value that the stack pointer should be initialized to.

    The second 32-bit word is the RESET handler (which will normally at some point call the main() routine).

    The next 14 32-bit words are NVIC vectors and after that, the IRQ vectors follow.

    Each microcontroller type has a different number of IRQ vectors, so the IRQ vector table for a STM32F103 can not be used on a STM32F427 for instance.

  • Jens Bauer
    Jens Bauer over 8 years ago

    Hi almousa and welcome to the community!

    If you just have 3 sections, TEXT, DATA and BSS, it's up to the linker to put the DATA sections together in the order that the files are linked. That means you need to hope for the best.

    Fortunately, we're not restricted to this. These days, we can write a linker-script.

    In our linker-script, we can invent new sections and we can also specify the exact order of the sections if we wish (and we do).

    One of my first documents on the Arm Connected Community was this document:

    Writing your own startup code for Cortex-M

    -If you look just below "The Exception Vector Table", you'll see that I specify the exception vectors to go in the "isr_vector" section.

    The linker script then needs to specify that the very first section to go in the beginning of Flash-memory, must be the "isr_vector" section.

    My document was written for the GNU toolchain, but if using other toolchains - like Keil's, the way it's done is very similar.

  • Hassan
    Hassan over 8 years ago

    Very nice and informative post , thanks a lot.

    I am yet confused about one thing , how does the assembler/linker know that it should place the DATA area named :Reset at the reset area of memory.

    Ok , the info related to the reset area address is set in the linker based on device type , but when we define a data area in the startup file , how does the linker know that this specific data area should go to the reset address ?

  • Ilya
    Ilya over 9 years ago

    It's still not entirely clear. What if I am writing in assembly and there is no __main and no SysInit? Where is the right place to put ENTRY? In general there is very little info about this on the internet((
