This is my attempt to understand the startup file for an Arm Cortex-M4 based MCU, specifically the STM32F4. This document should give a feel for Arm assembly language and explain how the Cortex-M4 processor starts. Some familiarity with the Cortex-M4 architecture is needed to follow it.
More importantly, I am looking forward to expert comments and corrections, which will help me fill in the gaps in my knowledge.
I am not reproducing the startup code entirely here, to avoid clutter. Please refer to the file uploaded. This file is part of the STMicroelectronics software pack for KEIL MDK-Arm, which means it uses the syntax of the Arm assembler and not that of the GNU assembler.
Please ignore the line numbers appearing in the code snippets mentioned below. They do not correspond to the line numbers in the startup file.
There are 5 parts of the startup code.
The assembly code is usually divided into different sections by the AREA directive. Let's first look at how the stack area is declared.
Stack_Size EQU 0x00000400
This line declares a constant called Stack_Size with the value 0x00000400. EQU is an assembler directive which is similar to the #define pre-processor directive in the C language.
AREA STACK, NOINIT, READWRITE, ALIGN=3
Next, this is a declaration of the area for Stack. This is done by the assembler directive AREA. This directive denotes a separate section in the memory. STACK in this case is just the name of the section. Following the name of the section are some attributes for this section.
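NOINIT indicates that the data in this section is uninitialized or initialized to zero. The section contains only space reservations, and whether it is left uninitialized or zeroed is decided at link time.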
READWRITE, as the name implies, means this section can be both read and written.
ALIGN=3 aligns the start of this section to an 8-byte boundary (2^3 = 8).
Stack_Mem SPACE Stack_Size
This line reserves 0x0400 bytes (1 KB) in the stack area. SPACE is an assembler directive which simply reserves the specified number of bytes.
__initial_sp is a label declared right after the reserved space, and it is later used in the vector table. This label equates to the first address after the stack space in this area. Since the stack grows downwards, this serves as the initial stack pointer.
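As a rough C analogy for the stack declaration above, the following host-side sketch reserves an aligned block and computes the address just past its end. It is not what the startup file does; the names stack_mem, stack_base and initial_sp are assumptions chosen to mirror the assembler labels.

```c
#include <stdint.h>
#include <stdalign.h>

#define STACK_SIZE 0x00000400u  /* mirrors: Stack_Size EQU 0x00000400 */

/* Mirrors: Stack_Mem SPACE Stack_Size, inside an area with ALIGN=3 (8 bytes). */
static alignas(8) uint8_t stack_mem[STACK_SIZE];

/* Start address of the reserved block. */
uintptr_t stack_base(void)
{
    return (uintptr_t)stack_mem;
}

/* Mirrors the __initial_sp label: the first address after the reserved
   space. The stack grows downwards from here into stack_mem. */
uintptr_t initial_sp(void)
{
    return (uintptr_t)(stack_mem + STACK_SIZE);
}
```

Because both the size and the start address are multiples of 8, the initial stack pointer is 8-byte aligned as well.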
Ignore the heap section for now. Let's now look at the vector table.
The vector table is in a section called RESET. The section is declared by the line:
AREA RESET, DATA, READONLY
RESET is the name of the area. DATA indicates that this section will contain data and not instructions. This is true because the vector table contains only the addresses of the handlers and initial stack pointer value.
READONLY, as the name indicates, marks this area as read-only, so the linker can place it in flash where the program cannot overwrite it.
This area is placed at the start of the flash memory, which is at 0x08000000 for this particular device. (Refer to the memory map of the MCU in the datasheet.) This address is specified in the linker options, either in a scatter file or by command line linker options. On the STM32F4, when the device boots from main flash, the flash is also aliased at address 0x00000000, so the vector table appears at address 0. Since the vector table offset register VTOR resets to 0, the processor uses this vector table at startup.
The vector table contains:
DCD __initial_sp
This line stores the value of the label __initial_sp in the RESET area. DCD is an assembler directive which stores a 32-bit word in memory.
DCD Reset_Handler
Similarly the next word stored is the address of Reset_Handler. This is a forward reference because the label Reset_Handler is declared somewhere down the code. (The assembler processes the file in two passes which helps it to resolve such forward references).
Following these are labels giving the start addresses of the various handlers, such as NMI_Handler, HardFault_Handler and so on. The entries up to SysTick_Handler are the Arm processor's own exceptions. After that the table continues with the external interrupts. Here 'external' means external to the Arm core, not external to the STM32 MCU. These interrupts are connected to various peripherals in the MCU such as the watchdog, DMA, RTC etc. The list continues up to FPU_IRQHandler (Floating Point Unit IRQ).
The vector table, and especially its first two entries, are essential for the core to start executing a program and to have a valid stack for PUSH/POP instructions. This is because when the Cortex-M4 comes out of reset, it first loads the first entry of the vector table into the stack pointer (the Main Stack Pointer, MSP). It then loads the next entry into the PC (program counter), and execution starts from this address. So we place the address of our Reset_Handler there, which is the first code it will execute.
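The reset sequence described above can be modelled on a host machine with a few lines of C. This is only a sketch of the behaviour; the type and function names (vector_table_head, core_regs, simulate_reset) are hypothetical and do not exist in any Arm or ST library.

```c
#include <stdint.h>

/* Hypothetical model of the first two vector table entries. */
typedef struct {
    uint32_t initial_sp;     /* word 0: loaded into MSP */
    uint32_t reset_handler;  /* word 1: loaded into PC  */
} vector_table_head;

/* Hypothetical model of the two core registers involved. */
typedef struct {
    uint32_t msp;
    uint32_t pc;
} core_regs;

/* Models what the hardware does on reset: both words are fetched
   before the first instruction is executed. */
core_regs simulate_reset(const vector_table_head *vt)
{
    core_regs regs;
    regs.msp = vt->initial_sp;
    /* Bit 0 of a vector entry is the Thumb bit and must be 1 on Cortex-M.
       It is not part of the fetch address, so it is masked off for PC. */
    regs.pc = vt->reset_handler & ~1u;
    return regs;
}
```

For example, with the illustrative (made-up) entries 0x20000400 and 0x08000189, the model ends up with MSP = 0x20000400 and PC = 0x08000188.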
After defining the vector table, actual code starts. This is contained in a CODE region.
AREA |.text|, CODE, READONLY
This defines an area of memory containing code, marked as read-only so that it cannot be overwritten by the program itself. The section is named .text by convention, but the name could be anything you wish. The vertical bars around the name are necessary because it does not start with an alphabetic character. This is a requirement of the AREA directive.
In this region the code first calls a function called SystemInit, which sets up the system clocks of the MCU, and then branches to __main, which in turn calls the main() function. Thus control is transferred to main().
IMPORT SystemInit
This line tells the assembler that the symbol SystemInit is defined elsewhere in the project.
IMPORT __main
This line refers to __main in the C library, which initializes the C runtime environment and eventually calls the main() function defined elsewhere in your project.
If you are using plain assembly, there is no __main, so you will need to place an ENTRY directive in the reset handler. This allows the linker and the debugger to locate the entry point of the program.
LDR R0, =SystemInit
is a pseudo-instruction which loads the address of the SystemInit function into R0. The following instruction, BLX R0, then branches to that address and saves the return address in LR so that SystemInit can return.
Similarly, after control returns from SystemInit, the address of __main is loaded and branched to, which eventually reaches the main() function.
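The flow of the reset handler can be sketched in C with function pointers standing in for R0. This is a host-side model only: fake_SystemInit and fake_main_entry are assumed stand-ins for the real SystemInit (from the device support code) and __main (from the Arm C library), and they just record that they were called.

```c
/* Identifiers for the calls recorded by the stand-in functions. */
enum { CALL_SYSTEM_INIT = 1, CALL_MAIN_ENTRY = 2 };

static int call_log[4];
static int call_count = 0;

static void log_call(int id)
{
    if (call_count < 4) {
        call_log[call_count++] = id;
    }
}

/* Returns the i-th recorded call, or 0 if nothing was recorded there. */
int logged_call(int i)
{
    return (i >= 0 && i < call_count) ? call_log[i] : 0;
}

/* Stand-ins (assumptions), see the note above the code. */
static void fake_SystemInit(void) { log_call(CALL_SYSTEM_INIT); }
static void fake_main_entry(void) { log_call(CALL_MAIN_ENTRY); }

/* Models the reset handler: load an address, branch to it, repeat. */
void reset_handler_model(void)
{
    void (*target)(void);

    target = fake_SystemInit;  /* LDR R0, =SystemInit */
    target();                  /* BLX R0: call, then return here */

    target = fake_main_entry;  /* LDR R0, =__main */
    target();                  /* on the target this does not return */
}
```

After reset_handler_model() runs, the log holds CALL_SYSTEM_INIT followed by CALL_MAIN_ENTRY, which is the order described above.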
Once the code starts executing, exceptions might occur and therefore you need exception handlers. For example, look at the NMI handler.
NMI_Handler PROC
            EXPORT NMI_Handler [WEAK]
            B .
            ALIGN
            ENDP
The first line NMI_Handler is the label for this small function. PROC is an assembly directive which defines start of a procedure or a function.
The next line, EXPORT, makes the label NMI_Handler available to other parts of the program. The attribute [WEAK] is added so that the handler can be redefined elsewhere in the project: if the linker finds another, non-weak definition of NMI_Handler, it uses that one instead. This lets you have your own custom handler in your project, and even different handlers for different projects, while keeping the same startup file. This is loosely similar to overriding a virtual function in C++, although here the choice is made once, at link time.
Of course if you want to have the same handler for all your projects, then this startup file can be modified to call your own function from here or add your code here itself.
By default the handlers are defined only as an endless loop by the instruction B . This instruction branches to its own address, resulting in an infinite loop.
ENDP denotes end of the procedure.
ALIGN is an assembler directive which aligns the current location to the next word (4-byte) boundary. Padding (NOP instructions or zero data) is inserted to achieve this if the current location is not already on the boundary. It can also be used to align to other boundaries and to pad with specified data instead of just NOP or zero data.
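The arithmetic behind alignment is easy to show in C. The sketch below is illustrative only; align_up and padding_needed are hypothetical helper names, and the boundary is assumed to be a power of two.

```c
#include <stdint.h>

/* Rounds addr up to the next multiple of boundary.
   boundary must be a power of two (4 for a word boundary). */
uint32_t align_up(uint32_t addr, uint32_t boundary)
{
    return (addr + boundary - 1u) & ~(boundary - 1u);
}

/* Number of padding bytes needed to reach the boundary.
   This is zero when addr is already aligned. */
uint32_t padding_needed(uint32_t addr, uint32_t boundary)
{
    return align_up(addr, boundary) - addr;
}
```

For instance, a location of 0x08000192 needs 2 bytes of padding to reach the word boundary at 0x08000194, while 0x08000194 itself needs none.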
This handler code is used for all the processor exceptions.
For the external interrupt handlers, the startup file defines just one procedure, Default_Handler (the same endless loop). All the external interrupt handler labels are defined at the same address as Default_Handler. This means that any interrupt from the MCU peripherals will execute Default_Handler unless you provide your own handler for it. Again, all these labels are exported as weak so you can redefine them in your project.
Note that even the Reset_Handler is also exported as weak so you can have your own reset handler if you wish.
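If your own handlers are written in C, the counterpart of [WEAK] is the weak attribute. The sketch below assumes a GCC or Clang compatible compiler, since the attribute is a compiler extension and not standard C. To keep it runnable on a host, the default handler counts its calls instead of spinning in an endless loop; default_handler_hits is a hypothetical helper added for that purpose.

```c
static int default_hits = 0;

/* Weak default: the linker uses this definition only if no other
   (non-weak) definition of NMI_Handler exists in the project.
   The startup file spins in "B ." here; this host-side stand-in
   counts calls instead so that it can return. */
__attribute__((weak)) void NMI_Handler(void)
{
    default_hits++;
}

/* Reports how many times the weak default ran. */
int default_handler_hits(void)
{
    return default_hits;
}
```

To override it, define an ordinary void NMI_Handler(void) without the attribute in another source file of the project; the linker then picks that definition and the weak one is discarded.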
The heap section is defined similarly to the stack area. The two labels __heap_base and __heap_limit mark the start and the end of the heap area respectively. If the Arm MicroLib is used, the labels for the initial stack pointer and the start and end of the heap area are simply exported. Otherwise they need to be handled differently. I have yet to explore this in depth, so I will add more details later.
Two more directives in the startup file are worth mentioning.
PRESERVE8
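This directive declares that the code in this file preserves 8-byte alignment of the stack, so that the linker can check it against code that relies on this. Keeping the stack 8-byte aligned at function call boundaries is a requirement of the Arm Architecture Procedure Call Standard (AAPCS).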
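What 8-byte alignment means in practice can be checked with simple arithmetic. The following is a host-side sketch with hypothetical helper names (sp_after_push, is_8_byte_aligned); it only models how pushing 32-bit registers moves the stack pointer.

```c
#include <stdint.h>
#include <stdbool.h>

/* Stack pointer after pushing n_regs 32-bit registers on a
   descending stack (each push moves SP down by 4 bytes). */
uint32_t sp_after_push(uint32_t sp, unsigned n_regs)
{
    return sp - 4u * n_regs;
}

/* True when the low three address bits are zero. */
bool is_8_byte_aligned(uint32_t sp)
{
    return (sp & 7u) == 0u;
}
```

Starting from an aligned value such as 0x20000400, pushing one register leaves the stack pointer misaligned, while pushing two keeps it aligned. Code that preserves alignment therefore adjusts the stack in multiples of 8 bytes before calling other functions.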
THUMB
This selects the Thumb instruction set, which is the only one available on Cortex-M processors since they do not support the Arm instruction set.
I hope this information will be useful in understanding a bit of the processor and startup code.
Any comments and especially corrections are welcome.
Thanks jensbauer for your comments.
Yes it is Arm assembler used with KEIL. I will add this information.
However, this is not my code. It is supplied by ST Micro and looks more as a starting point only. I tried to decode it just in order to understand and learn.
That said, I will keep in mind your suggestion about indefinite loop with branch and replacing it with WFI when I enter the development phase.
Regards,
Gopal
Nice article. I can't comment on much of it, because I am using the GNU assembler myself, and it differs a bit from the one you're using (I believe that's Arm's own assembler).
But one thing I will recommend, is that your 'endless loop' is changed to branch back to a WFI.
The line "b ." will consume a lot of power. WFI (Wait For Interrupt) will sleep the CPU until an interrupt arrives.
You can try measuring the difference in current usage for your device.