This is my attempt to understand the startup file for an Arm Cortex-M4 based microcontroller, specifically the STM32F4. This document should give a feel for Arm assembly language and for how a Cortex-M4 processor starts. Some familiarity with the Cortex-M4 architecture is needed to follow it.
More importantly, I am looking forward to expert comments and corrections which will help me fill in the gaps in my knowledge.
I am not reproducing the startup code entirely here to avoid clutter. Please refer to the file uploaded. This file is part of the STMicroelectronics software pack along with KEIL MDK-Arm which means it uses the Arm assembler and not the GNU assembler.
Please ignore the line numbers appearing in the code snippets mentioned below. They do not correspond to the line numbers in the startup file.
There are 5 parts of the startup code.
The assembly code is usually divided into different sections by the AREA directive. Let's first look at how the stack area is declared.
Stack_Size EQU 0x00000400
This line declares a constant called Stack_Size with the value 0x00000400. EQU is an assembler directive which is similar to the #define preprocessor directive in C.
AREA STACK, NOINIT, READWRITE, ALIGN=3
Next, this is a declaration of the area for Stack. This is done by the assembler directive AREA. This directive denotes a separate section in the memory. STACK in this case is just the name of the section. Following the name of the section are some attributes for this section.
NOINIT indicates that the section is uninitialized or zero-initialized: it only reserves space and stores no initial values in the image.
READWRITE, as the name implies, allows this section to be both read and written.
ALIGN=3 aligns the start of this section to an 8-byte boundary (2^3 = 8).
Stack_Mem SPACE Stack_Size
This line reserves 0x0400 bytes (1 KB) in the stack area. SPACE is an assembler directive which reserves a block of memory of the specified size in bytes.
__initial_sp is the declaration of a label which is later used in the vector table. This label will equate to the next address after the stack space in this area. Since the stack grows downwards, this serves as the initial stack pointer.
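As a rough host-side illustration of this arithmetic, the following C sketch computes where __initial_sp ends up and what a push does to the stack pointer. The function names are hypothetical, and the base address is an input because the linker, not the startup file, decides where the STACK area is really placed.

```c
#include <stdint.h>

/* Mirrors "Stack_Size EQU 0x00000400" from the startup file. */
#define STACK_SIZE 0x00000400u

/* Value of the __initial_sp label: the first address past the
 * reserved block, given where the linker placed Stack_Mem.
 * (stack_mem_base is an assumption supplied by the caller.) */
uint32_t initial_sp(uint32_t stack_mem_base)
{
    return stack_mem_base + STACK_SIZE;
}

/* The stack is full-descending: pushing one 32-bit word first
 * decrements SP by 4, then stores the word at the new address. */
uint32_t sp_after_push_word(uint32_t sp)
{
    return sp - 4u;
}
```

For example, if Stack_Mem happened to be placed at 0x20000000 (an assumed address in SRAM), initial_sp(0x20000000) gives 0x20000400, and the first pushed word would land at 0x200003FC.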
Ignore the heap section for now. Let's now look at the vector table.
The vector table is placed in a section called RESET. The section is declared by this line:
AREA RESET, DATA, READONLY
RESET is the name of the area. DATA indicates that this section will contain data and not instructions. This is true because the vector table contains only the addresses of the handlers and initial stack pointer value.
READONLY as the name indicates protects this area from being overwritten by the program code.
This area is placed at the start of the flash memory, which is at 0x08000000 for this particular device (refer to the memory map in the MCU datasheet). This address is specified in the linker options, either in a scatter file or on the linker command line. When the device boots from flash, the STM32 also aliases the flash to address 0x00000000, so the vector table appears at offset 0. Since the vector table offset register VTOR defaults to 0 after reset, the processor uses this vector table at startup.
The vector table begins with:
DCD __initial_sp
This line stores the value of the label __initial_sp in the RESET area. DCD is an assembler directive which stores one or more 32-bit words in memory.
Similarly the next word stored is the address of Reset_Handler. This is a forward reference because the label Reset_Handler is declared somewhere down the code. (The assembler processes the file in two passes which helps it to resolve such forward references).
Following these are the labels which are the start addresses of the various handlers such as NMI_Handler, HardFault_Handler and so on. The entries up to SysTick_Handler are the Arm processor's exceptions. After that the table continues with the external interrupts. Here 'external' means external to the Arm core, not external to the STM32 MCU. These interrupts are connected to various peripherals in the MCU such as the watchdog, DMA, RTC etc. The list continues up to FPU_IRQHandler (Floating Point Unit IRQ).
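Because every entry in the table is one 32-bit word, the position of a handler's entry follows directly from its exception number. The C sketch below works this out; the enum and function names are hypothetical, while the exception numbers themselves are the ones defined by the Cortex-M architecture.

```c
#include <stdint.h>

/* Exception numbers defined by the Cortex-M architecture.
 * Slot 0 of the table is not an exception: it holds the initial SP. */
enum {
    EXC_RESET     = 1,
    EXC_NMI       = 2,
    EXC_HARDFAULT = 3,
    EXC_SYSTICK   = 15,
    EXC_IRQ0      = 16   /* first external interrupt */
};

/* Byte offset of an exception's entry from the start of the table:
 * each entry occupies one 32-bit word. */
uint32_t vector_offset(uint32_t exception_number)
{
    return exception_number * 4u;
}

/* Byte offset of the entry for external interrupt line irq_number
 * (0, 1, 2, ...), which follows the 16 core entries. */
uint32_t irq_vector_offset(uint32_t irq_number)
{
    return vector_offset(EXC_IRQ0 + irq_number);
}
```

So the Reset_Handler entry sits at offset 0x04, the SysTick_Handler entry at 0x3C, and the first external interrupt entry at 0x40.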
The vector table, and especially its first two entries, is essential for the core to start executing a program and to handle PUSH/POP instructions. This is because when the Cortex-M4 starts, it first copies the first entry in the vector table to the stack pointer (the Main Stack Pointer, MSP). It then copies the next entry into the PC (program counter) and execution starts from this address. So we place the address of our Reset_Handler there, which is the first code the core will execute.
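The reset behaviour can be mimicked on a host with a few lines of C. This is only a model under stated assumptions: the struct and function names are hypothetical, and the example values in the usage note are made up. It also shows one detail not covered above: bit 0 of the reset vector is the Thumb bit, which goes into the execution state rather than into the PC.

```c
#include <stdint.h>

typedef struct {
    uint32_t msp;    /* Main Stack Pointer after reset */
    uint32_t pc;     /* address of the first instruction fetched */
    uint32_t thumb;  /* bit 0 of the reset vector (must be 1 on Cortex-M) */
} reset_state_t;

/* Model of what the core does with the first two table entries. */
reset_state_t model_reset(const uint32_t vector_table[2])
{
    reset_state_t s;
    s.msp   = vector_table[0];        /* entry 0: initial stack pointer */
    s.thumb = vector_table[1] & 1u;   /* entry 1, bit 0: Thumb state    */
    s.pc    = vector_table[1] & ~1u;  /* entry 1, remaining bits: PC    */
    return s;
}
```

With a table whose first two words are 0x20000400 and 0x08000189 (assumed values), the model yields MSP = 0x20000400 and PC = 0x08000188 with the Thumb bit set.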
After defining the vector table, actual code starts. This is contained in a CODE region.
AREA |.text|, CODE, READONLY
This defines an area of memory containing code, marked as read-only to avoid it being overwritten by the program itself. The section is named .text by convention, but the name could be anything you wish. The vertical bars around the name are necessary because the name does not start with a letter. This is a requirement of the assembler.
In this region the code first calls a function called SystemInit, which initializes the clock speed of the MCU, and then calls the main() function. Control is thus transferred to main().
IMPORT SystemInit refers to the function SystemInit defined elsewhere in the project.
IMPORT __main refers to __main in the C library, which eventually calls the main() function defined elsewhere in your project.
If you are using plain assembly, you will need to place an ENTRY directive in the reset handler in the absence of __main. This allows the linker and the debugger to locate the entry point of the program.
LDR R0, =SystemInit
is a pseudo-instruction which loads the address of the SystemInit function into R0. The following instruction, BLX R0, then branches to that address and saves the return address in LR so that SystemInit can return.
Similarly, after control returns from SystemInit, the address of __main is loaded and branched to, which in turn calls the main() function.
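In C terms, loading an address into a register and branching to it is a call through a function pointer. The sketch below models the flow of the reset handler on a host; SystemInit_stub and main_entry_stub are hypothetical stand-ins for the real SystemInit and __main, and the log array exists only so the call order can be observed.

```c
#include <stddef.h>

typedef void (*entry_fn)(void);

/* Records the order in which the stand-in functions ran. */
static const char *call_log[4];
static size_t call_count;

/* Hypothetical stand-ins for the real SystemInit and __main. */
void SystemInit_stub(void) { call_log[call_count++] = "SystemInit"; }
void main_entry_stub(void) { call_log[call_count++] = "__main"; }

void reset_handler_model(void)
{
    entry_fn r0;

    r0 = SystemInit_stub;   /* LDR R0, =SystemInit                   */
    r0();                   /* BLX R0: call, then continue from here */

    r0 = main_entry_stub;   /* load the address of __main            */
    r0();                   /* branch to it; on the target control
                               normally never comes back, whereas in
                               this model the stub simply returns    */
}
```

After one call to reset_handler_model(), the log holds "SystemInit" followed by "__main", which is the order the startup code relies on.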
Once the code starts executing, exceptions may occur, and therefore you need exception handlers. For example, look at the NMI handler.
EXPORT NMI_Handler [WEAK]
The first line NMI_Handler is the label for this small function. PROC is an assembly directive which defines start of a procedure or a function.
Next line EXPORT makes this label NMI_Handler available to other parts of the program. The attribute [WEAK] is added so that the handler can be redefined elsewhere in the project. This helps you to have your own custom handler in your project and even different handlers for different projects but still keep the same startup file. This is something similar to the virtual functions in C++.
Of course if you want to have the same handler for all your projects, then this startup file can be modified to call your own function from here or add your code here itself.
By default the handlers are defined only as an endless loop by the instruction B . This instruction branches to its own address, resulting in an infinite loop.
ENDP denotes end of the procedure.
ALIGN is an assembler directive which aligns the current location to the next word boundary. NOP instructions (or zero data) are inserted to achieve this if the current location is not already on the boundary. It can also be used to align to other boundaries and to pad with specified data instead of NOPs or zeros.
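The amount of padding needed follows from simple address arithmetic. Here is a small C sketch of it; the function names are hypothetical and the boundary is assumed to be a power of two, as it is for a word (4-byte) boundary.

```c
#include <stdint.h>

/* Round addr up to the next multiple of boundary.
 * boundary must be a power of two (4 for a word boundary). */
uint32_t align_up(uint32_t addr, uint32_t boundary)
{
    return (addr + boundary - 1u) & ~(boundary - 1u);
}

/* Number of padding bytes that would have to be emitted at addr
 * to reach the boundary; 0 if addr is already aligned. */
uint32_t padding_bytes(uint32_t addr, uint32_t boundary)
{
    return align_up(addr, boundary) - addr;
}
```

For instance, at the assumed address 0x08000192 two bytes of padding are needed to reach the word boundary 0x08000194, while at 0x08000190 nothing is inserted.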
The same handler pattern is used for all the processor exceptions.
For the external interrupt handlers, the startup file defines just one procedure (the same endless loop), Default_Handler. All the external interrupt handler labels are placed at this same Default_Handler. This means that for any interrupt from the MCU peripherals, the code will execute Default_Handler. Again, all these are exported as weak so you can redefine them in your project.
Note that even Reset_Handler is exported as weak, so you can have your own reset handler if you wish.
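The effect of a weak export can be pictured as a choice the linker makes for each symbol. The C sketch below is a host-side model of that choice only, not how a linker is implemented; all names in it are hypothetical, and NULL stands for "the project provides no definition of its own".

```c
#include <stddef.h>

typedef void (*handler_t)(void);

/* Stand-in for the startup file's Default_Handler. The real one is
 * an endless loop ("B ."); this one returns so it can run on a host. */
void Default_Handler_model(void) { }

/* Hypothetical user-supplied handler, i.e. a non-weak definition. */
void My_USART1_IRQHandler(void) { }

/* Model of the linker's decision for one symbol: use the strong
 * definition if the project provides one, otherwise fall back to
 * the weak one from the startup file. */
handler_t resolve_handler(handler_t strong_def, handler_t weak_def)
{
    return (strong_def != NULL) ? strong_def : weak_def;
}
```

Resolving with no user definition yields the default handler, and resolving with My_USART1_IRQHandler yields the user's function, which is why the same startup file can be reused unchanged across projects.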
The heap section is defined similarly to the stack area. The two labels __heap_base and __heap_limit mark the start and the end of the heap area respectively. If using the Arm MicroLIB, the labels for the initial stack pointer and the start and end of the heap area are simply exported. Otherwise they need to be handled differently. I have yet to explore this in depth, so I will add more details later.
Two more directives in the startup file are worth mentioning.
PRESERVE8 tells the linker that this file preserves 8-byte alignment of the stack. This is a requirement of the Procedure Call Standard for the Arm Architecture (AAPCS).
THUMB indicates Thumb mode, which is the only mode available on Cortex-M processors since they do not support the Arm instruction set.
I hope this information will be useful in understanding a bit of the processor and startup code.
Any comments and especially corrections are welcome.
Hello Mr. Gopal
Thank you for writing such a good tutorial; the topic is really worth writing about and reading as well.
__initial_sp // declaration of a label
DCD __initial_sp // stores the value of label __initial_sp in the RESET area
The above two statements are clear to me.
My question is: what is the value of the label, and where is it assigned?
I agree that it's not easy to find information on this.
If you have STM32CUBE, you can look in how it's done there.
(For instance, see STM32CUBE/STM32F4xx/CMSIS/Device/Source/Templates/iar/*.s)
For IAR, ST use ...
... to tell the assembler where to put the vector table.
The very first 32-bit word of this vector table is the value that the stack pointer should be initialized to.
The second 32-bit word is the RESET handler (which will normally at some point call the main() routine).
The next 14 32-bit words are the system exception vectors (NMI through SysTick, including reserved entries), and after that the IRQ vectors follow.
Each microcontroller type has a different number of IRQ vectors, so the IRQ vector table for a STM32F103 can not be used on a STM32F427 for instance.
Hi almousa and welcome to the community!
If you just have 3 sections, TEXT, DATA and BSS, it's up to the linker to put the DATA sections together in the order that the files are linked. That means you need to hope for the best.
Fortunately, we're not restricted to this. These days, we can write a linker-script.
In our linker-script, we can invent new sections and we can also specify the exact order of the sections if we wish (and we do).
One of my first documents on the Arm Connected Community was this document:
Writing your own startup code for Cortex-M
If you look just below "The Exception Vector Table", you'll see that I specify the exception vectors to go in the "isr_vector" section.
The linker script then needs to specify that the very first section to go in the beginning of Flash-memory, must be the "isr_vector" section.
My document was written for the GNU toolchain, but if using other toolchains - like Keil's, the way it's done is very similar.
Very nice and informative post, thanks a lot.
I am still confused about one thing: how does the assembler/linker know that it should place the DATA area named RESET at the reset area of memory?
OK, the information about the reset area address is set in the linker based on the device type, but when we define a data area in the startup file, how does the linker know that this specific data area should go to the reset address?
It is still not entirely clear. What if I am writing in assembly and there is no __main or SysInit? Where is the right place to put ENTRY? In general, there is very little information about this on the internet.