Useful tips to develop secure software on ARMv8-M

November 1, 2016

7 minute read time.

Arm TrustZone technology is a system-wide approach to security for system-on-chip (SoC) designs. It is hardware-based security built into the heart of CPUs and systems and used by semiconductor chip designers who want to provide security to devices, such as root of trust. TrustZone technology is available on any ARM Cortex-A based system, and now with the Cortex-M23 and Cortex-M33 processors, it is also available on the latest Cortex-M based systems (see Thomas Ensergueix blog on Cortex-M23 and Cortex-M33 - Security foundation for billions of devices). It is now possible to design in security, from the smallest microcontrollers, with TrustZone for Cortex-M processors, to high performance applications processors, with TrustZone technology for Cortex-A processors.

TrustZone security enables separation of shared resources between trusted and non-trusted

At the heart of the TrustZone approach is the concept of secure and non-secure worlds that are hardware-separated from each other. Within the processor, software either resides in the secure world or the non-secure world; a switch between these two worlds is accomplished via software in Cortex-A processors (referred to as the secure monitor) and by hardware (core logic) in Cortex-M processors. This concept of secure (trusted) and non-secure (non-trusted) worlds extends beyond the CPU. It also covers memories, on-chip bus systems, interrupts, peripheral interfaces and software within an SoC.

TrustZone technology for Armv8-M processors (Cortex-M)

The Armv8-M architecture extends TrustZone technology to Cortex-M class systems, enabling robust levels of protection at all cost points. TrustZone for Cortex-M is used to protect firmware and peripherals, as well as providing isolation for secure boot, trusted update and root-of-trust implementations. The architecture maintains the deterministic real time response expected of embedded solutions. Context-switching between secure and non-secure worlds is done in hardware for faster transitions and greater power efficiency. There is no need for any secure monitor software since the processor itself performs the switching which reduces the memory foot print and dynamic power for code execution.

Before we move to programming, there are a few concepts that we need to cover first:

Security defined by address
Additional states
Cross-domain calls

Concept number 1: Security defined by address

The first key concept to grasp is that each address is associated with a specific security state. The processor uses a newly-introduced security attribution unit to check the security state of the address. A system-level interface may override that attribution based on the overall SoC design. After the state is selected, the address also passes through the appropriate memory protection unit, if present in the system.

security by address.png

Security defined by address

Concept number 2: Additional states

The second key concept is the presence of additional execution states. The Armv7-M and Armv6-M architecture defined two execution modes: handler mode and thread mode. Handler mode is privileged and may access all resources of the SoC, while thread mode could be either privileged or unprivileged. With the TrustZone security extension, the processor modes are mirrored to form secure state and non-secure state, each has a handler mode and a thread mode. The security states and processor modes are orthogonal, resulting in four combinations of states and modes. When running software in secure memories, the processor is automatically set to secure state. Likewise, when the processor is executing software in non-secure memories, the processor is automatically set to non-secure state. This design removes the need for any secure monitor software to manage the state switch which reduces the memory foot print and power consumption as detailed above.

Additional orthogonal states

Concept number 3: Cross domain calls

Armv8-M was designed for the Cortex-M profile with deterministic real time operations. Any function in any state may call any other function in the other state directly, as long as certain rules are respected based on the predefined security state entry points. Also, as expected, each state has a distinct set of stacks and stack pointers to those stacks in order to protect assets on the secure side. The function call overhead is dramatically reduced since there is no need for an API layer to manage the calls. Given predefined entry points, the call proceeds directly to the called function.

Cross domain calls

Simplified use case

The diagram below shows one simple use case where the user application and the I/O driver are in the non-secure state, whereas the system startup code and a communication stack are in the secure state. The user application calls into the communication stack to transmit and receive data whereas that stack will use the I/O driver in the non-secure state to transmit and receive over the interface.

Example software configuration taking advantage of the TrustZone Secure stateAs in all such systems:

Non-secure applications do not have access to secure resources unless going through properly-defined secure service function entry points
Secure firmware can access both secure and non-secure memories
Secure and non-secure code may implement independent time scheduling using different timers
Each interrupt line can be programmed to be secure or non-secure. The vector tables for secure and non-secure software are also separated.

Although the processor hardware provides essential protection for the secure software, secure software needs to be written carefully to ensure that the whole system is secure. Below are three key items that software developers should remember when creating secure software:

Utilise new ARM C Language Extension (ACLE) features
Validate non-trusted pointers
Design for asynchronous non-secure memory modifications

Tip number 1: Utilize new Arm C Language Extension features

TrustZone for Armv8-M introduced a few new instructions to support the security state switching. Instead of creating assembly wrappers to generate those instructions, software developers should utilize new compiler features defined in ARM C Language Extensions (ACLE) to allow software tools to understand the security usage of the functions and generate the best code required. The ACLE features are implemented by multiple compiler vendors, and hence, the code is portable.

For example, when creating a secure API that is callable from the non-secure state, a new function attribute called “cmse_nonsecure_entry” should be used in declaring the function. At the end of the function call in the secure state, the registers inside the processor may still contain secret information. With the correct function attribute, the compiler can automatically insert code to clear registers in R0-R3, R12 and Application Program Status Register (APSR) that might contain secret information, except if the registers are used to return the result to non-secure software. Register R4 to R11 are handled differently, as their contents should be unchanged at function boundaries. If their values are changed in the middle of the function the values should be restored to original values before returns to non-secure caller.

Tip number 2: Validate non-trusted pointers

There will be cases where the non-secure code may provide incorrect pointers by design to try to gain access to secure memory. To counter that possibility, a new instruction has been introduced in ARMv8-M, the Test Target (TT) instruction. The TT instructions return the security attributes of an address, so secure software can determine if a pointer is pointing to a secure or non-secure address.

To make the checking of pointers more efficient, there is an associated region number for each memory region defined by the security configuration. This region number can be utilized by software to determine if a contiguous range of memory has similar security attributes.

The TT instruction returns the security attributes and region number (as well as MPU region number) from an address value. By using a TT instruction on the start and end addresses of the memory range, and identifying that both reside in the same region number, software can quickly determine that the memory range (e.g. data array or data structure) is located entirely in non-secure space.

Checking pointers for valid region boundaries

Using this mechanism, secure code servicing APIs in to the secure side can determine if the memory referenced by a pointer from non-secure software has the appropriate security attribute for the API. This prevents non-secure software from using APIs in secure software to read out or corrupt secure information.

Tip number 3: Design for asynchronous non-secure memory modifications

Non-secure interrupt service routines could change non-secure data that is being processed by secure software. Thus, input data that has already been validated by the secure API can be changed by a non-secure ISR after the validation step. One way to avoid that situation is to make a local copy of that input data in secure memory and use the secure copy for processing (including validation of the input data) and avoid the read to non-secure memory. In the cases where such copying is not desired (e.g. when dealing with large amount of data in a specific memory region), then the alternative is to program the security attribution unit to make that memory region secure.

Summary

It is up to developers of software for the secure side to make sure that whole system is secure and that no secure data leaks to the non-secure side. In order to accomplish this, three key TrustZone concepts were outlined along with three key items thats secure software developers can use to help create secure systems. The ACLE techniques to protect data in registers of called functions. The TT instruction for validating pointers and finally, that developers need to keep in mind that the non-secure side may change data by interrupting the secure-side. To explore further, a selection of documents are available to help in developing secure firmware on Armv8-M processors.

0 comments
0 members are here

Architectures and Processors blog

Advancing server manageability on Arm Neoverse Compute Subsystem (CSS) with OpenBMC

Samer El-Haj-Mahmoud

Arm and 9elements Cyber Security have brought a prototype of OpenBMC to the Arm Neoverse Compute Subsystem (CSS) to advancing server manageability.
- January 28, 2025
Caches and Self-Modifying Code: Working with Threads

Jacob Bramley

How to synchronize JIT-compiled instructions across threads.
- January 21, 2025
Caches and Self-Modifying Code: Implementing `__clear_cache`

Jacob Bramley

How to implement `__clear_cache` using assembly.
- January 20, 2025

AI blog

Announcements

Architectures and Processors blog

Automotive blog

Embedded and Microcontrollers blog

Internet of Things (IoT) blog

Laptops and Desktops blog

Mobile, Graphics, and Gaming blog

Operating Systems blog

Servers and Cloud Computing blog

SoC Design and Simulation blog

Tools, Software and IDEs blog