Future Architecture Technologies are features being developed for currently unreleased versions of the Arm architecture. Arm provides the ecosystem with relevant information and specifications in advance, so that software support is in place when new technologies are realized in hardware. This blog introduces two future technologies: POE2 and the Virtual Memory Tagging Extension (vMTE).
Information on new instructions and registers is already available. Fixed virtual platforms and full specifications will become available in the coming months.
Modern applications are often composed of code from multiple sources, for example:
There may be a need to provide isolation (sandboxing) between different components of the application. One way to achieve this isolation is to decompose the application into multiple processes, with inter-process calls (IPC) allowing the components to communicate with one another.
However, using multiple processes to provide isolation comes at a cost. IPCs introduce a performance overhead, and each extra process has a memory footprint associated with it.
POE2 aims to provide comparable isolation within a single process, removing the overhead associated with IPCs and the additional memory costs.
POE2 builds on the existing permission model in the Arm architecture, adding spatial and temporal controls.
Each page of virtual memory is labelled with a “permissions index”. Permissions are applied based on the permissions index:
These relationships are programmed by the kernel, in in-memory tables, and applied by hardware.
The following diagram shows a simplified example:
The region Data is allocated with a base permission of read/write. Different overlays are applied to Data’s base permission depending on where it is accessed from and what context is running. Code 1 sees the full base permission. However, the overlay applied when Code 2 executes restricts the permission to read-only.
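The effect of an overlay on a base permission can be modelled in a few lines of Python. This is a minimal sketch rather than the architectural definition: it assumes that an overlay can only remove rights, so the effective permission is the intersection of the base permission and the overlay. The names `Perm` and `effective`, and the overlay values, are illustrative and do not come from the specification.

```python
from enum import Flag


class Perm(Flag):
    """Illustrative permission bits (not an architectural encoding)."""
    NONE = 0
    R = 1
    W = 2
    X = 4


def effective(base: Perm, overlay: Perm) -> Perm:
    # Assumption: an overlay can only restrict, so intersect with the base.
    return base & overlay


# Base permission of the Data region, as in the example above.
DATA_BASE = Perm.R | Perm.W

# Hypothetical overlays selected while each piece of code is executing.
OVERLAY_CODE1 = Perm.R | Perm.W | Perm.X  # leaves the base permission intact
OVERLAY_CODE2 = Perm.R                    # restricts Data to read-only

perm_code1 = effective(DATA_BASE, OVERLAY_CODE1)
perm_code2 = effective(DATA_BASE, OVERLAY_CODE2)
```

In this model `perm_code1` keeps both read and write, while `perm_code2` keeps only read, even though the base permission of the region never changes.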
Overlays are stored in two-dimensional tables, with separate tables for instruction (IRT) and data (DPOT) permissions. This is illustrated below:
The IRT is indexed using the permission index for the instruction’s address (spatial) and the value of the TINDEX_ELx register (temporal). The table returns the execution permission overlay and the POTIndex. If the instruction accesses memory, the DPOT is indexed using the permission index for the data address (spatial) and the POTIndex from the IRT lookup (temporal).
As well as providing an overlay to traditional permissions, the IRT provides an FGDTIndex. This allows POE2 to restrict which classes of instruction are permitted to execute, for example prohibiting system calls.
These overlays allow the same location to be seen with different permissions depending on where it is accessed from and what context is executing.
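The two-stage lookup can be sketched as a pair of dictionaries keyed by a (spatial, temporal) tuple. This is a toy model under stated assumptions: the table contents, the `IrtEntry` type, the `data_overlay` helper, and the choice to treat a missing entry as "no permission" are all hypothetical, and the real table formats are defined by the specification, not by this sketch.

```python
from typing import NamedTuple, Optional, Set


class IrtEntry(NamedTuple):
    exec_allowed: bool  # execution permission overlay, simplified to a bool
    pot_index: int      # selects the DPOT column used for data accesses


# IRT[(instruction permission index, TINDEX value)] -> IrtEntry
IRT = {
    (0, 0): IrtEntry(True, 0),
    (1, 0): IrtEntry(True, 1),
    (1, 1): IrtEntry(False, 1),  # this code may not execute in context 1
}

# DPOT[(data permission index, POTIndex)] -> data overlay as a set of rights
DPOT = {
    (5, 0): {"r", "w"},
    (5, 1): {"r"},
}


def data_overlay(instr_pi: int, tindex: int, data_pi: int) -> Optional[Set[str]]:
    """Return the data overlay for an access, or None if execution is denied."""
    entry = IRT.get((instr_pi, tindex))
    if entry is None or not entry.exec_allowed:
        return None
    # Second stage: the POTIndex from the IRT acts as the temporal index.
    return DPOT.get((data_pi, entry.pot_index), set())
```

With these example tables, the same data page (permission index 5) yields a read/write overlay when accessed from code with permission index 0 and a read-only overlay when accessed from code with permission index 1.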
Let’s now apply POE2 to a simplified real-world example: a web browser executing JITed code from a website.
Typically, the browser will want to sandbox the JITed code as it comes from a potentially untrusted source. Today, the browser could provide that isolation by using multiple processes. POE2 allows us to achieve the same result within a single process.
In this example, a region of memory is allocated to store the JIT output. That region needs to be readable and writable by the JIT compiler. However, when the JIT output is executing, the same region should appear read-only and executable. The same result could be achieved by modifying the base permission after running the JIT compiler, but that would typically require a system call and a Translation Lookaside Buffer (TLB) invalidation, which can be expensive. POE2 enables the effective permission to vary without the need for the system call.
Similarly, there is memory allocated to store browser state. The statically compiled browser code is permitted to read and write this state. However, when the JIT output is running, we want to limit which portions of the state are visible and restrict access to read-only. POE2 permits different overlays to be applied based on what is currently executing.
The JITed code might call into common libraries or frameworks. POE2 allows the overlay, and hence the effective permission, to vary according to where the library was called from.
Finally, in this example, the JITed code is prohibited from making any system calls to the OS.
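The browser scenario can be written down as a small policy table to make the intent concrete. This is only an illustration: the region and context names, the assumption that the JIT buffer's base permission includes read, write and execute, and the `allowed` and `may_syscall` helpers are all hypothetical. It also assumes, as a simplification, that the effective permission is the intersection of base permission and overlay.

```python
# Base permissions per region (assumed values for this scenario).
BASE = {
    "jit_buffer": {"r", "w", "x"},
    "browser_state": {"r", "w"},
}

# What each executing context sees, plus whether system calls are permitted.
CONTEXTS = {
    "browser": {
        "overlays": {"jit_buffer": {"r", "w"}, "browser_state": {"r", "w"}},
        "syscalls_allowed": True,
    },
    "jit_output": {
        "overlays": {"jit_buffer": {"r", "x"}, "browser_state": {"r"}},
        "syscalls_allowed": False,
    },
}


def allowed(context: str, region: str, right: str) -> bool:
    """Check one right, assuming effective = base intersected with overlay."""
    overlay = CONTEXTS[context]["overlays"].get(region, set())
    return right in (BASE[region] & overlay)


def may_syscall(context: str) -> bool:
    return CONTEXTS[context]["syscalls_allowed"]
```

In this model the JIT compiler, running as part of the browser, can write the JIT buffer but not execute it, while the JIT output can execute the buffer but not write it, and switching between the two views involves no change to the base permission.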
Arm introduced the Memory Tagging Extension (MTE) in Armv8.5-A, which enables developers to efficiently detect memory safety violations. MTE has support in both Android and iOS.
The Virtual Memory Tagging Extension builds on this foundation, but changes how tag storage is allocated and managed. Virtual MTE brings the benefits of MTE to systems where the OS cannot manage fixed memory carveouts, or to servers where the carveout is too expensive.
MTE works by associating a tag with memory locations and pointers. When a location is accessed, the tag in the pointer is compared against the tag associated with the location. If the tags match, the access is permitted. If the tags do not match, a potential memory safety violation has been detected.
MTE allows efficient detection of memory safety violations such as use-after-free and buffer overruns, with controls to specify which accesses are checked and how tag check failures are reported.
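The tag check described above can be modelled in software. The sketch below relies on the MTE layout of a 4-bit tag held in pointer bits [59:56] and one allocation tag per 16-byte granule. Everything else is a simplification: the helper names are hypothetical, tags are chosen by hand instead of randomly, and untagged memory is treated as tag 0.

```python
TAG_SHIFT = 56   # logical tag lives in pointer bits [59:56]
TAG_MASK = 0xF   # tags are 4 bits wide
GRANULE = 16     # one allocation tag per 16-byte granule

memory_tags = {}  # granule number -> allocation tag (toy tag storage)


def set_tags(addr: int, size: int, tag: int) -> None:
    """Tag every granule covered by [addr, addr + size)."""
    first = addr // GRANULE
    last = (addr + size + GRANULE - 1) // GRANULE
    for granule in range(first, last):
        memory_tags[granule] = tag


def tagged_pointer(addr: int, tag: int) -> int:
    return addr | (tag << TAG_SHIFT)


def check(ptr: int) -> bool:
    """True if the pointer's tag matches the tag of the granule it targets."""
    tag = (ptr >> TAG_SHIFT) & TAG_MASK
    addr = ptr & ((1 << TAG_SHIFT) - 1)
    return memory_tags.get(addr // GRANULE, 0) == tag


# "Allocate" 32 bytes at 0x1000 with tag 3.
set_tags(0x1000, 32, 3)
ptr = tagged_pointer(0x1000, 3)

in_bounds_ok = check(ptr + 16)   # still inside the allocation
overrun_ok = check(ptr + 32)     # one granule past the end

# "Free" the block by retagging it; the stale pointer keeps tag 3.
set_tags(0x1000, 32, 7)
use_after_free_ok = check(ptr)
```

Here the in-bounds access passes the check, while the overrun and the access through the stale pointer both fail it, which corresponds to the buffer overrun and use-after-free cases mentioned above.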
In the original MTE, tags are associated with a physical location. While there are several different possible implementation styles, typically MTE requires that the tag storage is allocated upfront, for example as a memory carveout.
The Virtual Memory Tagging Extension changes this so that the tag is associated with the virtual address instead. As a consequence, tag storage is also virtually allocated, with storage needed only for those pages that are tagged, giving a more scalable deployment.
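The difference in storage cost can be illustrated with some simple arithmetic. With 4 bits of tag per 16-byte granule, tag storage is 1/32 of the memory it covers. The sketch below compares an upfront carveout sized for all of physical memory with a toy store that creates tag storage only when a page is first tagged. The 4 KiB page size, the `VirtualTagStore` class and its layout are assumptions for illustration, not a description of how vMTE organizes tag storage.

```python
TAG_BITS = 4
GRANULE = 16
PAGE = 4096  # assumed page size


def carveout_bytes(physical_memory_bytes: int) -> int:
    """Tag storage reserved upfront to cover all of physical memory."""
    return physical_memory_bytes // GRANULE * TAG_BITS // 8


class VirtualTagStore:
    """Toy model: tag storage exists only for pages that have been tagged."""

    def __init__(self) -> None:
        self.pages = {}  # virtual page number -> bytearray of tags

    def set_tag(self, vaddr: int, tag: int) -> None:
        vpn = vaddr // PAGE
        if vpn not in self.pages:
            self.pages[vpn] = bytearray(PAGE // GRANULE)
        self.pages[vpn][(vaddr % PAGE) // GRANULE] = tag

    def storage_bytes(self) -> int:
        # Count 4 bits per granule, although this model stores a byte each.
        return len(self.pages) * (PAGE // GRANULE) * TAG_BITS // 8


upfront = carveout_bytes(8 * 2**30)  # system with 8 GiB of memory

store = VirtualTagStore()
store.set_tag(0x7000_0000_1000, 5)
store.set_tag(0x7000_0000_1010, 6)  # same page, so no extra storage
on_demand = store.storage_bytes()
```

In this model the carveout for an 8 GiB system is 256 MiB regardless of how much memory is actually tagged, whereas the on-demand store needs 128 bytes for the single tagged page.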
The diagram below illustrates the difference between physical and virtual tagging:
The programmer’s model seen by applications remains the same whether using physical or virtual tagging, allowing applications to be agnostic. Software that is involved in allocating and managing memory does need to be aware of which type of MTE is implemented on the platform.
This blog provides a brief summary of two future architecture technologies, POE2 and vMTE. We continue to work with our ecosystem to ensure these features are quickly enabled in software and available in future processors.
For deeper technical details on POE2 and vMTE, visit our Arm Developer website.