Subtle memory bugs, including buffer overruns and pointer errors, create ticking time bombs inside your applications. Malicious actors can exploit these bugs to execute unauthorized code, take over systems to add them to malware botnets, or simply cause applications and systems to crash. The notorious Morris Worm of 1988 was one of the earliest examples of a malicious application exploiting a buffer overflow. Reports of exploitable memory safety issues arrive with alarming frequency, whether disclosed by security researchers or discovered in attacks already loose in the wild.

The impact on users can be substantial. Rogue applications can exploit unsafe memory to sniff out sensitive data, such as user credentials and passwords, and use it to reach higher levels of privilege in the system. This lets bad actors access confidential data or make the system part of a larger botnet. It’s not always outside forces that cause problems: sometimes unsafe memory results in unpredictable system crashes due to memory leaks and related issues, frustrating users. It’s estimated that two-thirds of all Android vulnerabilities stem from unsafe memory practices.

Arm Memory Tagging Extension

Software-based solutions, including AddressSanitizer (ASan), help mitigate these memory issues by integrating memory corruption detection into modern compilers. However, ASan requires adding software instrumentation to application code, which can significantly slow down app runtime and increase memory usage. That overhead is particularly problematic in mobile and embedded systems.

What’s needed is a solution to detect and minimize memory bugs with minimal impact on performance and memory use. A properly implemented hardware-based method for detecting potentially unsafe memory usage needs less memory and runs faster than software instrumentation, while improving system reliability and security.

Arm introduced its Memory Tagging Extension (MTE) as part of the Armv8.5 instruction set. MTE is now built into the Armv9-compliant CPUs recently announced by Arm, such as the Cortex-X2, Cortex-A710, and Cortex-A510, and future CPUs based on Armv9 will also integrate it. In all of these, memory tagging is a basic part of the architecture.

The idea behind memory tagging is pretty simple: add a small set of bits to chunks of memory to identify them as safe for application usage. Arm implements memory tagging as a two-part system, known as the lock and the key:

  • Address tagging. This adds four bits to the top of every pointer in the process. Address tagging only works with 64-bit applications since it uses top-byte-ignore, which is an Arm 64-bit feature. Address tags act as a virtual “key.”
  • Memory tagging. Memory tags also consist of four bits, but are associated with each aligned 16-byte region in the application’s memory space. Arm refers to these 16-byte regions as tag granules. These four bits aren’t used for application data and are stored separately. The memory tag is the “lock.”

A virtual address tag (key) must match the memory tag (lock). Otherwise, an error occurs.


Figure 1. An example of lock-and-key access to memory
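
The lock-and-key check can be modeled in plain C without MTE hardware. The sketch below is only an illustration: the function names, the 256-byte simulated region, and the tag table are hypothetical stand-ins, and real hardware performs this comparison inside load and store instructions rather than in software. It assumes the 4-bit address tag sits in pointer bits 59:56, inside the top byte that top-byte-ignore leaves out of address translation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define GRANULE_SIZE 16u  /* bytes covered by one memory tag */
#define TAG_SHIFT    56u  /* the 4-bit address tag occupies pointer bits 59:56 */
#define TAG_MASK     0xFu /* tags are four bits wide */
#define MEM_BYTES    256u /* size of the simulated region (hypothetical) */

/* One 4-bit lock per 16-byte granule, kept apart from application data. */
static uint8_t granule_tags[MEM_BYTES / GRANULE_SIZE];

/* Put a 4-bit key into the top byte of a 64-bit address. */
uint64_t set_address_tag(uint64_t addr, uint8_t tag) {
    uint64_t cleared = addr & ~((uint64_t)TAG_MASK << TAG_SHIFT);
    return cleared | ((uint64_t)(tag & TAG_MASK) << TAG_SHIFT);
}

/* Read the key back out of a tagged address. */
uint8_t get_address_tag(uint64_t tagged_addr) {
    return (uint8_t)((tagged_addr >> TAG_SHIFT) & TAG_MASK);
}

/* Drop the top byte, mimicking top-byte-ignore. */
uint64_t strip_top_byte(uint64_t tagged_addr) {
    return tagged_addr & 0x00FFFFFFFFFFFFFFull;
}

/* Set the lock on every granule that overlaps [offset, offset + len). */
void set_memory_tag(uint64_t offset, size_t len, uint8_t tag) {
    if (len == 0) return;
    assert(offset + len <= MEM_BYTES);
    uint64_t first = offset / GRANULE_SIZE;
    uint64_t last = (offset + len - 1) / GRANULE_SIZE;
    for (uint64_t g = first; g <= last; g++) {
        granule_tags[g] = (uint8_t)(tag & TAG_MASK);
    }
}

/* The comparison a tagged load or store performs: key must equal lock. */
bool access_allowed(uint64_t tagged_addr) {
    uint64_t offset = strip_top_byte(tagged_addr);
    assert(offset < MEM_BYTES);
    return get_address_tag(tagged_addr) == granule_tags[offset / GRANULE_SIZE];
}
```

Tagging a 32-byte object at offset 0x20 with tag 5 and building a pointer with the same tag lets accesses anywhere inside the object pass, while stepping one byte past the end lands in a granule with a different lock and fails the check. That is how a linear buffer overrun gets caught.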

Because the address tag must match the memory tag, the first thing you might notice is that four bits allow only 16 distinct tag values. This makes MTE a probabilistic mechanism: a key can match a lock it was never meant for purely by chance. With uniformly random tags that is a 1-in-16 chance (6.25%), consistent with the figure of less than 8% that Arm cites.
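
The arithmetic behind a chance match is easy to sketch. The helper below is hypothetical and assumes tags are uniformly random and that each occurrence of a bug draws its tags independently, which real allocators only approximate. Under those assumptions it returns the probability that a bad access slips through undetected every time it occurs.

```c
#include <assert.h>

/* Probability that an invalid access goes undetected on every one of
 * `occurrences` independent tries, given `tag_bits` bits of tag.
 * Assumes uniformly random, independent tags (a simplification). */
double miss_probability(unsigned tag_bits, unsigned occurrences) {
    assert(tag_bits < 32);
    double per_access = 1.0 / (double)(1u << tag_bits);
    double p = 1.0;
    for (unsigned i = 0; i < occurrences; i++) {
        p *= per_access;
    }
    return p;
}
```

With four tag bits, a single occurrence is missed with probability 1/16, and the chance of missing the same bug twice in a row falls to 1/256. This is why repeated runs, or many devices, make detection increasingly likely.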

Because tags are assigned and released constantly as memory is allocated and freed, memory allocators work to make sure that the tags on sequential allocations always differ. MTE supports random tag generation as well. The combination of an allocator that keeps sequential tags different and the random tag generation feature means the actual frequency of tag clashes is quite low. Furthermore, running MTE across a fleet of millions (or billions) of devices can provide robust error detection for system and application software.
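
One way an allocator might combine the two ideas is to draw a random tag while excluding the tags of the neighboring allocations. The sketch below is an assumption-laden illustration, not Arm's or any real allocator's implementation: `pick_tag` and `tag_for_new_block` are hypothetical names, and the caller supplies the random value so that the behavior is easy to test.

```c
#include <assert.h>
#include <stdint.h>

#define NUM_TAGS 16u /* four tag bits give 16 possible values */

/* Return a tag in 0..15 whose bit is NOT set in exclude_mask.
 * random_value selects uniformly among the remaining tags. */
uint8_t pick_tag(uint16_t exclude_mask, uint32_t random_value) {
    unsigned allowed = 0;
    for (unsigned t = 0; t < NUM_TAGS; t++) {
        if (!(exclude_mask & (1u << t))) allowed++;
    }
    assert(allowed > 0); /* at least one tag must remain available */

    unsigned index = random_value % allowed;
    for (unsigned t = 0; t < NUM_TAGS; t++) {
        if (exclude_mask & (1u << t)) continue;
        if (index == 0) return (uint8_t)t;
        index--;
    }
    return 0; /* not reached while allowed > 0 */
}

/* Choose a tag for a new block that differs from both neighbors. */
uint8_t tag_for_new_block(uint8_t left_tag, uint8_t right_tag,
                          uint32_t random_value) {
    uint16_t exclude = (uint16_t)((1u << (left_tag & 0xFu)) |
                                  (1u << (right_tag & 0xFu)));
    return pick_tag(exclude, random_value);
}
```

Because the neighbors' tags are excluded up front, an overrun from one block into the next is always caught in this model, while the random choice among the remaining tags keeps matches with more distant or stale allocations unlikely.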

Underlying Architecture

Armv8.5 and Armv9 implement a new memory type, which Arm dubs Normal Tagged Memory. The CPU can determine the safety of a memory access by comparing an address tag to the corresponding memory tag. Developers can choose whether a tag mismatch raises a synchronous exception or is reported asynchronously, which allows the application to continue. Figure 2 shows how MTE is implemented in Arm CPU designs.


Figure 2. Arm Total Compute Solution (Armv9)

Asynchronous mismatch details accumulate in a system register. This means the OS can isolate mismatches to specific execution threads and make decisions based on ongoing operations.

Synchronous exceptions can directly identify the specific load or store instruction causing a tag mismatch. Arm added a variety of new instructions to the instruction set to manipulate tags, handle pointer and stack tagging, and support low-level system use.
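
The difference between the two reporting modes can be modeled in a few lines of C. This is a simplified sketch with hypothetical names (`tag_checker`, `checked_access`, `take_async_fault`): a boolean stands in for the system register that accumulates asynchronous mismatches, and a return value stands in for the exception that hardware would raise.

```c
#include <stdbool.h>
#include <stdint.h>

typedef enum { CHECK_SYNC, CHECK_ASYNC } check_mode;

typedef struct {
    check_mode mode;
    bool async_fault_pending; /* stands in for the sticky system-register flag */
    bool sync_fault_raised;   /* stands in for a synchronous exception */
    uint64_t fault_address;   /* known precisely only in synchronous mode */
} tag_checker;

/* Returns true if the access is allowed to complete. */
bool checked_access(tag_checker *tc, uint64_t addr, uint8_t key, uint8_t lock) {
    if (key == lock) return true;

    if (tc->mode == CHECK_SYNC) {
        /* Synchronous: stop at the faulting access and record where it was. */
        tc->sync_fault_raised = true;
        tc->fault_address = addr;
        return false;
    }

    /* Asynchronous: note that a mismatch happened and keep running. */
    tc->async_fault_pending = true;
    return true;
}

/* What an OS might do later, for example on a context switch:
 * read the accumulated flag and clear it. */
bool take_async_fault(tag_checker *tc) {
    bool pending = tc->async_fault_pending;
    tc->async_fault_pending = false;
    return pending;
}
```

In this model the synchronous mode gives up the access but keeps the exact faulting address, which suits debugging. The asynchronous mode lets the access proceed and only records that something went wrong, trading precision for lower overhead.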

Implementing Arm MTE

MTE is handled in hardware: load and store instructions have been modified to verify that the address tag matches the memory tag, and the hardware supports random generation of the address and memory tags assigned when memory is allocated. This has differing implications for OS developers and end-user application programmers.

Arm enhanced its AMBA 5 coherent interconnect to support MTE. Tag check logic is typically built into the system-level cache, with tag checking and tag caching occurring ahead of the DRAM interface. Figure 3 shows an example block diagram.


Figure 3: Example block diagram showing how MTE might be implemented in an SoC design. (Source: Arm)

Operating systems must be modified in order to fully support MTE. Arm initially prototyped MTE by creating a version of the Linux kernel which implemented tags. Google has expressed its intent to add MTE to Android and is working with SoC developers to ensure compatibility.

End-user application developers have it a bit easier, assuming the operating system supports MTE. Because MTE works behind the scenes in the OS and hardware, applications require no source code modifications, and tagging heap memory requires no extra effort. Tagging stack memory, however, requires compiler support, so existing binaries need to be recompiled. This is straightforward since mobile app developers frequently push out updates anyway. Figure 4 shows the software development timeline when implementing MTE.


Figure 4: Software development timeline with MTE

Ensuring memory is protected may require aligning memory objects to the tag granule (16-byte alignment), so that no two objects share a granule and its tag. This can increase stack and memory utilization, though the impact seems to be fairly minimal.
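
The cost of granule alignment is simple to estimate. The sketch below uses hypothetical helper names and assumes each object is padded up to a whole number of 16-byte granules, which is one straightforward way to keep two objects from sharing a tag.

```c
#include <stddef.h>

#define TAG_GRANULE 16u /* bytes per tag granule */

/* Round an object size up to the next multiple of the tag granule. */
size_t round_up_to_granule(size_t size) {
    return (size + (TAG_GRANULE - 1)) & ~(size_t)(TAG_GRANULE - 1);
}

/* Bytes of padding added so the object fills whole granules. */
size_t padding_overhead(size_t size) {
    return round_up_to_granule(size) - size;
}
```

A 20-byte object grows to 32 bytes (12 bytes of padding), while a 16-byte object needs none. The padding never exceeds 15 bytes per object, so the relative overhead shrinks as objects get larger.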

Why Use Arm MTE?

MTE offers several quality-of-life improvements for developers. It allows programmers to find memory-related bugs quickly, speeding up application debugging and development. Since memory bugs can be found and quashed sooner, issues such as memory leaks, memory race conditions, and other memory-related crashes become less frequent. This in turn improves the end-user experience.

Memory safety bugs account for about two-thirds of all Common Vulnerabilities and Exposures (CVE) entries, so catching them early with MTE allows companies to ship applications faster and with fewer bugs. End users are often reluctant to upgrade to new hardware or operating system software, but MTE gives them tangible reasons to upgrade, including improved stability and overall security.

Further Information

You can find more detailed information on Arm’s memory tagging extensions in a variety of sources.
