Commit df9d7a2

fvincenzo authored and ctmarinas committed
arm64: mte: Add Memory Tagging Extension documentation
Memory Tagging Extension (part of the ARMv8.5 Extensions) provides a
mechanism to detect the sources of memory related errors which may be
vulnerable to exploitation, including bounds violations, use-after-free,
use-after-return, use-out-of-scope and use before initialization errors.

Add Memory Tagging Extension documentation for the arm64 linux kernel
support.

Signed-off-by: Vincenzo Frascino <vincenzo.frascino@arm.com>
Co-developed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
Cc: Will Deacon <will@kernel.org>
1 parent 89b94df commit df9d7a2

4 files changed

Lines changed: 312 additions & 0 deletions

Documentation/arm64/cpu-feature-registers.rst

Lines changed: 2 additions & 0 deletions
@@ -175,6 +175,8 @@ infrastructure:
      +------------------------------+---------+---------+
      | Name                         | bits    | visible |
      +------------------------------+---------+---------+
+     | MTE                          | [11-8]  | y       |
+     +------------------------------+---------+---------+
      | SSBS                         | [7-4]   | y       |
      +------------------------------+---------+---------+
      | BT                           | [3-0]   | y       |

Documentation/arm64/elf_hwcaps.rst

Lines changed: 4 additions & 0 deletions
@@ -240,6 +240,10 @@ HWCAP2_BTI

     Functionality implied by ID_AA64PFR0_EL1.BT == 0b0001.

+HWCAP2_MTE
+
+    Functionality implied by ID_AA64PFR1_EL1.MTE == 0b0010, as described
+    by Documentation/arm64/memory-tagging-extension.rst.

 4. Unused AT_HWCAP bits
 -----------------------

Documentation/arm64/index.rst

Lines changed: 1 addition & 0 deletions
@@ -14,6 +14,7 @@ ARM64 Architecture
     hugetlbpage
     legacy_instructions
     memory
+    memory-tagging-extension
     perf
     pointer-authentication
     silicon-errata

Documentation/arm64/memory-tagging-extension.rst

Lines changed: 305 additions & 0 deletions
@@ -0,0 +1,305 @@
===============================================
Memory Tagging Extension (MTE) in AArch64 Linux
===============================================

Authors: Vincenzo Frascino <vincenzo.frascino@arm.com>
         Catalin Marinas <catalin.marinas@arm.com>

Date: 2020-02-25

This document describes the provision of the Memory Tagging Extension
functionality in AArch64 Linux.

Introduction
============

ARMv8.5 based processors introduce the Memory Tagging Extension (MTE)
feature. MTE is built on top of the ARMv8.0 virtual address tagging TBI
(Top Byte Ignore) feature and allows software to access a 4-bit
allocation tag for each 16-byte granule in the physical address space.
Such memory range must be mapped with the Normal-Tagged memory
attribute. A logical tag is derived from bits 59-56 of the virtual
address used for the memory access. A CPU with MTE enabled will compare
the logical tag against the allocation tag and potentially raise an
exception on mismatch, subject to system registers configuration.
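
As a rough illustration of the tag layout described above, the sketch
below extracts and replaces the logical tag held in bits 59-56 of an
address and rounds an address down to its 16-byte granule. The macro and
function names are hypothetical helpers, not kernel or libc interfaces,
and the code only manipulates integer values; it does not read or write
allocation tags.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names; the bit positions come from the text above. */
#define MTE_TAG_SHIFT    56                     /* logical tag: bits 59-56 */
#define MTE_TAG_MASK     (UINT64_C(0xf) << MTE_TAG_SHIFT)
#define MTE_GRANULE_SIZE UINT64_C(16)           /* one allocation tag per granule */

/* Extract the 4-bit logical tag from a virtual address. */
static inline unsigned int mte_logical_tag(uint64_t addr)
{
    return (unsigned int)((addr & MTE_TAG_MASK) >> MTE_TAG_SHIFT);
}

/* Return addr with its logical tag replaced by tag (0..15). */
static inline uint64_t mte_with_tag(uint64_t addr, unsigned int tag)
{
    assert(tag <= 0xf);
    return (addr & ~MTE_TAG_MASK) | ((uint64_t)tag << MTE_TAG_SHIFT);
}

/* Round addr down to the start of its 16-byte granule; tag bits are kept. */
static inline uint64_t mte_granule_base(uint64_t addr)
{
    return addr & ~(MTE_GRANULE_SIZE - 1);
}
```

For example, an address whose top byte is ``0x0a`` carries logical tag
``0xa``, and two addresses in the same 16-byte granule share a single
allocation tag.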

Userspace Support
=================

When ``CONFIG_ARM64_MTE`` is selected and Memory Tagging Extension is
supported by the hardware, the kernel advertises the feature to
userspace via ``HWCAP2_MTE``.

PROT_MTE
--------

To access the allocation tags, a user process must enable the Tagged
memory attribute on an address range using a new ``prot`` flag for
``mmap()`` and ``mprotect()``:

``PROT_MTE`` - Pages allow access to the MTE allocation tags.

The allocation tag is set to 0 when such pages are first mapped in the
user address space and preserved on copy-on-write. ``MAP_SHARED`` is
supported and the allocation tags can be shared between processes.

**Note**: ``PROT_MTE`` is only supported on ``MAP_ANONYMOUS`` and
RAM-based file mappings (``tmpfs``, ``memfd``). Passing it to other
types of mapping will result in ``-EINVAL`` returned by these system
calls.

**Note**: The ``PROT_MTE`` flag (and corresponding memory type) cannot
be cleared by ``mprotect()``.

**Note**: ``madvise()`` memory ranges with ``MADV_DONTNEED`` and
``MADV_FREE`` may have the allocation tags cleared (set to 0) at any
point after the system call.
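
A minimal sketch of passing ``PROT_MTE`` directly to ``mmap()`` on an
anonymous mapping, falling back to an untagged mapping when the kernel
does not advertise ``HWCAP2_MTE``. The helper names ``mte_available()``
and ``map_maybe_tagged()`` are hypothetical, and the fallback policy is
an assumption made for illustration rather than a recommendation from
this document.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/auxv.h>
#include <sys/mman.h>

/* Defined only if the libc headers do not already provide them. */
#ifndef PROT_MTE
#define PROT_MTE   0x20          /* arch/arm64/include/uapi/asm/mman.h */
#endif
#ifndef HWCAP2_MTE
#define HWCAP2_MTE (1UL << 18)   /* arch/arm64/include/uapi/asm/hwcap.h */
#endif

/* Hypothetical helper: 1 if the kernel advertises MTE, 0 otherwise. */
static int mte_available(void)
{
#if defined(__aarch64__)
    return (getauxval(AT_HWCAP2) & HWCAP2_MTE) != 0;
#else
    return 0;                    /* HWCAP2 bit 18 only means MTE on arm64 */
#endif
}

/*
 * Hypothetical helper: map len bytes of private anonymous memory,
 * requesting PROT_MTE when available. Returns NULL on failure.
 */
static void *map_maybe_tagged(size_t len)
{
    int prot = PROT_READ | PROT_WRITE;
    void *p;

    assert(len > 0);
    if (mte_available())
        prot |= PROT_MTE;

    p = mmap(NULL, len, prot, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    return p == MAP_FAILED ? NULL : p;
}
```

Either way the freshly mapped page reads as zero through an untagged
pointer, since the allocation tags start at 0.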

Tag Check Faults
----------------

When ``PROT_MTE`` is enabled on an address range and a mismatch between
the logical and allocation tags occurs on access, there are three
configurable behaviours:

- *Ignore* - This is the default mode. The CPU (and kernel) ignores the
  tag check fault.

- *Synchronous* - The kernel raises a ``SIGSEGV`` synchronously, with
  ``.si_code = SEGV_MTESERR`` and ``.si_addr = <fault-address>``. The
  memory access is not performed. If ``SIGSEGV`` is ignored or blocked
  by the offending thread, the containing process is terminated with a
  ``coredump``.

- *Asynchronous* - The kernel raises a ``SIGSEGV``, in the offending
  thread, asynchronously following one or multiple tag check faults,
  with ``.si_code = SEGV_MTEAERR`` and ``.si_addr = 0`` (the faulting
  address is unknown).
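
The two signal variants above can be told apart from ``si_code``. The
sketch below is one hedged way to do so; the enum and function names are
hypothetical, and the numeric fallbacks for ``SEGV_MTEAERR`` and
``SEGV_MTESERR`` are taken from ``include/uapi/asm-generic/siginfo.h``
and only used when the libc headers do not define them.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <signal.h>
#include <stddef.h>

/* Fallback values, used only if the libc headers lack the definitions. */
#ifndef SEGV_MTEAERR
#define SEGV_MTEAERR 8   /* asynchronous tag check fault */
#endif
#ifndef SEGV_MTESERR
#define SEGV_MTESERR 9   /* synchronous tag check fault */
#endif

/* Hypothetical classification of a delivered signal. */
enum mte_fault { MTE_FAULT_NONE, MTE_FAULT_SYNC, MTE_FAULT_ASYNC };

/*
 * Classify a siginfo_t as received by an SA_SIGINFO handler.
 * For MTE_FAULT_SYNC, info->si_addr is the faulting address;
 * for MTE_FAULT_ASYNC, si_addr is 0 and carries no information.
 */
static enum mte_fault mte_classify(const siginfo_t *info)
{
    assert(info != NULL);
    if (info->si_signo != SIGSEGV)
        return MTE_FAULT_NONE;
    if (info->si_code == SEGV_MTESERR)
        return MTE_FAULT_SYNC;
    if (info->si_code == SEGV_MTEAERR)
        return MTE_FAULT_ASYNC;
    return MTE_FAULT_NONE;
}
```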

The user can select the above modes, per thread, using the
``prctl(PR_SET_TAGGED_ADDR_CTRL, flags, 0, 0, 0)`` system call where
``flags`` contain one of the following values in the ``PR_MTE_TCF_MASK``
bit-field:

- ``PR_MTE_TCF_NONE`` - *Ignore* tag check faults
- ``PR_MTE_TCF_SYNC`` - *Synchronous* tag check fault mode
- ``PR_MTE_TCF_ASYNC`` - *Asynchronous* tag check fault mode

The current tag check fault mode can be read using the
``prctl(PR_GET_TAGGED_ADDR_CTRL, 0, 0, 0, 0)`` system call.
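
A small sketch of working with the ``PR_MTE_TCF_MASK`` bit-field: one
helper decodes the mode from a control word such as the value returned
by ``prctl(PR_GET_TAGGED_ADDR_CTRL, 0, 0, 0, 0)``, the other swaps the
mode while leaving the remaining bits untouched so the result could be
passed back to ``prctl(PR_SET_TAGGED_ADDR_CTRL, ...)``. The helper names
are hypothetical; the constants mirror those listed in the example at
the end of this document.

```c
#include <assert.h>

/* From include/uapi/linux/prctl.h; defined only if not already present. */
#ifndef PR_TAGGED_ADDR_ENABLE
# define PR_TAGGED_ADDR_ENABLE (1UL << 0)
#endif
#ifndef PR_MTE_TCF_SHIFT
# define PR_MTE_TCF_SHIFT 1
# define PR_MTE_TCF_NONE  (0UL << PR_MTE_TCF_SHIFT)
# define PR_MTE_TCF_SYNC  (1UL << PR_MTE_TCF_SHIFT)
# define PR_MTE_TCF_ASYNC (2UL << PR_MTE_TCF_SHIFT)
# define PR_MTE_TCF_MASK  (3UL << PR_MTE_TCF_SHIFT)
#endif

/* Hypothetical helper: replace the tag check fault mode in ctrl. */
static unsigned long mte_ctrl_set_mode(unsigned long ctrl, unsigned long mode)
{
    assert((mode & ~PR_MTE_TCF_MASK) == 0);
    return (ctrl & ~PR_MTE_TCF_MASK) | mode;
}

/* Hypothetical helper: human-readable name of the mode held in ctrl. */
static const char *mte_ctrl_mode_name(unsigned long ctrl)
{
    switch (ctrl & PR_MTE_TCF_MASK) {
    case PR_MTE_TCF_NONE:
        return "none";
    case PR_MTE_TCF_SYNC:
        return "sync";
    case PR_MTE_TCF_ASYNC:
        return "async";
    default:
        return "unknown";
    }
}
```

Reading the current value first and modifying only the mode field avoids
accidentally clearing ``PR_TAGGED_ADDR_ENABLE`` or the tag mask when
switching modes.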

Tag checking can also be disabled for a user thread by setting the
``PSTATE.TCO`` bit with ``MSR TCO, #1``.

**Note**: Signal handlers are always invoked with ``PSTATE.TCO = 0``,
irrespective of the interrupted context. ``PSTATE.TCO`` is restored on
``sigreturn()``.

**Note**: There are no *match-all* logical tags available for user
applications.

**Note**: Kernel accesses to the user address space (e.g. ``read()``
system call) are not checked if the user thread tag checking mode is
``PR_MTE_TCF_NONE`` or ``PR_MTE_TCF_ASYNC``. If the tag checking mode is
``PR_MTE_TCF_SYNC``, the kernel makes a best effort to check its user
address accesses, however it cannot always guarantee it.

Excluding Tags in the ``IRG``, ``ADDG`` and ``SUBG`` instructions
-----------------------------------------------------------------

The architecture allows excluding certain tags from being randomly
generated via the ``GCR_EL1.Exclude`` register bit-field. By default,
Linux excludes all tags other than 0. A user thread can enable specific
tags in the randomly generated set using the
``prctl(PR_SET_TAGGED_ADDR_CTRL, flags, 0, 0, 0)`` system call where
``flags`` contains the tags bitmap in the ``PR_MTE_TAG_MASK`` bit-field.

**Note**: The hardware uses an exclude mask but the ``prctl()``
interface provides an include mask. An include mask of ``0`` (exclusion
mask ``0xffff``) results in the CPU always generating tag ``0``.
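
The relationship between the two masks can be sketched as follows. The
helper names are hypothetical; the first function computes the 16-bit
exclusion mask that corresponds to a given include mask, and the second
places an include mask into the ``PR_MTE_TAG_MASK`` bit-field of the
``prctl()`` flags.

```c
#include <assert.h>

/* From include/uapi/linux/prctl.h; defined only if not already present. */
#ifndef PR_MTE_TAG_SHIFT
# define PR_MTE_TAG_SHIFT 3
# define PR_MTE_TAG_MASK  (0xffffUL << PR_MTE_TAG_SHIFT)
#endif

/* Hypothetical helper: exclusion mask implied by a 16-bit include mask. */
static inline unsigned int mte_exclude_mask(unsigned int include)
{
    assert(include <= 0xffffu);
    return ~include & 0xffffu;
}

/* Hypothetical helper: include mask positioned for the prctl() flags. */
static inline unsigned long mte_include_flags(unsigned int include)
{
    assert(include <= 0xffffu);
    return ((unsigned long)include << PR_MTE_TAG_SHIFT) & PR_MTE_TAG_MASK;
}
```

An include mask of ``0xfffe`` (all non-zero tags) therefore corresponds
to an exclusion mask of ``0x0001``, while an include mask of ``0`` maps
to ``0xffff``.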

Initial process state
---------------------

On ``execve()``, the new process has the following configuration:

- ``PR_TAGGED_ADDR_ENABLE`` set to 0 (disabled)
- Tag checking mode set to ``PR_MTE_TCF_NONE``
- ``PR_MTE_TAG_MASK`` set to 0 (all tags excluded)
- ``PSTATE.TCO`` set to 0
- ``PROT_MTE`` not set on any of the initial memory maps

On ``fork()``, the new process inherits the parent's configuration and
memory map attributes with the exception of the ``madvise()`` ranges
with ``MADV_WIPEONFORK`` which will have the data and tags cleared (set
to 0).

The ``ptrace()`` interface
--------------------------

``PTRACE_PEEKMTETAGS`` and ``PTRACE_POKEMTETAGS`` allow a tracer to read
the tags from or set the tags to a tracee's address space. The
``ptrace()`` system call is invoked as ``ptrace(request, pid, addr,
data)`` where:

- ``request`` - one of ``PTRACE_PEEKMTETAGS`` or ``PTRACE_POKEMTETAGS``.
- ``pid`` - the tracee's PID.
- ``addr`` - address in the tracee's address space.
- ``data`` - pointer to a ``struct iovec`` where ``iov_base`` points to
  a buffer of ``iov_len`` length in the tracer's address space.

The tags in the tracer's ``iov_base`` buffer are represented as one
4-bit tag per byte and correspond to a 16-byte MTE tag granule in the
tracee's address space.

**Note**: If ``addr`` is not aligned to a 16-byte granule, the kernel
will use the corresponding aligned address.

``ptrace()`` return value:

- 0 - tags were copied, the tracer's ``iov_len`` was updated to the
  number of tags transferred. This may be smaller than the requested
  ``iov_len`` if the requested address range in the tracee's or the
  tracer's space cannot be accessed or does not have valid tags.
- ``-EPERM`` - the specified process cannot be traced.
- ``-EIO`` - the tracee's address range cannot be accessed (e.g. invalid
  address) and no tags copied. ``iov_len`` not updated.
- ``-EFAULT`` - fault on accessing the tracer's memory (``struct iovec``
  or ``iov_base`` buffer) and no tags copied. ``iov_len`` not updated.
- ``-EOPNOTSUPP`` - the tracee's address does not have valid tags (never
  mapped with the ``PROT_MTE`` flag). ``iov_len`` not updated.

**Note**: There are no transient errors for the requests above, so user
programs should not retry in case of a non-zero system call return.
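
A hedged sketch of how a tracer might size its buffer and issue
``PTRACE_PEEKMTETAGS``. Both helper names are hypothetical. The request
numbers are assumed to match ``arch/arm64/include/uapi/asm/ptrace.h``
and are defined here only when the libc headers do not provide them; the
call is only meaningful on an arm64 kernel with MTE support and an
already attached tracee.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/uio.h>

/* Assumed arm64 request numbers, used only if the headers lack them. */
#ifndef PTRACE_PEEKMTETAGS
#define PTRACE_PEEKMTETAGS 33
#endif
#ifndef PTRACE_POKEMTETAGS
#define PTRACE_POKEMTETAGS 34
#endif

/*
 * Hypothetical helper: number of tag bytes (one per 16-byte granule)
 * needed to cover [addr, addr + len), allowing for an unaligned addr.
 */
static size_t mte_tags_for_range(uint64_t addr, size_t len)
{
    uint64_t first, last;

    if (len == 0)
        return 0;
    first = addr & ~UINT64_C(15);
    last = (addr + len - 1) & ~UINT64_C(15);
    return (size_t)((last - first) / 16) + 1;
}

/*
 * Hypothetical helper: read up to ntags tags from the tracee at addr.
 * Returns the number of tags transferred, or -1 with errno set. As the
 * note above says, a failure should not be retried.
 */
static long mte_peek_tags(pid_t pid, uint64_t addr,
                          unsigned char *tags, size_t ntags)
{
    struct iovec iov = { .iov_base = tags, .iov_len = ntags };

    assert(tags != NULL);
    if (ptrace(PTRACE_PEEKMTETAGS, pid, (void *)(uintptr_t)addr, &iov) != 0)
        return -1;
    return (long)iov.iov_len;   /* updated by the kernel on success */
}
```

A 4096-byte page, for instance, needs a 256-byte tag buffer, and a
16-byte access starting 8 bytes into a granule spans two granules.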

``PTRACE_GETREGSET`` and ``PTRACE_SETREGSET`` with ``addr ==
NT_ARM_TAGGED_ADDR_CTRL`` allow ``ptrace()`` access to the tagged
address ABI control and MTE configuration of a process as per the
``prctl()`` options described in
Documentation/arm64/tagged-address-abi.rst and above. The corresponding
``regset`` is 1 element of 8 bytes (``sizeof(long)``).

Example of correct usage
========================

*MTE Example code*

.. code-block:: c

    /*
     * To be compiled with -march=armv8.5-a+memtag
     */
    #include <errno.h>
    #include <stdint.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>
    #include <sys/auxv.h>
    #include <sys/mman.h>
    #include <sys/prctl.h>

    /*
     * From arch/arm64/include/uapi/asm/hwcap.h
     */
    #define HWCAP2_MTE              (1 << 18)

    /*
     * From arch/arm64/include/uapi/asm/mman.h
     */
    #define PROT_MTE                 0x20

    /*
     * From include/uapi/linux/prctl.h
     */
    #define PR_SET_TAGGED_ADDR_CTRL 55
    #define PR_GET_TAGGED_ADDR_CTRL 56
    # define PR_TAGGED_ADDR_ENABLE  (1UL << 0)
    # define PR_MTE_TCF_SHIFT       1
    # define PR_MTE_TCF_NONE        (0UL << PR_MTE_TCF_SHIFT)
    # define PR_MTE_TCF_SYNC        (1UL << PR_MTE_TCF_SHIFT)
    # define PR_MTE_TCF_ASYNC       (2UL << PR_MTE_TCF_SHIFT)
    # define PR_MTE_TCF_MASK        (3UL << PR_MTE_TCF_SHIFT)
    # define PR_MTE_TAG_SHIFT       3
    # define PR_MTE_TAG_MASK        (0xffffUL << PR_MTE_TAG_SHIFT)

    /*
     * Insert a random logical tag into the given pointer.
     */
    #define insert_random_tag(ptr) ({                       \
            uint64_t __val;                                 \
            asm("irg %0, %1" : "=r" (__val) : "r" (ptr));   \
            __val;                                          \
    })

    /*
     * Set the allocation tag on the destination address.
     */
    #define set_tag(tagged_addr) do {                                      \
            asm volatile("stg %0, [%0]" : : "r" (tagged_addr) : "memory"); \
    } while (0)

    int main(void)
    {
            unsigned char *a;
            unsigned long page_sz = sysconf(_SC_PAGESIZE);
            unsigned long hwcap2 = getauxval(AT_HWCAP2);

            /* check if MTE is present */
            if (!(hwcap2 & HWCAP2_MTE))
                    return EXIT_FAILURE;

            /*
             * Enable the tagged address ABI, synchronous MTE tag check faults and
             * allow all non-zero tags in the randomly generated set.
             */
            if (prctl(PR_SET_TAGGED_ADDR_CTRL,
                      PR_TAGGED_ADDR_ENABLE | PR_MTE_TCF_SYNC | (0xfffe << PR_MTE_TAG_SHIFT),
                      0, 0, 0)) {
                    perror("prctl() failed");
                    return EXIT_FAILURE;
            }

            a = mmap(0, page_sz, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
            if (a == MAP_FAILED) {
                    perror("mmap() failed");
                    return EXIT_FAILURE;
            }

            /*
             * Enable MTE on the above anonymous mmap. The flag could be passed
             * directly to mmap() and skip this step.
             */
            if (mprotect(a, page_sz, PROT_READ | PROT_WRITE | PROT_MTE)) {
                    perror("mprotect() failed");
                    return EXIT_FAILURE;
            }

            /* access with the default tag (0) */
            a[0] = 1;
            a[1] = 2;

            printf("a[0] = %hhu a[1] = %hhu\n", a[0], a[1]);

            /* set the logical and allocation tags */
            a = (unsigned char *)insert_random_tag(a);
            set_tag(a);

            printf("%p\n", a);

            /* non-zero tag access */
            a[0] = 3;
            printf("a[0] = %hhu a[1] = %hhu\n", a[0], a[1]);

            /*
             * If MTE is enabled correctly the next instruction will generate an
             * exception.
             */
            printf("Expecting SIGSEGV...\n");
            a[16] = 0xdd;

            /* this should not be printed in the PR_MTE_TCF_SYNC mode */
            printf("...haven't got one\n");

            return EXIT_FAILURE;
    }
