[PATCH v4 1/2] <sys/tagged-address.h>: An API for tagged address

Florian Weimer fweimer@redhat.com
Wed Apr 21 06:36:43 GMT 2021


* H. J. Lu:

> diff --git a/manual/tagged-address.texi b/manual/tagged-address.texi
> new file mode 100644
> index 0000000000..ce10f7e752
> --- /dev/null
> +++ b/manual/tagged-address.texi
> @@ -0,0 +1,59 @@
> +@node Tagged Address, Character Handling, Memory, Top
> +@c %MENU% Tagged address functions and macros
> +@chapter Tagged Address
> +
> +By default, the number of the address bits used in address translation
> +is the number of address bits.  But it can be changed by ARM Top-byte
> +Ignore (TBI) or Intel Linear Address Masking (LAM).
> +
> +@Theglibc{} provides several functions and macros in the header file
> +@file{sys/tagged-address.h} to manipulate tagged address bits, which is
> +the number of the address bits used in address translation.
> +@pindex sys/tagged-address.h

I don't understand the “which is the number of address bits” part.

This section needs to describe under which circumstances it is valid to
alter the tag bits in pointers returned from glibc functions (including
system call wrappers).  I think that, at least historically, the kernel
required user space to mask out the tag bits for TBI before passing
pointers to it.

> +@deftypefun {unsigned int} get_tagged_address_bits (void)
> +@standards{GNU, sys/tagged-address.h}
> +@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
> +Get the current address bits used in address translation.  The return
> +value is @code{0} if tag bits are not the highest bits in address.
> +@end deftypefun

“in addresses”?

What is the return value if there are no tag bits available?  The word
width?

> +@deftypefun uintptr_t get_tagged_address_mask (void)
> +@standards{GNU, sys/tagged-address.h}
> +@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
> +Get the current mask for address bits used in address translation.
> +@end deftypefun

Mask is ambiguous in this context.  If a bit is set in the return value,
will this bit take part in address translation or not?  Please be
explicit here.
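
To make the ambiguity concrete, here is a minimal sketch of the two
readings.  The helper names are made up for illustration and are not
part of the proposed API, and uint64_t is used instead of uintptr_t only
to keep the arithmetic width fixed.  It assumes 56 translated bits, as
with a top-byte-ignore scheme.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, not part of the proposed API: return a mask
   with the low BITS bits set.  BITS must be in the range 1..64.  */
static uint64_t
low_bits_mask (unsigned int bits)
{
  assert (bits >= 1 && bits <= 64);
  return bits == 64 ? UINT64_MAX : (UINT64_C (1) << bits) - 1;
}

/* Reading A: a set bit takes part in address translation.  */
static uint64_t
translated_mask (unsigned int bits)
{
  return low_bits_mask (bits);
}

/* Reading B: a set bit is ignored by address translation, that is,
   it is available as a tag bit.  */
static uint64_t
ignored_mask (unsigned int bits)
{
  return ~low_bits_mask (bits);
}
```

For 56 translated bits, reading A gives 0x00ffffffffffffff and reading B
gives 0xff00000000000000.  The two are complements of each other, so the
documentation has to say which one is meant.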

> +@deftypefun int set_tagged_address_mask (uintptr_t @var{mask})
> +@standards{GNU, sys/tagged-address.h}
> +@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
> +Set the mask for address bits used in address translation to @var{mask}.
> +The return value is @code{0} on success and @code{-1} on failure.  This
> +function can be called only once before @code{main}.  The possible
> +@code{errno} error conditions are @code{ENODEV}, @code{EPERM},
> +@code{EINVAL}, and @code{ENOSYS}.
> +@end deftypefun

Likewise, please clarify if bits set in MASK participate in address
translation or not.

Why before main?  Do you mean it can only be called once per process?

I think this limitation suggests we should use ELF markup for this.
There are definitely compatibility issues to work out here.

Historically, the x86-64 psABI supplement implied that the top 16 bits
are available for application use (without hardware masking obviously).
If e.g. malloc starts returning tagged addresses, that assumption
breaks.

Should glibc allocate tag bits to different libraries within the same
process?  For example, so that malloc could get 2 tag bits, the main
program 3 and some other library 1 bit?
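
As a rough sketch of what such an allocation could look like (purely
hypothetical; nothing like struct tag_pool or tag_pool_reserve exists in
glibc or in the patch, and the six-bit window in the note below is an
arbitrary choice for illustration):

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical bookkeeping for handing out tag bits to different
   consumers within one process.  */
struct tag_pool
{
  unsigned int first_free;	/* Lowest bit position not yet handed out.  */
  unsigned int end;		/* One past the highest available tag bit.  */
};

/* Reserve COUNT contiguous tag bits from POOL.  Return a mask covering
   the reserved bits, or 0 if COUNT is zero or the pool does not have
   enough bits left.  */
static uint64_t
tag_pool_reserve (struct tag_pool *pool, unsigned int count)
{
  assert (pool->first_free <= pool->end && pool->end <= 64);
  if (count == 0 || count > pool->end - pool->first_free)
    return 0;
  uint64_t ones = count == 64 ? UINT64_MAX : (UINT64_C (1) << count) - 1;
  uint64_t mask = ones << pool->first_free;
  pool->first_free += count;
  return mask;
}
```

With a pool covering bits 58 to 63, reserving 2, 3 and 1 bits in that
order yields 0x0c00000000000000, 0x7000000000000000 and
0x8000000000000000, and any further request fails with 0.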

For glibc malloc, it would be a simple enhancement to move the
IS_MMAPPED to a tag bit, and eliminate the malloc header for mmap'ed
chunks, replacing it with a separate data structure.  This would allow
us to preserve page alignment for mmap'ed chunks without wasting an
entire page for each allocation, just to store the malloc header.
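
A minimal sketch of the idea, assuming bit 63 is ignored by address
translation.  This is not the current malloc code; MMAPPED_TAG and the
three helpers are invented names, and addresses are modelled as uint64_t
values rather than real pointers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: bit 63 does not take part in address translation.  */
#define MMAPPED_TAG (UINT64_C (1) << 63)

/* Record in the address itself that the chunk came from mmap.  The
   address must not carry the tag already.  */
static uint64_t
mark_mmapped (uint64_t addr)
{
  assert ((addr & MMAPPED_TAG) == 0);
  return addr | MMAPPED_TAG;
}

/* Return true if ADDR carries the mmap tag.  */
static bool
is_mmapped (uint64_t addr)
{
  return (addr & MMAPPED_TAG) != 0;
}

/* Recover the untagged address, for example to pass it to munmap.  */
static uint64_t
strip_mmapped_tag (uint64_t addr)
{
  return addr & ~MMAPPED_TAG;
}
```

Because the flag lives in the pointer and not in a header in front of
the user data, a page-aligned mapping stays page-aligned after tagging.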

> +@deftypefun {void *} tag_address (void *@var{addr}, unsigned int @var{tag})
> +@standards{GNU, sys/tagged-address.h}
> +@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
> +Return the address of @var{addr} with the tag value @var{tag} stored
> +in the untranslated bits.  Overflow of @var{tag} in the untranslated
> +bits are ignored.
> +@end deftypefun
> +
> +@deftypefun {void *} untag_address (void *@var{addr})
> +@standards{GNU, sys/tagged-address.h}
> +@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
> +Return the address of @var{addr} with all zero untranslated bits.
> +@end deftypefun

This should reference the earlier discussion about when it is safe to
tag and untag addresses.
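
For reference, this is how I read the two descriptions, modelled as
plain bit operations.  It is only a sketch of the documented behavior
under my reading, not the patch's implementation: the model_ names and
the explicit translated_bits parameter are mine, and addresses are
uint64_t values here.

```c
#include <assert.h>
#include <stdint.h>

/* Model of tag_address: keep the low TRANSLATED_BITS bits of ADDR and
   store TAG in the bits above them.  Bits of TAG that do not fit are
   dropped, matching "overflow is ignored".  */
static uint64_t
model_tag_address (uint64_t addr, unsigned int tag,
		   unsigned int translated_bits)
{
  assert (translated_bits >= 1 && translated_bits < 64);
  uint64_t low = (UINT64_C (1) << translated_bits) - 1;
  return (addr & low) | ((uint64_t) tag << translated_bits);
}

/* Model of untag_address: clear all bits above the low
   TRANSLATED_BITS bits.  */
static uint64_t
model_untag_address (uint64_t addr, unsigned int translated_bits)
{
  assert (translated_bits >= 1 && translated_bits < 64);
  return addr & ((UINT64_C (1) << translated_bits) - 1);
}
```

With 56 translated bits, tagging 0x00007f0000001000 with 0xab gives
0xab007f0000001000, and so does tagging it with 0x1ab, because the
extra bit falls off the top.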

> +@deftypefn Macro int TAGGED_ADDRESS_VALID_BITS (@var{bits})
> +This macro returns a nonzero value (true) if @var{bits} a valid tagged
> +address bits.
> +@end deftypefn

“are valid tagged”?

Does “valid” mean in this context that “the CPU can be configured to
ignore bits set in BITS during address translation using
set_tagged_address_mask”?

> +@deftypefn Macro {const uintptr_t} TAGGED_ADDRESS_MASK (@var{bits})
> +This macro returns a nonzero value if it can be used as mask for constant
> +address @var{bits} used in address translation.
> +@end deftypefn

I do not understand the description.

Thanks,
Florian


