[PATCH v5 1/1] <sys/tagged-address.h>: An API for tagged address

H.J. Lu hjl.tools@gmail.com
Sat Aug 21 16:37:55 GMT 2021


On Sat, Aug 21, 2021 at 9:10 AM Sunil Pandey <skpgkp2@gmail.com> wrote:
>
>
>
> On Fri, Aug 20, 2021 at 4:27 PM H.J. Lu <hjl.tools@gmail.com> wrote:
>>
>> On Thu, Aug 12, 2021 at 1:36 AM Florian Weimer <fweimer@redhat.com> wrote:
>> >
>> > * H. J. Lu:
>> >
>> > > diff --git a/manual/ctype.texi b/manual/ctype.texi
>> > > index d0618c5c38..28af73ff0e 100644
>> > > --- a/manual/ctype.texi
>> > > +++ b/manual/ctype.texi
>> > > @@ -1,4 +1,4 @@
>> > > -@node Character Handling, String and Array Utilities, Memory, Top
>> > > +@node Character Handling, String and Array Utilities, Tagged Address, Top
>> > >  @c %MENU% Character testing and conversion functions
>> > >  @chapter Character Handling
>> >
>> > Allegedly, it should not be necessary to maintain the @node linkage
>> > manually (if we remove it).
>>
>> Since @node isn't removed, I guess I have to keep this change.
>>
>> > > diff --git a/manual/tagged-address.texi b/manual/tagged-address.texi
>> > > new file mode 100644
>> > > index 0000000000..a3929a0eb7
>> > > --- /dev/null
>> > > +++ b/manual/tagged-address.texi
>> > > @@ -0,0 +1,80 @@
>> > > +@node Tagged Address, Character Handling, Memory, Top
>> > > +@c %MENU% Tagged address functions and macros
>> > > +@chapter Tagged Address
>> > > +
>> > > +By default, the number of the address bits used in address translation
>> > > +is the number of address bits.  But it can be changed by ARM Top-byte
>> > > +Ignore (TBI) or Intel Linear Address Masking (LAM).
>> >
>> > Current spelling is “Arm”, I think.
>>
>> There are
>>
>> contrib.texi:Philip Blundell for the ports to Linux/ARM
>> contrib.texi:(@code{arm-@var{ANYTHING}-linuxaout}) and ARM standalone
>> contrib.texi:Richard Earnshaw for continued support and fixes to the various ARM
>> contrib.texi:his maintainership of the ARM and MIPS architectures and the math
>> contrib.texi:encryption support for ARM and various fixes.
>> creature.texi:architectures (i686, ARM), this is planned to change and applications
>>
>> and
>>
>> contrib.texi:Ulrich Weigand for various fixes to the PowerPC64 and Arm ports.
>>
>> I will keep ARM.
>>
>> > > +@Theglibc{} provides several functions and macros in the header file
>> > > +@file{sys/tagged-address.h} to manipulate tagged address bits, which is
>> > > +the number of the address bits used in address translation, with
>> > > +restrictions:
>> >
>> > Aren't the tagged address bits *not* used in address translation?
>> >
>> > The Arm documentation
>> >
>> >   <https://www.kernel.org/doc/html/latest/arm64/tagged-address-abi.html>
>> >
>> > implies that the tag bits are those that are ignored for address
>> > translation purposes (e.g., “The syscall behaviour for a valid tagged
>> > pointer is the same as for the corresponding untagged pointer.”).  This
>> > manual change uses the reverse terminology: tagged address bits are
>> > those that are used in address translation (including the untranslated
>> > intra-page offset).
>> >
>> > I find it more intuitive to refer to the ignored bits as “tagged address
>> > bits”.
>>
>> I will rename it to set_translated_address_mask.
>
> The existing name set_tagged_address_mask may not be intuitive, but it looks clearer to me than the proposed new name (set_translated_address_mask). With the proposed name, it is not clear which portion of the address will be masked.
> How about set_address_tagbit_mask?

Can you give an example of how set_address_tagbit_mask should
be used?
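
For reference, the two masks under discussion are complements of each
other.  The following is only a minimal sketch, assuming 64-bit addresses
where bits 62:57 are ignored by address translation (as with Intel
LAM_U57).  The constants and helper names are hypothetical and are not
part of the patch; whether set_address_tagbit_mask would take the tag
mask rather than the translated mask has not been settled in this thread.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: bits 62:57 are ignored by address translation, as with
   Intel LAM_U57.  These names are hypothetical, not part of the patch.  */
#define TAG_SHIFT 57
#define TAG_BITS  6
#define TAG_MASK  (((UINT64_C (1) << TAG_BITS) - 1) << TAG_SHIFT)

/* The bits still used in address translation.  This is the kind of value
   the patch documents for set_tagged_address_mask ("only bits set in
   mask will be used in address translation").  */
#define TRANSLATED_MASK (~TAG_MASK)

/* Store TAG in the ignored bits of ADDR.  */
static inline uint64_t
tag_address (uint64_t addr, unsigned int tag)
{
  assert (tag < (1u << TAG_BITS));
  return (addr & TRANSLATED_MASK) | ((uint64_t) tag << TAG_SHIFT);
}

/* Recover the address that the hardware would translate.  */
static inline uint64_t
untag_address (uint64_t addr)
{
  return addr & TRANSLATED_MASK;
}

/* Extract the tag stored in ADDR.  */
static inline unsigned int
address_tag (uint64_t addr)
{
  return (unsigned int) ((addr & TAG_MASK) >> TAG_SHIFT);
}
```

With these definitions, a name based on "tagbit" would presumably describe
TAG_MASK, while set_translated_address_mask describes TRANSLATED_MASK.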

>>
>>
>> > > +@deftypefun int set_tagged_address_mask (uintptr_t @var{mask})
>> > > +@standards{GNU, sys/tagged-address.h}
>> > > +@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
>> > > +Set the mask for address bits used in address translation to @var{mask}.
>> > > +Only bits set in @var{mask} will be used in address translation.  The
>> > > +return value is @code{0} on success and @code{-1} on failure.  This
>> > > +function can be called only once before @code{main}.
>> >
>> > Again the restriction around @code{main} is unclear.  If it's “before
>> > allocating memory” or “before starting threads”, then we should say
>> > that.
>>
>> I will change it to
>>
>> This function can be called only once before @code{main} and thread creation.
>>
>> > I still don't see how we can split the tag address bits used by the
>> > implementation (glibc, sanitizers) and the application.
>> >
>> > For example, glibc could use a tag bit to indicate whether an allocation
>> > is in a mmap-based allocation.  This way, we could use an out-of-line
>> > object header (found via a hash table, for example), and utilize the
>> > fact that mmap-based allocations are always page-aligned.  This would
>> > require no core malloc algorithm changes and should be an obvious
>> > improvement.
>> > With more substantial changes, we could use another bit to encode that
>> > an allocation is in a small objects region and does not have an
>> > immediately preceding object header, either.  Introducing small object
>> > regions is a much larger change, though.
>> >
>>
>> Tag usage should be exclusive to glibc.  set_translated_address_mask
>> tells glibc that the tag bits will be used for another purpose.
>>
>> Thanks.
>>
>> --
>> H.J.
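
The allocator idea quoted above (a tag bit marking mmap-based
allocations) can be illustrated with plain integer arithmetic.  This is
only a sketch under stated assumptions: bit 62 is assumed to be ignored
by address translation and reserved for the allocator, and none of the
names below exist in glibc or in the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical: bit 62 is assumed to be ignored by address translation
   and reserved for the allocator.  None of these names exist in glibc.  */
#define MMAP_TAG_BIT (UINT64_C (1) << 62)

/* Mark an address as coming from an mmap-based allocation.  */
static inline uint64_t
mark_mmapped (uint64_t addr)
{
  return addr | MMAP_TAG_BIT;
}

/* True if the address carries the mmap marker, so a free-like routine
   could look up an out-of-line header instead of reading one in place.  */
static inline bool
is_mmapped (uint64_t addr)
{
  return (addr & MMAP_TAG_BIT) != 0;
}

/* Drop the marker to get back the page-aligned mapping address.  */
static inline uint64_t
strip_mmap_tag (uint64_t addr)
{
  return addr & ~MMAP_TAG_BIT;
}
```

If the application were also allowed to set bit 62, the allocator could
no longer trust the marker, which is why the split between implementation
and application tag bits matters.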



-- 
H.J.

