[PATCH 0/7] RFC Memory tagging support
Richard Earnshaw
Richard.Earnshaw@foss.arm.com
Tue Jun 16 11:07:39 GMT 2020
On 16/06/2020 11:31, Szabolcs Nagy wrote:
> The 06/16/2020 11:17, Richard Earnshaw wrote:
>> On 15/06/2020 18:09, Richard Earnshaw wrote:
>>> On 15/06/2020 18:04, DJ Delorie via Libc-alpha wrote:
>>>>
>>>> Two immediate thoughts...
>>>>
>>>> 1. Do we really want to add more environment variables as aliases for
>>>> new tunables? I thought env support was for pre-tunable variable
>>>> support (compatibility) only.
>>>
>>> That might depend on whether we want to try to share how this is enabled
>>> with other C libraries - we can't expect them to copy all of glibc's
>>> tunable API here.
>>>
>>> That being said, this is easy enough to change if needed.
>>>
>>>>
>>>> 2. Do we really need to lose the back pointer's word in allocated
>>>> memory? Historically, the back pointer is *not* part of the malloc
>>>> internal data when the chunk is in 'allocated' state, and losing that
>>>> memory will make small allocations much less efficient.
>>>>
>>>
>>> Yes, if you want to protect the back pointer against being trampled by
>>> programs - it has to have a different tag colour to memory given to the
>>> application.
>>>
>>> R.
>>>
>>
>> Your second comment made me go back and look again at the assumptions
>> I've made. I'm pretty sure they hold.
>>
>> Taking the comment from the malloc code (the labels on the right are
>> mine to clarify the following text)...
>>
>>
>> An allocated chunk looks like this:
>>
>>
>>     chunk-> +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
>>             | Size of previous chunk, if unallocated (P clear)  | (1)
>>             +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
>>             | Size of chunk, in bytes                     |A|M|P| (2)
>>       mem-> +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
>>             | User data starts here...                          . (3)
>>             .                                                   .
>>             . (malloc_usable_size() bytes)                      .
>>             .                                                   |
>> nextchunk-> +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
>>             | (size of chunk, but used for application data)    | (4)
>>             +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
>>             | Size of next chunk, in bytes                |A|0|1| (5)
>>             +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
>>
>> Now, when we map MTE onto that we have to look at the granules, which
>> are the minimal units of memory that must have a single colour. In
>> aarch64 that's a 16-byte block and maps quite happily onto the chunk
>> header. So, (assuming I've understood all this correctly :) the
>> chunk header (labels 1 & 2 and 4 & 5 above) is 16 bytes long and 16-byte
>> aligned - perfect! But what we can't do is allow the 8 bytes at the
>> start of nextchunk (4) to be used in the previous allocation block,
>> since to do that we'd have to assign it the colour of the user data (3)
>> and it can't have that colour unless the next chunk size (5) also has
>> that colour - and if we did that then malloc's own data structures would
>> no longer be coloured differently to the user data.
>
> it is also possible to always get the right tag
> when accessing data in user allocation. e.g.
> instead of
>
> size = *p;
>
> use
>
> size = *get_correctly_tagged_ptr(p);
>
> but this is a bit awkward and can be unsafe (the
> metadata inside the user allocation is not protected
> via tagging), and it does not work if tags can
> change concurrently (there is no 'unchecked load'
> in the architecture, only separate 'get correct tag'
> and 'checked load' operations). e.g. a free running
> concurrently with an operation on the next chunk that
> for some reason looks at this field of the prev chunk,
> or if we want to allow users to retag their allocated
> memory (sub allocators).
>
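To make the two-step sequence above concrete, here is a rough software
model of it. This is only a sketch: it is not real MTE code and not
anything in the patch set. Apart from get_correctly_tagged_ptr, which is
the name used above, every name here (the arena, granule_tag, with_tag,
set_tags, checked_load) is made up for illustration, and "addresses" are
just offsets into a small array with the colour kept in bits 56-59.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical software model of memory tagging; none of these names
   are glibc or kernel APIs.  */
#define GRANULE    16u
#define ARENA_SIZE 256u
#define TAG_SHIFT  56

static uint8_t arena[ARENA_SIZE];
static uint8_t granule_tag[ARENA_SIZE / GRANULE]; /* one colour per granule */

typedef uint64_t tagged_addr;

static tagged_addr with_tag (uint64_t offset, uint8_t tag)
{
  return offset | ((uint64_t) (tag & 0xf) << TAG_SHIFT);
}

static uint64_t offset_of (tagged_addr p)
{
  return p & ((UINT64_C (1) << TAG_SHIFT) - 1);
}

static uint8_t tag_of (tagged_addr p)
{
  return (p >> TAG_SHIFT) & 0xf;
}

/* Recolour every granule that overlaps [offset, offset + len).  */
static void set_tags (uint64_t offset, size_t len, uint8_t tag)
{
  assert (offset + len <= ARENA_SIZE);
  uint64_t end = (offset + len + GRANULE - 1) / GRANULE;
  for (uint64_t g = offset / GRANULE; g < end; g++)
    granule_tag[g] = tag & 0xf;
}

/* Step 1: look up the colour currently assigned to the granule.  */
static tagged_addr get_correctly_tagged_ptr (tagged_addr p)
{
  uint64_t off = offset_of (p);
  assert (off < ARENA_SIZE);
  return with_tag (off, granule_tag[off / GRANULE]);
}

/* Step 2: a checked load; returns false (modelling a tag-check fault)
   when the pointer colour and the granule colour differ.  */
static bool checked_load (tagged_addr p, uint8_t *out)
{
  uint64_t off = offset_of (p);
  if (off >= ARENA_SIZE || tag_of (p) != granule_tag[off / GRANULE])
    return false;
  *out = arena[off];
  return true;
}
```

In this model, if something calls set_tags on the granule between step 1
and step 2 (a concurrent free, say), the pointer from step 1 is stale and
the checked load fails, which is the race described above.
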
I currently haven't exported any of the memory tagging operations from
glibc (the new functions are internal only for now). That's a
deliberate design choice at this time to avoid exposing an API until we
are confident that such operations are correct and desirable.

In terms of colouring the meta-data with the user's colour - it might be
possible, but it might have race issues as you say, and it would
certainly weaken the protection given by MTE, since a small buffer
overrun would corrupt the meta-data and might go undetected. It's a
trade-off between protection and efficiency.
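
To put rough numbers on the efficiency side, here is a small sketch of
the chunk-size arithmetic. It assumes a 64-bit layout with 8-byte size
fields, 16-byte alignment and a 32-byte minimum chunk; the function and
macro names are made up for the example and are not the macros in
malloc.c.

```c
#include <assert.h>
#include <stddef.h>

#define SIZE_SZ   8u    /* assumed: one size field on a 64-bit target */
#define GRANULE   16u   /* MTE granule, also the assumed malloc alignment */
#define MIN_CHUNK 32u   /* assumed minimum chunk size */

static size_t round_up (size_t n, size_t align)
{
  assert ((align & (align - 1)) == 0);  /* power of two only */
  return (n + align - 1) & ~(align - 1);
}

/* Classic layout: user data may spill into field (4) of the next chunk,
   so only the size field (2) counts as overhead.  */
static size_t chunk_size_untagged (size_t request)
{
  size_t sz = round_up (request + SIZE_SZ, GRANULE);
  return sz < MIN_CHUNK ? MIN_CHUNK : sz;
}

/* Tagged layout: fields (4) and (5) share a granule and keep malloc's
   colour, so user data must end on a granule boundary before them.  */
static size_t chunk_size_tagged (size_t request)
{
  size_t sz = round_up (request, GRANULE) + 2 * SIZE_SZ;
  return sz < MIN_CHUNK ? MIN_CHUNK : sz;
}
```

Under those assumptions a 24-byte request needs a 32-byte chunk untagged
but a 48-byte chunk tagged, while a 16-byte request fits in 32 bytes
either way.
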

R.
>>
>> A complete rewrite of malloc to use an out-of-band chunk list would
>> probably address the wastage, but I really wanted to avoid that... :)
>>
>> R.
>