[PATCH v3 07/24] Documentation for memory tagging remote packets

Tue Nov 17 14:44:16 GMT 2020

On 11/17/20 9:29 AM, David Spickett wrote:
>> Right. The type is really telling you what specific kind of tag you are
>> requesting, not the technology. So it may be perfectly valid to request
>> a MTE logical tag from the remote target, but the remote target doesn't
>> know how to reply to that at the moment (nor does it make much sense, IMO).
>>
>> The tag types don't overlap at the moment, given they are an ENUM in
>> generic code. So the server will tell them apart by their values.
> 
> Ok but my concern here is that there are two aspects to type. (which I
> probably confused earlier tbf)
> 
> 1. logical vs allocation
> 2. MTE vs <future tagging technology>
> 
> So we have three scenarios:
> 1. Server uses type to decide between logical and allocation

Right now 0 maps to logical and 1 maps to allocation for all
architectures. For AArch64, 0 maps to MTE logical and 1 maps to MTE
allocation.

But, if more tag types are supported in the future, we will need to map 
types to logical, allocation or something else.

> 2. Server uses type to decide between MTE and <future tagging technology>

Right now 0 and 1 map to MTE for AArch64 and to ADI for SPARC (not yet
supported). In the future, if there are more tagging technologies, we
will need to provide a mapping from tag types to tag technology.

> 3. The combination of the two, should a system have both technologies

In this case, we will use two mappings: type to tag type and type to tag 
technology.
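
To make the two mappings concrete, here is a rough sketch. None of these
names come from the patch series; tag_kind, tag_tech and the lookup
helpers are hypothetical, and only the AArch64 values (0 = MTE logical,
1 = MTE allocation) reflect what we have today.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names for illustration; not taken from the patch.  */
enum tag_kind { KIND_LOGICAL, KIND_ALLOCATION, KIND_UNKNOWN };
enum tag_tech { TECH_MTE, TECH_UNKNOWN };

struct tag_type_info
{
  int type;            /* Value carried in the remote packet.  */
  enum tag_kind kind;  /* Logical, allocation or something else.  */
  enum tag_tech tech;  /* Tagging technology the value belongs to.  */
};

/* Current AArch64 assignments: 0 = MTE logical, 1 = MTE allocation.  */
static const struct tag_type_info aarch64_tag_types[] = {
  { 0, KIND_LOGICAL, TECH_MTE },
  { 1, KIND_ALLOCATION, TECH_MTE },
};

static_assert (sizeof aarch64_tag_types / sizeof aarch64_tag_types[0] == 2,
               "AArch64 currently defines exactly two tag types");

/* Find the table entry for TYPE, or NULL if the value is not known.  */
static const struct tag_type_info *
lookup_tag_type (int type)
{
  size_t count = sizeof aarch64_tag_types / sizeof aarch64_tag_types[0];

  for (size_t i = 0; i < count; i++)
    if (aarch64_tag_types[i].type == type)
      return &aarch64_tag_types[i];
  return NULL;
}

/* First mapping: protocol type to kind of tag.  */
static enum tag_kind
tag_type_to_kind (int type)
{
  const struct tag_type_info *info = lookup_tag_type (type);
  return info != NULL ? info->kind : KIND_UNKNOWN;
}

/* Second mapping: protocol type to tagging technology.  */
static enum tag_tech
tag_type_to_tech (int type)
{
  const struct tag_type_info *info = lookup_tag_type (type);
  return info != NULL ? info->tech : TECH_UNKNOWN;
}
```

Both lookups key off the same integer, so a single value in the packet
is enough for the server to recover the kind and the technology.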

> 
> What I want to avoid is a two step process:
> 1. Somehow select tagging technology you want to interact with
> 2. You send memtag packets with the logical vs allocation "type"
> 
> Which would be needed if the protocol "type" is just allocation vs
> logical. If the "type" includes the technology
> then we can do it in one step.

Right now you can infer the technology and logical/allocation from the 
type. There is no need for a two step process. I don't anticipate a need 
for that in the future either, as long as we prevent the tag type ENUM 
from having overlapping values.

> 
> So I prefer:
> 0 = mte logical, 1 = mte allocation, 1 = future logical, 2 = future
> allocation, 3 = future tag form (not logical/allocation) etc...

So this is wrong, as the remote/native side won't be able to tell the two
definitions of type "1" apart.

> Over:
> 0 = logical, 1 = allocation, 2 = future term for other tag form etc...

This is correct in my view. Everything is unique.
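
The difference between the two numberings can be checked mechanically.
This is only an illustrative sketch: tag_types_overlap and the two
arrays are made-up names that encode the values exactly as written in
the two proposals quoted above.

```c
#include <assert.h>
#include <stddef.h>

/* Return 1 if any value appears more than once in TYPES, else 0.  */
static int
tag_types_overlap (const int *types, size_t count)
{
  assert (types != NULL || count == 0);

  for (size_t i = 0; i < count; i++)
    for (size_t j = i + 1; j < count; j++)
      if (types[i] == types[j])
        return 1;
  return 0;
}

/* First proposal as written: mte logical, mte allocation,
   future logical, future allocation, future tag form.  */
static const int proposal_a[] = { 0, 1, 1, 2, 3 };

/* Second proposal: logical, allocation, future term.  */
static const int proposal_b[] = { 0, 1, 2 };
```

With these inputs the check flags the first list, because the value 1 is
assigned twice, and accepts the second one.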

The complicating factor here is that generic code/UI needs to use a 
generic tag type so all architectures supporting memory tagging can use 
the commands out of the box.

For AArch64, tag_logical/tag_allocation mean MTE logical/MTE
allocation. For SPARC ADI, they mean whatever that tagging mechanism defines.

If we turn tag_logical into mte_logical, SPARC ADI won't be able to use 
that. We will, again, need a mapping.

If we want to address this and make it a bit more flexible, then we will 
need to let architectures define their own tag type values. Then generic 
code will have to go through an arch-specific hook to fetch the tag type.
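
Something along these lines is what I have in mind for the hook. Treat
it as a sketch under assumptions: tag_type_hook, remote_tag_type and the
second architecture are invented for illustration, and the values 2 and
3 are only an example of a non-overlapping assignment, not a proposal.

```c
#include <assert.h>
#include <stddef.h>

/* Generic tag types, as seen by generic code and the UI.  */
enum generic_tag_type { TAG_LOGICAL, TAG_ALLOCATION };

/* Hypothetical arch-specific hook: translate a generic tag type into
   the value sent over the remote protocol, or -1 if unsupported.  */
typedef int (*tag_type_hook) (enum generic_tag_type);

/* AArch64: 0 = MTE logical, 1 = MTE allocation.  */
static int
aarch64_tag_type (enum generic_tag_type type)
{
  switch (type)
    {
    case TAG_LOGICAL:
      return 0;
    case TAG_ALLOCATION:
      return 1;
    }
  return -1;
}

/* Invented second architecture.  The values are chosen only so that
   they do not overlap the AArch64 ones.  */
static int
other_arch_tag_type (enum generic_tag_type type)
{
  switch (type)
    {
    case TAG_LOGICAL:
      return 2;
    case TAG_ALLOCATION:
      return 3;
    }
  return -1;
}

/* Generic code asks the architecture for the protocol value instead of
   forwarding the generic enum value directly.  */
static int
remote_tag_type (tag_type_hook hook, enum generic_tag_type type)
{
  assert (hook != NULL);
  return hook (type);
}
```

Generic code keeps using logical/allocation, and each architecture
decides which unique value ends up in the packet.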

Would that address your concerns?

> 
> I see the need for the second form in gdb internally, but my concern
> is only with the protocol side.

As long as GDB sends a unique tag type identifier (non-overlapping 
values), we will be able to tell them apart from the remote side.

> 
>> I just want to keep that option open
>> if someone wants to do it, or if some other type of tag shows up that
>> would require such support.
> 
> I hadn't thought of that. I agree that type should include that too.
> 
> On Tue, 17 Nov 2020 at 12:01, Luis Machado <luis.machado@linaro.org> wrote:
>>
>> Hi,
>>
>> On 11/17/20 7:05 AM, David Spickett wrote:
>>>> Right now the design makes these types architecture-specific.
>>>
>>> This works too, in fact it matches the breakpoint types example better that way.
>>>
>>>> But there's one catch right now. The user-visible commands know about
>>>> two types of tags (logical and allocation). The native/remote side of
>>>> GDB only sees one type, the allocation one, as it doesn't make sense to
>>>> ask the native/remote target about logical tags.
>>>>
>>>> This is slightly messy and, in my opinion, should be an implementation
>>>> detail.
>>>
>>> Tell me if I have this right.
>>>
>>> In gdb overall you have these two types but the server only uses
>>> one of them, the allocation tag type.
>>> So only the allocation tag type will ever go over the protocol. (for
>>> MTE at least)
>>
>> That's correct. Only GDB knows about logical tags. Those don't make
>> their way to the remote via the remote protocol.
>>
>>>
>>> Given that, if we assume that "mte allocation" type is 1. A future
>>> AArch64 kind of memory tagging could allocate 2 and on for its tag
>>> types.
>>> Something like:
>>> AArch64 Memory Tag types -
>>>     0 : MTE logical (which is internal only, reserved but documented as
>>> unused, or left out completely?)
>>>     1 : MTE allocation (the one we use at present)
>>>     2: <future tagging> logical tag (because maybe there is some server
>>> component for this kind of tagging extension?)
>>>     3: <future tagging> allocation tag
>>>
>>> The reason I want to clarify is that I understood the type to
>>> differentiate tagging technologies, not the kind of tag within them.
>>> (the type tells you MTE vs <future tag type> instead of allocation vs logical)
>>> The use case being what if you have MTE and <future tag type> active
>>> in the same target and I want to set an MTE allocation tag,
>>> how can the server tell them apart?
>>
>> Right. The type is really telling you what specific kind of tag you are
>> requesting, not the technology. So it may be perfectly valid to request
>> a MTE logical tag from the remote target, but the remote target doesn't
>> know how to reply to that at the moment (nor does it make much sense, IMO).
>>
>> The tag types don't overlap at the moment, given they are an ENUM in
>> generic code. So the server will tell them apart by their values.
>>
>>>
>>> If the type numbers overlap between tagging technologies, we can't
>>> tell them apart.
>>> However if they encode what extension they are for and the
>>> logical/allocation type (as in the example above) then we can.
>>>
>>> A lot of that is probably academic given there's one relevant type but
>>> we can at least document the intent of the field.
>>> E.g. "these types are global to AArch64 so new types should not
>>> overlap existing ones"
>>
>> I suppose. I chose to have generic ENUMs without specific references to
>> MTE for that reason. Architectures can use logical/allocation tags as
>> they see fit, but the ENUM values will not overlap.
>>
>> We need to support the UI as well, so there needs to be some generic
>> definitions so commands can query different tag types.
>>
>>>
>>>> Otherwise we'd need to standardize on particular tag type names across
>>>> different architectures, like "hw memory tag", "sw memory tag",
>>>> "capability tag" etc.
>>>
>>> Well I was thinking of type more as a single value like "mte". Anyway
>>> I'm fine with the integer route.
>>
>> Though we don't have a use for requesting logical tags from the remote
>> targets, it is possible to support that. Passing "mte" or any other
>> technology name would close that option.
>>
>> If, for example, we decide to have a dumb GDB client and a smart
>> GDBserver (unlikely at this point), then it would make sense to pass
>> down logical tag requests I think. I just want to keep that option open
>> if someone wants to do it, or if some other type of tag shows up that
>> would require such support.
>>
>>>
>>> On Mon, 16 Nov 2020 at 17:23, Luis Machado <luis.machado@linaro.org> wrote:
>>>>
>>>>
>>>>
>>>> On 11/16/20 1:04 PM, David Spickett wrote:
>>>>> Also with regard to the "type" field.
>>>>>
>>>>>> +@var{type} is the type of tag the request wants to fetch.  The typeis a signed
>>>>>> +integer.
>>>>>
>>>>> (typo aside) Is this field architecture specific and will there be a
>>>>> list of these type numbers documented anywhere? (or already is)
>>>>> For example would 1 on AArch64 be MTE, and on <other arch> be <other
>>>>> tag type>. Or would that <other tag type> be 2.
>>>>>
>>>>> My assumption has been that it is the latter and that a value means a
>>>>> kind of tagging extension. So for example 1=MTE rather than
>>>>> 1= mte logical and 2 = mte allocation. Correct me if I am wrong there.
>>>>
>>>> Right now the design makes these types architecture-specific. It would
>>>> be nice to have more documentation about them, for sure.
>>>>
>>>> But there's one catch right now. The user-visible commands know about
>>>> two types of tags (logical and allocation). The native/remote side of
>>>> GDB only sees one type, the allocation one, as it doesn't make sense to
>>>> ask the native/remote target about logical tags.
>>>>
>>>> This is slightly messy and, in my opinion, should be an implementation
>>>> detail.
>>>>
>>>> So, in summary... We have a couple generic tag types GDB knows about:
>>>> logical and allocation.
>>>>
>>>> Those types get translated to an arch-specific or target-specific
>>>> type when they cross the native/remote target boundary.
>>>>
>>>> In theory we could have generic tag types 1 and 2 in generic code, but
>>>> tag type 2 gets translated to type 1 in a remote packet.
>>>>
>>>> Maybe we could improve this a little.
>>>>
>>>>>
>>>>> A page like:
>>>>> https://sourceware.org/gdb/current/onlinedocs/gdb/ARM-Breakpoint-Kinds.html#ARM-Breakpoint-Kinds
>>>>>
>>>>> Or just a short note, given that there's only one type right now.
>>>>
>>>> Yes, that would be nice to expand for the tag types.
>>>>
>>>>>
>>>>> Also, I may have suggested the type be a string at some point. However
>>>>> based on examples like the link above
>>>>> I don't see much advantage to it apart from making packet dumps easier
>>>>> to read. Just wanted to close the loop on that
>>>>> if I didn't before.
>>>>
>>>> I don't have a strong preference here. I'm just forwarding the tag type
>>>> from generic code.
>>>>
>>>> If we want to pass strings, we will need a gdbarch hook that maps a type
>>>> to a string in the remote target layer.
>>>>
>>>> Otherwise we'd need to standardize on particular tag type names across
>>>> different architectures, like "hw memory tag", "sw memory tag",
>>>> "capability tag" etc.
>>>>
>>>>>
>>>>>
>>>>>
>>>>> On Mon, 16 Nov 2020 at 15:44, David Spickett <david.spickett@linaro.org> wrote:
>>>>>>
>>>>>> Minor thing, there is a missing space here in "typeis".
>>>>>>
>>>>>>> +@var{type} is the type of tag the request wants to fetch.  The typeis a signed
>>>>>>> +integer.
>>>>>>
>>>>>> On Mon, 9 Nov 2020 at 17:08, Eli Zaretskii <eliz@gnu.org> wrote:
>>>>>>>
>>>>>>>> Date: Mon,  9 Nov 2020 14:04:18 -0300
>>>>>>>> From: Luis Machado via Gdb-patches <gdb-patches@sourceware.org>
>>>>>>>> Cc: david.spickett@linaro.org
>>>>>>>>
>>>>>>>> gdb/doc/ChangeLog:
>>>>>>>>
>>>>>>>> YYYY-MM-DD  Luis Machado  <luis.machado@linaro.org>
>>>>>>>>
>>>>>>>>          * gdb.texinfo (General Query Packets): Document qMemTags and
>>>>>>>>          QMemTags.  Document the "memory-tagging" feature.
>>>>>>>> ---
>>>>>>>>     gdb/doc/gdb.texinfo | 96 +++++++++++++++++++++++++++++++++++++++++++++
>>>>>>>>     1 file changed, 96 insertions(+)
>>>>>>>
>>>>>>> OK for this part, thanks.