[PATCH v3 21/24] Extend "x" and "print" commands to support memory tagging

Mon Jan 18 20:20:47 GMT 2021

On 1/18/21 2:56 PM, Simon Marchi wrote:
> 
> 
> On 2020-12-29 1:50 p.m., Luis Machado wrote:
>> The switch that controls this is the one that enables/disables memory tagging. It controls both the print command behavior and the "x" command behavior.
>>
>> Do you think we should have a switch for each individual command?
> 
> Hmm well I'm comparing it to the other "set print" options.  If you
> add an option (say, "memory-tag-violations") to the
> value_print_option_defs array, you get both
> 
>    set print memory-tag-violations on/off
> 
> and
> 
>    print -memory-tag-violations on/off
> 
> options for free.
> 
> And now that I think about it a bit more, I am not sure why the
> "set memory-tagging" option exists (at least as of today).  It controls
> whether you can use the "memory-tag" commands, but I don't really
> see why that's useful.  If you don't want to use these commands,
> you can just not use them; no need for them to be disabled.
> 
> Otherwise, the memory_tagging global is only used to control
> whether "print" and "x" should talk about memory tags.  With
> my suggestion above, print already has its -memory-tag-violations
> flag and "set print memory-tag-violations" option.  And the
> "x" command already has its /m option that you implemented.
> So I don't see a need for the "set memory-tagging" option.
> 
> If tracking memory tags required maintaining a state as the
> program runs, then it would make sense to have a
> "set memory-tagging on/off" option to enable/disable that
> feature.
> 
> And I think that printing memory tags and printing memory tag
> violations are two separate things, so we should be precise in
> the option names.  We might later want to implement a "print"
> option to print the logical tags every time a pointer value is
> printed (a bit like symbols are printed when the pointer points
> to a symbol).  That could then be controlled by
> 
>    set print memory-logical-tags
>    print -memory-logical-tags
> 
> By naming the new print option "memory-tag-violations" (rather
> than say "memory-tags"), we leave the door open to things like
> that.

Thinking about this, it makes sense. If a target does not implement the 
memory tagging commands, GDB will display an error saying so, so there is 
no need for a separate switch to disable them.

I can update v4 to follow this suggestion.

> 
> Random usage questions:
> 
> If I do
> 
>    (gdb) print myptr
> 
> and myptr has the wrong logical tag for the pointed memory area,
> it will print that there is a memory tag violation.  But will it
> also print it if myptr is dereferenced in an expression, like
> this?
> 
>    (gdb) print *myptr + 2
> 
> I guess not, as this is only checked when the final value is
> printed.  Is this something we might want in the future?

Right. GDB first evaluates the expression into a result (struct value 
*), then checks if the result is a pointer. If that is the case, then it 
proceeds to do the check.
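
To make the order of operations concrete, here is a minimal sketch of 
"evaluate first, then check only the final result". The names 
(value_kind, eval_result, reports_violation) are made up for 
illustration and are not GDB's internal types or functions.

```cpp
#include <cstdint>

// Hypothetical stand-ins for the result of evaluating an expression.
enum class value_kind { integer, pointer };

struct eval_result
{
  value_kind kind;
  std::uint8_t logical_tag;  // Only meaningful when kind == pointer.
};

// The check runs once, on the final value.  Non-pointer results are
// never checked, no matter which pointers were dereferenced on the way.
constexpr bool
reports_violation (eval_result final_value, std::uint8_t allocation_tag)
{
  return final_value.kind == value_kind::pointer
	 && final_value.logical_tag != allocation_tag;
}

// "print myptr": pointer result with logical tag 0x8, memory tagged 0x0.
static_assert (reports_violation ({value_kind::pointer, 0x8}, 0x0),
	       "mismatched pointer result is reported");

// "print *myptr + 2": integer result, so nothing is checked.
static_assert (!reports_violation ({value_kind::integer, 0x0}, 0x0),
	       "integer result is never checked");
```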

We don't check each individual element in an expression. We could do so 
in the future, of course. But, given the most common use case we're 
trying to cover, I don't think the extra complexity and verbosity would 
be worth it.

The most common use case would be a faulty program that is running into 
a SIGSEGV due to a tag violation. Then the developer can use GDB to 
pinpoint where that violation is coming from, why it is happening and 
what the right tag should be for that particular case.

Checking every individual element of an expression (tag-wise) would be a 
bit too much. And the default for this feature most likely would need to 
be "off" so we don't incur unnecessary overhead.

The answer to the question "is this particular pointer tagged?" is 
already a bit expensive to answer, given we need to go through the 
/proc/<pid>/smaps file and parse each memory map entry.
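
For a rough idea of the work involved, the lookup could be sketched as 
below. This is not the code from the patch. address_is_tagged is a 
hypothetical helper, and it assumes that the kernel marks tagged 
mappings with an "mt" entry on the VmFlags line of smaps and that the 
address passed in already has its tag bits stripped.

```cpp
#include <cstdint>
#include <cstdio>
#include <sstream>
#include <string>

// Return true if ADDR lies in a mapping whose VmFlags line lists "mt".
// SMAPS_TEXT is the full contents of a smaps file.
bool
address_is_tagged (const std::string &smaps_text, std::uint64_t addr)
{
  std::istringstream in (smaps_text);
  std::string line;
  bool in_range = false;

  while (std::getline (in, line))
    {
      unsigned long long start = 0, end = 0;

      // Mapping header lines start with "<start>-<end>" in hex.
      if (std::sscanf (line.c_str (), "%llx-%llx", &start, &end) == 2)
	{
	  in_range = addr >= start && addr < end;
	  continue;
	}

      // Only the VmFlags line of the mapping containing ADDR matters.
      if (in_range && line.compare (0, 8, "VmFlags:") == 0)
	{
	  std::istringstream flags (line.substr (8));
	  std::string flag;

	  while (flags >> flag)
	    if (flag == "mt")
	      return true;
	  return false;
	}
    }

  return false;
}
```

In practice the text would come from reading the inferior's smaps file, 
and walking it entry by entry for every pointer is where the cost comes 
from.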

> 
> What happens for arrays?  If the granule size is 16 bytes, but you
> have an array of 32 bytes, does the array have two different
> allocation tags?  But then the pointer (that points to the beginning
> of the array) can only contain a single logical tag.  What happens
> when you use the pointer to access the various areas of the array?

Compiler-wise, I'm pretty sure a contiguous array of 32 bytes would have 
the same tag across its 2 granules. Of course you could create an area 
with 2 granules containing different tags, and then try to access those 
as an array of 32 bytes through some pointer.

In GDB, "p array[0]" wouldn't cause warnings about tag mismatches 
because you're evaluating the expression and getting, say, an integer 
back. And integers don't hold logical tags.

If you issue "p array" instead, then GDB will attempt to validate the 
tag for you.

(gdb) ptype a
type = unsigned char *
(gdb) memory-tag print-allocation-tag &a[0]
$3 = 0x0
(gdb) memory-tag print-allocation-tag &a[16]
$4 = 0x0
(gdb) p a
Logical tag (0x8) does not match the allocation tag (0x0).
$1 = (unsigned char *) 0x800fffff7ffa000 "\001\002"
(gdb) p a[0]
$5 = 1 '\001'
(gdb) p a[16]
$6 = 0 '\000'
(gdb) p &a[16]
Logical tag (0x8) does not match the allocation tag (0x0).
$7 = (unsigned char *) 0x800fffff7ffa010 ""
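
The arithmetic behind the session above can be sketched as follows. 
This assumes an AArch64 MTE-style layout (a 4-bit logical tag in pointer 
bits 56-59 and 16-byte granules), which is consistent with the 0x8 tag 
shown above but is an assumption here. The helper names are 
hypothetical.

```cpp
#include <cstdint>

// Assumed layout: 16-byte granules, logical tag in pointer bits 56-59.
constexpr std::uint64_t granule_size = 16;
constexpr std::uint64_t address_mask = 0x00ffffffffffffffULL;

// Extract the logical tag carried in the pointer itself.
constexpr std::uint8_t
logical_tag (std::uint64_t ptr)
{
  return static_cast<std::uint8_t> ((ptr >> 56) & 0xf);
}

// Granule number of the untagged address PTR points to.
constexpr std::uint64_t
granule_of (std::uint64_t ptr)
{
  return (ptr & address_mask) / granule_size;
}

// A pointer is valid for a granule when its logical tag equals the
// allocation tag stored for that granule.
constexpr bool
tags_match (std::uint64_t ptr, std::uint8_t allocation_tag)
{
  return logical_tag (ptr) == allocation_tag;
}

// Values from the session: a is 0x800fffff7ffa000, &a[16] is a + 0x10.
static_assert (logical_tag (0x0800fffff7ffa000ULL) == 0x8,
	       "pointer carries logical tag 0x8");
static_assert (granule_of (0x0800fffff7ffa010ULL)
	       == granule_of (0x0800fffff7ffa000ULL) + 1,
	       "a[16] lives in the next granule");
static_assert (!tags_match (0x0800fffff7ffa000ULL, 0x0),
	       "0x8 does not match allocation tag 0x0");
```

Pointer arithmetic keeps the logical tag, so &a[16] still carries 0x8 
and is compared against the allocation tag of the second granule.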

> 
> Simon
>