This is the mail archive of the
gdb-patches@sourceware.org
mailing list for the GDB project.
Re: [PATCH] AArch64 pauth: Indicate unmasked addresses in backtrace
- From: Alan Hayward <Alan dot Hayward at arm dot com>
- To: Pedro Alves <palves at redhat dot com>
- Cc: Simon Marchi <simon dot marchi at polymtl dot ca>, "gdb-patches@sourceware.org" <gdb-patches at sourceware dot org>, nd <nd at arm dot com>
- Date: Wed, 17 Jul 2019 16:07:00 +0000
- Subject: Re: [PATCH] AArch64 pauth: Indicate unmasked addresses in backtrace
- References: <20190717081336.68835-1-alan.hayward@arm.com> <dfb2cbaf-f3a6-f0b3-2080-44bba21d77d3@redhat.com> <68E9D3EF-D6A5-44C9-A87C-916EC6970435@arm.com> <e5c0663d5eda768e30cea646c3b37726@polymtl.ca> <9c474f28-30f3-2428-d147-4474471a61ba@redhat.com>
> On 17 Jul 2019, at 16:18, Pedro Alves <palves@redhat.com> wrote:
>
> On 7/17/19 4:01 PM, Simon Marchi wrote:
>> On 2019-07-17 09:35, Alan Hayward wrote:
>>>> On 17 Jul 2019, at 12:15, Pedro Alves <palves@redhat.com> wrote:
>>>>
>>>> On 7/17/19 9:14 AM, Alan Hayward wrote:
>>>>> Armv8.3-a Pointer Authentication causes the function return address to be
>>>>> obfuscated on entry to some functions. GDB must unmask the link register in
>>>>> order to produce a backtrace.
>>>>>
>>>>> The following patch adds <unmasked> markers to the backtrace, to indicate
>>>>> which addresses needed unmasking.
>>>>>
>>>>> For example, consider the following backtrace:
>>>>>
>>>>> (gdb) bt
>>>>> 0 0x0000000000400490 in puts@plt ()
>>>>> 1 0x00000000004005dc in foo ("hello") at cbreak-lib.c:6
>>>>> 2 0x0000000000400604<unmasked> in bar () at cbreak-lib.c:12
>>>>> 3 0x0000000000400620<unmasked> in barbar () at cbreak.c:17
>>>>> 4 0x00000000004005b4 in main () at cbreak-3.c:10
>>>>>
>>>>> The functions in the cbreak-lib use pointer auth, obfuscating the return address
>>>>> to the previous function. This caused the addresses of bar and barbar to require
>>>>> unmasking in order to unwind the backtrace.
>>>>>
>>>>> Alternatively, I considered replacing <unmasked> with a single character, such
>>>>> as *, for brevity, but felt this would be non-obvious for the user.
>>>>
>>>> I don't have a particular suggestion, though my first reaction was that
>>>> it seemed a bit verbose.
>>>>
>>>> IMHO, the marker doesn't have to stand out and be expressive, since users can
>>>> always look at the manual.
>>>
>>> Reading the manual is an assumption I’m not sure is anywhere near the
>>> common case. Saying that, I agree we shouldn’t be designing the output for
>>> the non-readers.
>>>
>>> This comment has reminded me I need to add something to the manual as part
>>> of this patch.
>>>
>>>
>>>> Once they learn something, often being concise
>>>> helps -- or in other words, once you learn what "<unmasked>" or "U" or whatever
>>>> is, and you're used to it, what would you rather see? What's the main
>>>> information you're looking for when staring at the backtrace? Thoughts
>>>> like that should guide the output too, IMO.
>>>
>>> PAC is the official abbreviation for the feature, so maybe :PAC works best.
>>>
>>> (gdb) bt
>>> 0 0x0000000000400490 in puts@plt ()
>>> 1 0x00000000004005dc in foo ("hello") at cbreak-lib.c:6
>>> 2 0x0000000000400604:PAC in bar () at cbreak-lib.c:12
>>> 3 0x0000000000400620:PAC in barbar () at cbreak.c:17
>>> 4 0x00000000004005b4 in main () at cbreak-3.c:10
>>>
>>>
>>> Some of my attempts at different representations:
>>> 2 0x0000000000400604* in bar () at cbreak-lib.c:12
>>> 2 0x0000000000400604! in bar () at cbreak-lib.c:12
>>> 2 0x0000000000400604U in bar () at cbreak-lib.c:12
>>> 2 0x0000000000400604:U in bar () at cbreak-lib.c:12
>>> 2 0x0000000000400604<U> in bar () at cbreak-lib.c:12
>>> 2 0x0000000000400604[U] in bar () at cbreak-lib.c:12
>>> 2 0x0000000000400604<M> in bar () at cbreak-lib.c:12
>>> 2 0x0000000000400604<P> in bar () at cbreak-lib.c:12
>>> 2 0x0000000000400604<PAC> in bar () at cbreak-lib.c:12
>>> 2 0x0000000000400604PAC in bar () at cbreak-lib.c:12
>>> 2 0x0000000000400604:PAC in bar () at cbreak-lib.c:12
>>> 2 0x0000000000400604,PAC in bar () at cbreak-lib.c:12
>>>
>>> I found a single character was too hidden. A single character or symbol was also
>>> a little confusing - my brain read U as unsigned, * as pointer, [] as an array.
>>>
>>> I also like ,PAC as it might be easier to add future extensions.
>>
>> It might not be easily doable, but I think it would be nice if you could somehow make it so the function names stay aligned (regardless of which marker you end up choosing), like:
>>
>> 0 0x0000000000400490 in puts@plt ()
>> 1 0x00000000004005dc in foo ("hello") at cbreak-lib.c:6
>> 2 0x0000000000400604 [U] in bar () at cbreak-lib.c:12
>> 3 0x0000000000400620 [U] in barbar () at cbreak.c:17
>> 4 0x00000000004005b4 in main () at cbreak-3.c:10
>
> I almost suggested the same, but didn't when I realized that we
> don't always print the addresses:
>
> (top-gdb) bt
> #0 gdb_main (args=0x7fffffffd3a0) at src/gdb/main.c:1186
> #1 0x0000000000469a7e in main (argc=1, argv=0x7fffffffd4a8) at src/gdb/gdb.c:32
>

What’s the reason for that? Surely we always know the address of a function
in the backtrace? Can it happen in the middle of a backtrace?

> But if you do want to align the addresses, you could do that by
> specifying a width for the "addr" column. If "[U]" is rare, given no column
> headers, the spaces may look a bit odd, though.

In general, it depends on how a binary/library was compiled. But I’d expect a
binary to either have it in most functions or in none.
It should be easy enough to remove the extra spaces if the system doesn’t
support PAC.

> Maybe you'd want to pre-compute
> the max column width by looking at the max number of frames that fit on a
> page, or something along those lines.
>

hmmm... ok. I’ll see what I can do there.

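The alignment idea discussed above can be sketched roughly as follows. This is a standalone Python illustration, not GDB code: the Frame tuple, the format_backtrace name and the simplified "#N  ADDR MARKER in ..." layout are all assumptions made for the example. It pads the marker column to the widest marker present, collapses the column entirely when no frame carries a marker, and skips the address (and marker) for frames where no address is printed.

```python
from typing import List, NamedTuple, Optional


class Frame(NamedTuple):
    """Hypothetical, simplified stand-in for one backtrace frame."""
    level: int
    pc: Optional[int]  # None when the address is not printed for this frame
    marker: str        # e.g. "[PAC]", or "" if no unmasking was needed
    where: str         # e.g. 'bar () at cbreak-lib.c:12'


def format_backtrace(frames: List[Frame]) -> List[str]:
    # Pre-compute the marker column width from the frames being shown.
    # With no markers at all the width is 0 and no padding is emitted.
    width = max((len(f.marker) for f in frames), default=0)
    lines = []
    for f in frames:
        parts = [f"#{f.level}  "]
        if f.pc is not None:
            parts.append(f"0x{f.pc:016x} ")
            if width:
                # Left-justify so unmarked frames get blank padding.
                parts.append(f"{f.marker:<{width}} ")
            parts.append("in ")
        parts.append(f.where)
        lines.append("".join(parts))
    return lines


if __name__ == "__main__":
    example = [
        Frame(0, 0x400490, "", "puts@plt ()"),
        Frame(1, 0x4005DC, "", 'foo ("hello") at cbreak-lib.c:6'),
        Frame(2, 0x400604, "[PAC]", "bar () at cbreak-lib.c:12"),
        Frame(3, 0x400620, "[PAC]", "barbar () at cbreak.c:17"),
        Frame(4, 0x4005B4, "", "main () at cbreak-3.c:10"),
    ]
    print("\n".join(format_backtrace(example)))
```

Computing the width over only the frames that fit on a page, as suggested above, would amount to calling the function on one page of frames at a time.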
> On 17 Jul 2019, at 15:43, Pedro Alves <palves@redhat.com> wrote:
<SNIP>
>
> I'd go with either:
>
> 2 0x0000000000400604 (PAC) in bar () at cbreak-lib.c:12
> 2 0x0000000000400604 [PAC] in bar () at cbreak-lib.c:12
>
> Not having the space may make it a little bit harder
> to focus on low digits of the address.
>
>> my brain read U as unsigned, * as pointer, [] as an array.
>
> If you make it like 0x0000000000400604U, then I can see that.
>
> But not so much with:
>
> 2 0x0000000000400604 [U] in bar () at cbreak-lib.c:12
>
> You don't have to use a single letter, though:
>
> 2 0x0000000000400604 [UN] in bar () at cbreak-lib.c:12
>
> [] seems natural as a way to group some flags/properties to me.
>
> We already use it here for example:
>
> (top-gdb) info registers $eflags
> eflags 0x206 [ PF IF ]
>
>
> I guess I'm saying that it depends on context, and I wouldn't
> be worried with [] being confused with C arrays. After all,
> < and > also have meaning in C/C++... More than one meaning,
> actually. :-)
>

The extra space really does help there.

Given PAC really is an AArch64 thing (as opposed to something more
generic like Unmasked), it might be worth adding a
gdbarch_print_function_address () or something like that so that I
can override it in aarch64, assuming it fits with all the width
calculations.

Alan.
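
To make the per-architecture hook idea concrete, here is a rough standalone Python sketch. It is not GDB code and does not use any GDB API. The mask value, the Arch and AArch64 classes and the method signature are hypothetical, with only the print_function_address name borrowed from the suggestion above. It also assumes that unmasking means clearing the mask bits from the raw link register value, and that an address is marked whenever unmasking changed it.

```python
# Assumed mask value for illustration only; on a real system the mask
# is provided by the target rather than hard-coded.
HYPOTHETICAL_PAC_MASK = 0x007F000000000000


def unmask(lr, mask=HYPOTHETICAL_PAC_MASK):
    """Clear the mask bits from LR; return (address, was_masked)."""
    addr = lr & ~mask
    return addr, addr != lr


class Arch:
    """Generic architecture: prints the address with no marker."""

    def print_function_address(self, addr, was_masked):
        return f"0x{addr:016x}"


class AArch64(Arch):
    """Hypothetical AArch64 override that appends a PAC marker."""

    def print_function_address(self, addr, was_masked):
        text = super().print_function_address(addr, was_masked)
        return f"{text} [PAC]" if was_masked else text


if __name__ == "__main__":
    # A made-up signed return address with one mask bit set.
    addr, was_masked = unmask(0x0020000000400604)
    print(AArch64().print_function_address(addr, was_masked))
```

Keeping the marker inside the architecture-specific override leaves the generic backtrace output unchanged for every other target, though the width of the returned string would still have to be fed into any column alignment.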