[PATCH v2] Implement pahole-like 'ptype /o' option

Sergio Durigan Junior sergiodj@redhat.com
Tue Dec 12 00:25:00 GMT 2017

On Monday, December 11 2017, Pedro Alves wrote:

> On 12/11/2017 10:50 PM, Sergio Durigan Junior wrote:
>>>> Fair enough.  I use this trick for function prototypes, but not for the
>>>> definitions.
>>> Simon's format is what I've been using for a long while too.
>> Well, I could post a few examples of the format I chose, but I'm pretty
>> sure this would be worthless.  As I said, I will change my code.
> By stating the format that I've been using too, my intention
> is to show that Simon's format isn't his own unique snowflake
> either.

I honestly never thought he was the only one doing this.  I certainly
remember having seen this format choice in other places as well.

> If there are many examples of the format you chose, I've not seem
> many.  Maybe it depends on area of code one is touching, or IOW, on
> the preferences of who wrote specific areas of the codebase.

I remember seeing some examples of the format I chose.

> IMO, the right way to go about this is to decide on a format,
> document it in the wiki and then we can point everyone at it with a URL.
> (ISTR that GCC's coding conventions documents yet another
> way to break the line, and in that case, not even GCC follows it.)
> IMO, the format you've chosen isn't ideal because it requires
> manual right-alignment for every line, while breaking before the
> parens makes emacs's TAB automatically align the following lines.
> The latter also gives a lot more space for the params again; it's
> sort of a "well, I have to do something, so might as well start
> way back from the left again).

No, it's not ideal at all.  But at least on my Emacs, the other format
has the downside of being always realigned to the column zero when I
press TAB on it.  But again, I'm not saying this is Emacs fault nor your
nor Simon's fault; this is just something I noticed.

>>>>> But I don't mind it, it just stuck out as a little inconsistency.
>>>> I don't see the inconsistency.
>>>> If a field is inside a struct, it has its offset *and* size printed.  No
>>>> matter if the field is an int, another struct, or an union.
>>>> If a field is inside an union, it has only its size printed.
>>>> In the case above, it makes sense to have the offsets printed for the
>>>> fields inside the two structs (inside the union), because there might be
>>>> holes to report (well, one can argue that it doesn't matter if there are
>>>> holes or not in this case, because if the other struct is bigger then
>>>> the union size will stay the same).  However, it doesn't make sense to
>>>> print the offsets for the two structs themselves, because they are
>>>> members of the union.
>>>> I hope it makes more sense now.
>>> But why do we need the special case?  Does it help anything?
>>> So far, it seems it only added confusion.
>> What do you mean by "special case"?
> Special case:
>   "If a field is inside a union, it has only its size printed; otherwise
>    print the offset and size." 
> No special case:
>   "Always print the offset and size."

I don't like the expression "special case" because it diminishes the
difference that exist between structs and unions.  It is not like I went
out of my way to treat this difference and made the code complex; it is
also not like the output is extremely complex with it.

>> This is what pahole does, and as I've said a few times, the output of
>> 'ptype /o' has been based on pahole's output.  
> That's abundantly clear, but I don't see why we need to follow
> "pahole"'s output religiously...

Sure, but I don't see why we should deviate from the de facto convention
in this specific case.

>> I don't consider this a
>> special case; I consider it to be the natural thing to do, because
>> offsets don't make much sense in unions.
> Of course they do.  You can do 'offsetof(foo_union, foo_field)' just
> fine, for example.  Saying that the offsets happen to be the same
> is not the same as saying that the offsets don't exist.

I don't remember saying offsets don't exist in unions.  What I said is
that in this specific case they don't matter/make much sense to be

> I asked:
>>> Does it help anything?
>>> So far, it seems it only added confusion.
> I think a much better rationale for omitting the offsets
> would be:
> Not printing the offsets in the case of union members helps
> by increasing signal/noise ratio.

I think that's a matter of opinion.  I didn't think about reducing the
signal/noise ratio; I explicitly thought that printing union offsets was
not useful.

> Why?  
> With patch as is:
> /*    0      |    32 */    struct general_symbol_info {
> /*    0      |     8 */        const char *name;
> /*    8      |     8 */        union {
> /*                 8 */            LONGEST ivalue;
> /*                 8 */            const block *block;
> /*                 8 */            const gdb_byte *bytes;
> /*                 8 */            CORE_ADDR address;
> /*                 8 */            const common_block *common_block;
> /*                 8 */            symbol *chain;
>                                } /* total size:    8 bytes */ value;
> vs always-print-offset:
> /*    0      |    32 */    struct general_symbol_info {
> /*    0      |     8 */        const char *name;
> /*    8      |     8 */        union {
> /*    8            8 */            LONGEST ivalue;
> /*    8            8 */            const block *block;
> /*    8            8 */            const gdb_byte *bytes;
> /*    8            8 */            CORE_ADDR address;
> /*    8            8 */            const common_block *common_block;
> /*    8            8 */            symbol *chain;
>                                } /* total size:    8 bytes */ value;
> Note that it can be argued that the version that does _not_ print
> the offsets includes _more_ information, or less noise, because
> with that version it's much easier to not get distracted with
> the offsets of the union fields, which can do nothing about.
> So from that angle, I see value in not printing the offsets
> of union members.

Since it's still not clear whether the offsets should be printed or not
in this case, and I am not a global maintainer, I adjusted the code to
print them and will post the patch as a reply to the v4 e-mail.  This
way you can decide which version is best.


GPG key ID: 237A 54B1 0287 28BF 00EF  31F4 D0EB 7628 65FC 5E36
Please send encrypted e-mail if possible

More information about the Gdb-patches mailing list