[PATCH 04/13] libctf, ld: prohibit doing things with forwards prohibited in C

Nick Alcock nick.alcock@oracle.com
Mon Dec 28 02:04:30 GMT 2020


On 27 Dec 2020, Nick Alcock via Binutils uttered the following:

> On 18 Dec 2020, Nick Alcock via Binutils uttered the following:
>
>> returned when you try to get the size or alignment of forwards: we also
>> return it when you try to construct CTF dicts that do invalid things
>> with forwards, namely adding a forward member to a struct or union or
>> making an array of elements of forward type.
>
> ... aaand this one is firing on real code. I'll figure out why tomorrow.

So I couldn't stop thinking about this.  The problem is intrinsic to the
design of the deduplicator and will take a format rev to fix properly,
but we can do a good-enough fix that will get things working again.

The problem is this. Imagine you have these TUs:

A.c:
  struct A;
  struct B { struct A *a; };
  struct A { struct B b; long foo; long bar; struct B b2; };

B.c:
  struct A;
  struct B { struct A *a; };
  struct A { struct B b; int foo; struct B b2; };

Now struct A is ambiguously defined, so the deduplicator will emit a
forward for A in the shared CTF dict .ctf and emit two child dicts
for A.c and B.c with definitions of the real structs A in them:

Contents of CTF section .ctf:
  Types:
     1: struct B (size 0x8)
        [0x0] (ID 0x1) (kind 6) struct B (aligned at 0x8)
            [0x0] (ID 0x3) (kind 3) struct A * a (aligned at 0x8)
     2: struct A (size 0x6)
        [0x0] (ID 0x2) (kind 9) struct A (aligned at 0x6)
     3: struct A * (size 0x8) -> 2: struct A (size 0x6)
        [0x0] (ID 0x3) (kind 3) struct A * (aligned at 0x8)
     4: long int [0x0:0x40] (size 0x8)
        [0x0] (ID 0x4) (kind 1) long int:64 (aligned at 0x8, format 0x1, offset:bits 0x0:0x40)
     5: int [0x0:0x20] (size 0x4)
        [0x0] (ID 0x5) (kind 1) int:32 (aligned at 0x4, format 0x1, offset:bits 0x0:0x20)

CTF archive member: .../ambiguous-struct-A.c:
  Types:
     80000001: struct A (size 0x20)
        [0x0] (ID 0x80000001) (kind 6) struct A (aligned at 0x8)
            [0x0] (ID 0x1) (kind 6) struct B b (aligned at 0x8)
                [0x0] (ID 0x3) (kind 3) struct A * a (aligned at 0x8)
            [0x40] (ID 0x4) (kind 1) long int foo:64 (aligned at 0x8, format 0x1, offset:bits 0x0:0x40)
            [0x80] (ID 0x4) (kind 1) long int bar:64 (aligned at 0x8, format 0x1, offset:bits 0x0:0x40)
            [0xc0] (ID 0x1) (kind 6) struct B b2 (aligned at 0x8)
                [0xc0] (ID 0x3) (kind 3) struct A * a (aligned at 0x8)

CTF archive member: .../ambiguous-struct-B.c:
  Types:
     80000001: struct A (size 0x18)
        [0x0] (ID 0x80000001) (kind 6) struct A (aligned at 0x8)
            [0x0] (ID 0x1) (kind 6) struct B b (aligned at 0x8)
                [0x0] (ID 0x3) (kind 3) struct A * a (aligned at 0x8)
            [0x40] (ID 0x5) (kind 1) int foo:32 (aligned at 0x4, format 0x1, offset:bits 0x0:0x20)
            [0x80] (ID 0x1) (kind 6) struct B b2 (aligned at 0x8)
                [0x80] (ID 0x3) (kind 3) struct A * a (aligned at 0x8)

Note that the two structs A are different sizes, with corresponding
members at different offsets. But... it's perfectly possible for there
to be an array of one of those structs A, and that array is
unambiguously defined, so it's going to be promoted to the .ctf section:

     6: struct A [50] (size 0x12c)
        [0x0] (ID 0x6) (kind 4) struct A [50] (aligned at 0x6)

... and after the deduplicator's got through with it, that's an array
of, uh... a forward. Which has no size or alignment. Oh dear. And it's
not like it even *can* carry a size: the sizes of both struct A's are
different!
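
To make the size mismatch concrete, here is a small C sketch that puts
both definitions side by side. It assumes an LP64 target (8-byte
pointers and longs, 4-byte ints), which is what the dumps above appear
to come from. The names A_from_a_c and A_from_b_c and the four helper
functions are made up for illustration, since the two real definitions
cannot coexist in one translation unit.

```c
#include <limits.h>
#include <stddef.h>

struct A;                               /* the forward both TUs start with */
struct B { struct A *a; };

/* Hypothetical names: the two conflicting definitions of struct A,
   renamed so that a single translation unit can hold both.  */
struct A_from_a_c { struct B b; long foo; long bar; struct B b2; };
struct A_from_b_c { struct B b; int foo; struct B b2; };

/* Sizes in bytes, as reported in the "size" fields of the dumps.  */
size_t size_in_a_c (void) { return sizeof (struct A_from_a_c); }
size_t size_in_b_c (void) { return sizeof (struct A_from_b_c); }

/* Offsets of member b2 in bits, as reported in the [0x...] columns.  */
size_t b2_bit_offset_in_a_c (void)
{
  return offsetof (struct A_from_a_c, b2) * CHAR_BIT;
}

size_t b2_bit_offset_in_b_c (void)
{
  return offsetof (struct A_from_b_c, b2) * CHAR_BIT;
}
```

Under that LP64 assumption the sizes are 0x20 and 0x18 bytes and b2 sits
at bit offsets 0xc0 and 0x80, matching the two archive members above, so
no single size could describe an element of struct A [50].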

Now we can't stop the deduplicator doing this promotion-to-forward
stuff: it's the way it breaks cycles. In format v4 we should at least
make it clearer to users what is going on by having the deduplicator
emit a new type kind for these forwards, an 'ambiguous type',
CTF_K_AMBIGUOUS. It would be identical to a forward, but the different
type kind would let users tell that this is actually a stand-in for an
ambiguous structure or union type which can be found in two or more
per-CU dicts. (We can also add a function to the libctf API to return
all the relevant types from the various dicts easily.)

Stopping the crashes for now is a simple matter of two changes. First,
allow the addition of incomplete array types again. (You can't get their
sizes and/or alignment, and you'll get an ECTF_INCOMPLETE if you try,
ugh: ugly, but it doesn't cause actual misbehaviour.) Second, adjust
ctf_add_member_offset so that it does not complain if ctf_type_size or
ctf_type_align yields ECTF_INCOMPLETE, but *does* complain if the next
member you try to add to that struct does not have an explicit offset
specified (which would require libctf to get the size and/or alignment
of the incomplete member). The deduplicator always specifies explicit
offsets, so the link-time crashes should go away.
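
The member-offset rule described above can be sketched as a toy model.
This is not libctf code and not the eventual patch: every name here
(toy_type, toy_struct, toy_add_member, TOY_AUTO_OFFSET, TOY_EINCOMPLETE)
is hypothetical, and sizes are tracked by hand rather than looked up in
a dict. It also assumes, beyond what is said above, that an incomplete
member cannot itself be placed implicitly, because its alignment is
unknown.

```c
#include <assert.h>

/* Toy model only: none of these names are libctf's.  */
#define TOY_OK 0
#define TOY_EINCOMPLETE (-1)            /* stands in for ECTF_INCOMPLETE */
#define TOY_AUTO_OFFSET ((unsigned long) -1)    /* "no explicit offset" */

struct toy_type
{
  unsigned long size_bits;
  unsigned long align_bits;             /* nonzero for complete types */
  int incomplete;                       /* 1 for a forward */
};

struct toy_struct
{
  unsigned long end_bits;               /* end of the last member, or its
                                           start if it was incomplete */
  int prev_incomplete;                  /* was the last member incomplete? */
};

int
toy_add_member (struct toy_struct *s, const struct toy_type *t,
                unsigned long bit_offset)
{
  assert (t->incomplete || t->align_bits != 0);

  if (bit_offset == TOY_AUTO_OFFSET)
    {
      /* Choosing an offset needs the previous member's size and
         (assumption of this sketch) the new member's alignment.  */
      if (s->prev_incomplete || t->incomplete)
        return TOY_EINCOMPLETE;

      /* Round the current end up to the new member's alignment.  */
      bit_offset = (s->end_bits + t->align_bits - 1)
                   / t->align_bits * t->align_bits;
    }

  /* Explicit offsets are always accepted, even for incomplete members.  */
  s->prev_incomplete = t->incomplete;
  s->end_bits = t->incomplete ? bit_offset : bit_offset + t->size_bits;
  return TOY_OK;
}
```

In this model, adding a forward-typed member at an explicit offset
succeeds, a following member with TOY_AUTO_OFFSET is refused, and the
same member with an explicit offset is accepted, which is the only path
the deduplicator would exercise.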

I'll try that in the next day or two, and add a test for this stuff.
(Holiday? I scorn holiday! also a release *is* impending and I'd like
this to be working well enough to get in.)

