Bug 28584 - Clang doesn't emit DW_AT_external for some global variables so they are dropped by libabigail
Status: RESOLVED FIXED
Alias: None
Product: libabigail
Classification: Unclassified
Component: default
Version: unspecified
Importance: P2 normal
Target Milestone: ---
Assignee: Dodji Seketeli
 
Reported: 2021-11-11 12:15 UTC by gprocida
Modified: 2021-11-16 15:30 UTC

Last reconfirmed: 2021-11-15 00:00:00


Attachments
test case (1.73 KB, application/x-compressed-tar)
2021-11-11 12:15 UTC, gprocida

Description gprocida 2021-11-11 12:15:03 UTC
Created attachment 13776
test case

Consider these two code fragments.

namespace N {
struct S { static int D; };
}
int N::S::D = 17;


namespace N {
struct S { static int D; };
int S::D = 17;
}

Looking at the DWARF, in all cases the variable declaration linked to the symbol is a separate entity (with just a linkage name) from the member within the struct.

While GCC produces essentially identical DWARF for both fragments above, Clang does something different for the second case. For GCC, this entity exists outside the namespace scope. For Clang, it exists in the scope where the definition appears in the source file (global scope and namespace N scope for the two examples, respectively).

abidw doesn't appear to handle Clang's inner-scoped DW_TAG_variables. I express no opinion as to which is wrong. :-)

I've attached a full test case.
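
For reference, here is a minimal sketch of the two definition styles side by side. The namespaces N1 and N2 are hypothetical renames of N, used only so that both fragments fit in one translation unit; the attached test case keeps them separate. At the language level both spellings define an ordinary static data member with external linkage, so any difference in the DWARF comes from where the compiler places the defining DIE.

```cpp
#include <cassert>

// Style of the first fragment: the definition appears at global scope
// and uses a fully qualified name.
namespace N1 {
struct S { static int D; };
}
int N1::S::D = 17;

// Style of the second fragment: the definition appears inside the
// enclosing namespace.
namespace N2 {
struct S { static int D; };
int S::D = 17;
}

// Helper for illustration only: reads both members.
inline int sum_of_members() { return N1::S::D + N2::S::D; }
```

Compiling each style with -g under GCC and Clang and dumping .debug_info should show the difference described above.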
Comment 1 gprocida 2021-11-12 13:09:02 UTC
The XML is much improved if --load-all-types is passed. There is just an extra is-non-reachable='yes' attribute on the element describing struct S.

It looks like the called_from_public_decl logic doesn't account for the possibility of arranging the DIEs the way Clang has done.
Comment 2 gprocida 2021-11-12 14:05:55 UTC
On realistic inputs, --load-all-types only results in symbol types for about 1/6 of the untyped symbols that actually appear in the DWARF as a DW_AT_linkage_name attribute.
Comment 3 Dodji Seketeli 2021-11-15 16:42:58 UTC
(In reply to gprocida from comment #0)
> Created attachment 13776 [details]
> test case
> 
> Consider these two code fragments.
> 
> namespace N {
> struct S { static int D; };
> }
> int N::S::D = 17;
> 
> 
> namespace N {
> struct S { static int D; };
> int S::D = 17;
> }
> 
> 
> Looking at the DWARF, in all cases the variable declaration linked to the
> symbol is a separate entity (with just a linkage name) from the member
> within the struct.
> 
> While GCC produces essentially identical DWARF for both fragments above,
> Clang does something different for the second case. For GCC, this entity
> exists outside the namespace scope. For Clang it exists in the scope it
> appeared in the file (global and namespace N scope, for the two examples,
> respectively).
> 
> abidw doesn't appear to handle Clang's inner-scoped DW_TAG_variables. I
> express no opinion as to which is wrong. :-)

My understanding is that the issue arises because the DW_TAG_variable DIE lacks a DW_AT_external attribute.  So libabigail drops it on the floor, as it considers it non-exported.
 
> I've attached a full test case.

Thank you for that.

I have applied a patch that should handle this, hopefully.

It's at https://sourceware.org/git/?p=libabigail.git;a=commit;h=5ac010cc9ba1e2ef0d109617763398ae119d394b.
Comment 4 gprocida 2021-11-15 21:05:13 UTC
Hi.

I don't think DW_AT_external is the real cause.

For the second example above, here is what GCC 10 and Clang 11 give me as debug_info. Each has exactly one DW_AT_external attribute.

GCC

COMPILE_UNIT<header overall offset = 0x00000000>:
< 0><0x0000000b>  DW_TAG_compile_unit
                    DW_AT_producer              GNU C++14 10.3.0 -mtune=generic -march=x86-64 -g -fasynchronous-unwind-tables
                    DW_AT_language              DW_LANG_C_plus_plus
                    DW_AT_name                  smv.cc
                    DW_AT_comp_dir              /usr/local/google/home/gprocida/dev/stg/static_member_variable
                    DW_AT_stmt_list             0x00000000

LOCAL_SYMBOLS:
< 1><0x0000001d>    DW_TAG_namespace
                      DW_AT_name                  N
                      DW_AT_decl_file             0x00000001 /usr/local/google/home/gprocida/dev/stg/static_member_variable/smv.cc
                      DW_AT_decl_line             0x00000001
                      DW_AT_decl_column           0x0000000b
                      DW_AT_sibling               <0x0000003a>
< 2><0x00000027>      DW_TAG_structure_type
                        DW_AT_name                  S
                        DW_AT_byte_size             0x00000001
                        DW_AT_decl_file             0x00000001 /usr/local/google/home/gprocida/dev/stg/static_member_variable/smv.cc
                        DW_AT_decl_line             0x00000002
                        DW_AT_decl_column           0x00000008
< 3><0x0000002e>        DW_TAG_member
                          DW_AT_name                  D
                          DW_AT_decl_file             0x00000001 /usr/local/google/home/gprocida/dev/stg/static_member_variable/smv.cc
                          DW_AT_decl_line             0x00000002
                          DW_AT_decl_column           0x00000017
                          DW_AT_type                  <0x0000003a>
                          DW_AT_external              yes(1)
                          DW_AT_declaration           yes(1)
< 1><0x0000003a>    DW_TAG_base_type
                      DW_AT_byte_size             0x00000004
                      DW_AT_encoding              DW_ATE_signed
                      DW_AT_name                  int
< 1><0x00000041>    DW_TAG_variable
                      DW_AT_specification         <0x0000002e>
                      DW_AT_decl_line             0x00000003
                      DW_AT_decl_column           0x00000005
                      DW_AT_linkage_name          _ZN1N1S1DE
                      DW_AT_location              len 0x0009: 0x030000000000000000: 
                          DW_OP_addr 0x00000000

Clang

COMPILE_UNIT<header overall offset = 0x00000000>:
< 0><0x0000000b>  DW_TAG_compile_unit
                    DW_AT_producer              Debian clang version 11.1.0-4+build1
                    DW_AT_language              DW_LANG_C_plus_plus_14
                    DW_AT_name                  smv.cc
                    DW_AT_stmt_list             0x00000000
                    DW_AT_comp_dir              /usr/local/google/home/gprocida/dev/stg/static_member_variable

LOCAL_SYMBOLS:
< 1><0x0000001e>    DW_TAG_namespace
                      DW_AT_name                  N
< 2><0x00000023>      DW_TAG_variable
                        DW_AT_specification         <0x0000003f>
                        DW_AT_location              len 0x0009: 0x030000000000000000: 
                            DW_OP_addr 0x00000000
                        DW_AT_linkage_name          _ZN1N1S1DE
< 2><0x00000036>      DW_TAG_structure_type
                        DW_AT_calling_convention    DW_CC_pass_by_value
                        DW_AT_name                  S
                        DW_AT_byte_size             0x00000001
                        DW_AT_decl_file             0x00000001 /usr/local/google/home/gprocida/dev/stg/static_member_variable/smv.cc
                        DW_AT_decl_line             0x00000002
< 3><0x0000003f>        DW_TAG_member
                          DW_AT_name                  D
                          DW_AT_type                  <0x0000004c>
                          DW_AT_decl_file             0x00000001 /usr/local/google/home/gprocida/dev/stg/static_member_variable/smv.cc
                          DW_AT_decl_line             0x00000002
                          DW_AT_external              yes(1)
                          DW_AT_declaration           yes(1)
< 1><0x0000004c>    DW_TAG_base_type
                      DW_AT_name                  int
                      DW_AT_encoding              DW_ATE_signed
                      DW_AT_byte_size             0x00000004
Comment 5 Dodji Seketeli 2021-11-16 07:34:42 UTC
Hello,

"gprocida at google dot com" <sourceware-bugzilla@sourceware.org>
writes:

> https://sourceware.org/bugzilla/show_bug.cgi?id=28584
>
> gprocida at google dot com changed:
>
>            What    |Removed                     |Added
> ----------------------------------------------------------------------------
>          Resolution|FIXED                       |---
>              Status|RESOLVED                    |REOPENED
>
> --- Comment #4 from gprocida at google dot com ---
> Hi.
>
> I don't think DW_AT_external is the real cause.
>
> For the second example below. Here are what GCC 10 and Clang 11 both give me as
> debug_info. They have exactly one DW_AT_external attribute.


I'll use a DWARF dump emitted by elfutils, which is what
libabigail uses, so I am sure it shows me what libabigail sees.  I don't
know what DWARF dumper you have used.

When I run eu-readelf --debug-dump=info on the smv.o binary you
attached, here is what I am getting:

DWARF section [ 5] '.debug_info' at offset 0x91:
 [Offset]
 Compilation unit at offset 0:
 Version: 4, Abbreviation section offset: 0, Address size: 8, Offset size: 4
 [     b]  compile_unit         abbrev: 1
           producer             (strp) "Debian clang version 11.1.0-4+build1"
           language             (data2) C_plus_plus_14 (33)
           name                 (strp) "smv.cc"
           stmt_list            (sec_offset) 0
           comp_dir             (strp) "/usr/local/google/home/gprocida/dev/stg/static_member_variable"
 [    1e]    namespace            abbrev: 2
             name                 (strp) "N"
 [    23]      variable             abbrev: 3
               specification        (ref4) [    3f]
               location             (exprloc) 
                [ 0] addr .data+0 <_ZN1N1S1DE>
               linkage_name         (strp) "_ZN1N1S1DE"

[...]

So, this is the variable DIE (the one at offset 0x23) that matters.
It's the one that carries a reference to the variable ELF symbol whose
address is ".data+0 <_ZN1N1S1DE>".  Notice how it doesn't carry any
DW_AT_external attribute.  From its DW_AT_specification attribute,
however, we see that it's a concrete "instance" of the member variable
DIE at 0x3f, inside the structure_type DIE at 0x36 below:


 [    36]      structure_type       abbrev: 4
               calling_convention   (data1) pass_by_value (5)
               name                 (strp) "S"
               byte_size            (data1) 1
               decl_file            (data1) smv.cc (1)
               decl_line            (data1) 2
 [    3f]        member               abbrev: 5
                 name                 (strp) "D"
                 type                 (ref4) [    4c]
                 decl_file            (data1) smv.cc (1)
                 decl_line            (data1) 2
                 external             (flag_present) yes
                 declaration          (flag_present) yes

Here ^^^^.  This one carries a DW_AT_external attribute.

GCC (version 11.2.1) however emits this:

[...]

At offset 0x46, we have this DIE:

 [    46]    variable             abbrev: 6
             specification        (ref4) [    2f]
             decl_line            (data1) 3
             decl_column          (data1) 5
             location             (exprloc) 
              [ 0] addr .data+0 <_ZN1N1S1DE>

So, this is the variable which carries the reference to the ELF symbol.
It's the one we care about.  Notice how it carries a DW_AT_specification
property.

It's a concrete instance of the specification DIE at offset 0x2f, which
we can see below:

DWARF section [ 4] '.debug_info' at offset 0x44:
 [Offset]
 Compilation unit at offset 0:
 Version: 5, Abbreviation section offset: 0, Address size: 8, Offset size: 4
 Unit type: compile (1)
 [     c]  compile_unit         abbrev: 1
           producer             (strp) "GNU C++17 11.2.1 20210728 (Red Hat 11.2.1-1) -mtune=generic -march=x86-64 -g"
           language             (data1) C_plus_plus_14 (33)
           name                 (line_strp) "../smv.cc"
           comp_dir             (line_strp) "/home/dodji/git/libabigail/PR28584/prtests/static_member_variable/gcc-binary"
           stmt_list            (sec_offset) 0
 [    1e]    namespace            abbrev: 2
             name                 (string) "N"
             decl_file            (data1) smv.cc (1)
             decl_line            (data1) 1
             decl_column          (data1) 11
             sibling              (ref4) [    3f]
 [    28]      structure_type       abbrev: 3
               name                 (string) "S"
               byte_size            (data1) 1
               decl_file            (data1) smv.cc (1)
               decl_line            (data1) 2
               decl_column          (data1) 8
 [    2f]        variable             abbrev: 4
                 name                 (string) "D"
                 decl_file            (data1) smv.cc (1)
                 decl_line            (data1) 2
                 decl_column          (data1) 23
                 linkage_name         (strp) "_ZN1N1S1DE"
                 type                 (ref4) [    3f]
                 external             (flag_present) yes
                 declaration          (flag_present) yes

Here ^^^^^.

Stricto sensu, I don't think Clang's DWARF is wrong.  Libabigail however
was relying on the DW_AT_external to be present on the concrete instance
it sees, even though the concrete instance references an ELF symbol that
is publicly exported.  I think there is enough information in there for
libabigail to make the right choice, which I hope it does now.
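
To illustrate the difference, here is a toy sketch, not libabigail's actual code. ToyDie and both helper functions are hypothetical names made up for this example. It models a strict check that only looks for DW_AT_external on the concrete instance, and a relaxed one that also follows DW_AT_specification. The real patch may well use a different criterion, such as the exported ELF symbol mentioned above.

```cpp
#include <cassert>

// Toy stand-in for a DWARF DIE; this is NOT an elfutils or libabigail type.
struct ToyDie {
  bool has_external = false;              // DW_AT_external present on this DIE
  const ToyDie* specification = nullptr;  // DW_AT_specification target, if any
};

// Strict check: only the DIE itself is inspected. With Clang's output for
// the second fragment, this rejects the concrete instance at 0x23.
inline bool is_external_strict(const ToyDie& die) {
  return die.has_external;
}

// Relaxed check: the DIE and every DIE reachable through the
// DW_AT_specification chain are inspected.
inline bool is_external_via_specification(const ToyDie& die) {
  for (const ToyDie* d = &die; d != nullptr; d = d->specification) {
    if (d->has_external) return true;
  }
  return false;
}
```

With a member DIE that has the flag set and a definition DIE that only points at it, the strict check returns false and the relaxed check returns true.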

Besides, does this patch solve your problem in your environment or not?
(Just curious.)

I hope this helps.
Comment 6 gprocida 2021-11-16 10:57:12 UTC
FTR, I was using Debian's dwarfdump from dwarfutils.

Architecture: amd64
Source: dwarfutils
Version: 20210528-1
Depends: libc6 (>= 2.14), libdwarf1 (>= 20210528), libelf1 (>= 0.131)
Description: utility to dump DWARF debug information from ELF objects
 Dwarfdump is an application that can print the DWARF debugging
 information of an ELF object file in a human-readable form. It can
 also be used to check and validate manipulated DWARF sections.
 .
 This utility is part of dwarfutils.
Homepage: https://www.prevanders.net/dwarf.html

I'm not that familiar with DWARF, but I was reading both the GCC and Clang DWARF as having two linked DIEs, one of which referred (via DW_AT_specification) to the other (containing DW_AT_external).

The real difference is just the scope at which the one DIE appeared.

As regards whether the change improves the library ABIs:

1594 extra symbols are now typed (yay!)

27124 symbols remain untyped, 21144 of which don't appear in the DWARF (as a DW_AT_linkage_name) at all. That could be Clang or some other build issue.

This leaves 5980 that are present in the debug info (as a DW_AT_linkage_name) but do not appear in the ABI. Could be a Clang bug or a Clang/libabigail disagreement.

I'll continue to look into these 5980 and see if I can come up with small test cases to report.
Comment 7 Dodji Seketeli 2021-11-16 14:08:47 UTC
gprocida at google dot com via Libabigail <libabigail@sourceware.org>
writes:

> https://sourceware.org/bugzilla/show_bug.cgi?id=28584
>
> --- Comment #6 from gprocida at google dot com ---
> FTR, I was using Debian's dwarfdump from dwarfutils.
>
> Architecture: amd64
> Source: dwarfutils
> Version: 20210528-1
> Depends: libc6 (>= 2.14), libdwarf1 (>= 20210528), libelf1 (>= 0.131)
> Description: utility to dump DWARF debug information from ELF objects
>  Dwarfdump is an application that can print the DWARF debugging
>  information of an ELF object file in a human-readable form. It can
>  also be used to check and validate manipulated DWARF sections.
>  .
>  This utility is part of dwarfutils.
> Homepage: https://www.prevanders.net/dwarf.html

Thanks.

> I not that familiar with DWARF, but I was reading both the GCC and Clang DWARF
> as having two linked DIEs, one of which referred (via specification) to the
> other (containing external).
>
> The real difference is just the scope at which the one DIE appeared.

Even when you look at the behaviour of the DWARF reader in the debugger,
that difference of scope of the concrete-instance DIE doesn't matter.
To determine the real scope of the variable represented by the DIE, what
matters is the scope of the specification DIE of that concrete
instance.  And that is what libabigail takes into account.
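
As a rough illustration, and not the actual libabigail or debugger code, the lookup can be sketched like this. ScopedDie and qualified_name are hypothetical names for this example. The scope is computed from the parents of the specification DIE, so it comes out the same wherever the concrete instance sits.

```cpp
#include <cassert>
#include <string>

// Toy stand-in for a DWARF DIE; this is NOT an elfutils or libabigail type.
struct ScopedDie {
  std::string name;                          // DW_AT_name, empty if absent
  const ScopedDie* parent = nullptr;         // enclosing DIE; nullptr directly under the CU
  const ScopedDie* specification = nullptr;  // DW_AT_specification target, if any
};

// Builds a qualified name from the scope of the specification DIE,
// ignoring the scope in which the concrete instance itself appears.
inline std::string qualified_name(const ScopedDie& die) {
  const ScopedDie* d = &die;
  while (d->specification != nullptr) d = d->specification;
  std::string result = d->name;
  for (const ScopedDie* p = d->parent; p != nullptr; p = p->parent) {
    if (!p->name.empty()) result = p->name + "::" + result;
  }
  return result;
}
```

A concrete instance placed under namespace N (as Clang does) and one placed directly under the compile unit (as GCC does) both resolve to N::S::D here.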

> As regards whether the change improves the library ABIs:
>
> 1594 extra symbols are now typed (yay!)

That's progress, I guess.

> 27124 symbols remain untyped, 21144 of which don't appear in the DWARF (as a
> linkname) at all - Clang or some other build issue
>
> This leaves 5980 where they are there in the debug info (as a linkname) but not
> appearing in the ABI. Could be a Clang bug or Clang/libabigail
> disagreement.

> I'll continue to look into these 5980 and see if I can come up with small test
> cases to report.

OK.  So maybe we can close this bug then and open a new meta one with
a broader scope?

Thanks!
Comment 8 gprocida 2021-11-16 15:30:44 UTC
Yes. I think this one can be closed. I'll open another if I have anything concrete.