Bug 26684 - abidiff says that some bit field elements missing in version of binary with dwarf5 debuginfo.
Summary: abidiff says that some bit field elements missing in version of binary with d...
Status: ASSIGNED
Alias: None
Product: libabigail
Classification: Unclassified
Component: default (show other bugs)
Version: unspecified
: P2 normal
Target Milestone: ---
Assignee: dodji
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2020-09-30 18:18 UTC by William Cohen
Modified: 2020-10-23 11:29 UTC (History)
3 users (show)

See Also:
Host:
Target:
Build:
Last reconfirmed:


Attachments
the -g -O2 version of the binary (437.85 KB, application/x-troff-man)
2020-09-30 18:18 UTC, William Cohen
Details
the -g -O2 debuginfo (232.34 KB, application/x-sharedlib)
2020-09-30 18:19 UTC, William Cohen
Details
the -gdwarf5 -O2 version of the binary (437.84 KB, application/x-troff-man)
2020-09-30 18:20 UTC, William Cohen
Details
the -gdwarf5 -O2 debuginfo (321.16 KB, application/x-sharedlib)
2020-09-30 18:21 UTC, William Cohen
Details
Files to reproduce the problem (1.34 MB, application/gzip)
2020-10-01 17:00 UTC, William Cohen
Details
subrange_type bounds signedness is given by underlying type (1.78 KB, patch)
2020-10-02 14:42 UTC, Mark Wielaard
Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description William Cohen 2020-09-30 18:18:59 UTC
Created attachment 12872 [details]
the -g -O2 version of the binary

I was comparing builds of the libpfm and papi binaries built with different compilers options.  The expectation is that the ABI should be the same for them as they are the same binary.  However, I noticed that some of the bit fields were reports as missing from the binaries compiled with dwarf-5 as the debuginfo format.  The libpfm also mentions some struct fields as static.

Functions changes summary: 0 Removed, 4 Changed, 0 Added functions
Variables changes summary: 0 Removed, 0 Changed, 0 Added variable

4 functions with some indirect sub-type change:

  [C] 'function int pfm_get_event_attr_info(int, int, pfm_os_t, pfm_event_attr_info_t*)' at pfmlib_common.c:2037:1 has some indirect sub-type changes:
    parameter 4 of type 'pfm_event_attr_info_t*' has sub-type changes:
      in pointed to type 'typedef pfm_event_attr_info_t' at pfmlib.h:715:1:
        underlying type 'struct {const char* name; const char* desc; const char* equiv; size_t size; uint64_t code; pfm_attr_t type; int idx; pfm_attr_ctrl_t ctrl; struct {unsigned int is_dfl; unsigned int is_precise; unsigned int is_speculative; unsigned int support_hw_smpl; unsigned int reserved_bits;}; union {uint64_t dfl_val64; const char* dfl_str; int dfl_bool; int dfl_int;};}' at pfmlib.h:693:1 changed:
          type size hasn't changed
          1 data member change:
            anonymous data member at offset 416 (in bits) changed from:
              struct {unsigned int is_dfl; unsigned int is_precise; unsigned int is_speculative; unsigned int support_hw_smpl; unsigned int reserved_bits;}
            to:
              struct {static unsigned int is_dfl; static unsigned int is_precise; static unsigned int is_speculative; static unsigned int support_hw_smpl; static unsigned int reserved_bits;}

  [C] 'function int pfm_get_event_info(int, pfm_os_t, pfm_event_info_t*)' at pfmlib_common.c:1979:1 has some indirect sub-type changes:
    parameter 3 of type 'pfm_event_info_t*' has sub-type changes:
      in pointed to type 'typedef pfm_event_info_t' at pfmlib.h:691:1:
        underlying type 'struct {const char* name; const char* desc; const char* equiv; size_t size; uint64_t code; pfm_pmu_t pmu; pfm_dtype_t dtype; int idx; int nattrs; int reserved; struct {unsigned int is_precise; unsigned int is_speculative; unsigned int support_hw_smpl; unsigned int reserved_bits;};}' at pfmlib.h:674:1 changed:
          type size hasn't changed
          1 data member change:
            anonymous data member at offset 480 (in bits) changed from:
              struct {unsigned int is_precise; unsigned int is_speculative; unsigned int support_hw_smpl; unsigned int reserved_bits;}
            to:
              struct {static unsigned int is_precise; static unsigned int is_speculative; static unsigned int support_hw_smpl; static unsigned int reserved_bits;}

  [C] 'function int pfm_get_perf_event_encoding(const char*, int, perf_event_attr*, char**, int*)' at pfmlib_perf_event.c:395:1 has some indirect sub-type changes:
    parameter 3 of type 'perf_event_attr*' has sub-type changes:
      in pointed to type 'struct perf_event_attr' at perf_event.h:250:1:
        type size hasn't changed
        29 data member deletions:
          'uint64_t perf_event_attr::namespaces', at offset 35 (in bits) at perf_event.h:290:1
          'uint64_t perf_event_attr::write_backward', at offset 36 (in bits) at perf_event.h:289:1
          'uint64_t perf_event_attr::context_switch', at offset 37 (in bits) at perf_event.h:288:1
          'uint64_t perf_event_attr::use_clockid', at offset 38 (in bits) at perf_event.h:287:1
          'uint64_t perf_event_attr::comm_exec', at offset 39 (in bits) at perf_event.h:286:1
          'uint64_t perf_event_attr::mmap2', at offset 40 (in bits) at perf_event.h:285:1
          'uint64_t perf_event_attr::exclude_callchain_user', at offset 41 (in bits) at perf_event.h:284:1
          'uint64_t perf_event_attr::exclude_callchain_kernel', at offset 42 (in bits) at perf_event.h:283:1
          'uint64_t perf_event_attr::exclude_guest', at offset 43 (in bits) at perf_event.h:282:1
          'uint64_t perf_event_attr::exclude_host', at offset 44 (in bits) at perf_event.h:281:1
          'uint64_t perf_event_attr::sample_id_all', at offset 45 (in bits) at perf_event.h:280:1
          'uint64_t perf_event_attr::mmap_data', at offset 46 (in bits) at perf_event.h:279:1
          'uint64_t perf_event_attr::precise_ip', at offset 47 (in bits) at perf_event.h:278:1
          'uint64_t perf_event_attr::watermark', at offset 49 (in bits) at perf_event.h:277:1
          'uint64_t perf_event_attr::task', at offset 50 (in bits) at perf_event.h:276:1
          'uint64_t perf_event_attr::enable_on_exec', at offset 51 (in bits) at perf_event.h:275:1
          'uint64_t perf_event_attr::inherit_stat', at offset 52 (in bits) at perf_event.h:274:1
          'uint64_t perf_event_attr::freq', at offset 53 (in bits) at perf_event.h:273:1
          'uint64_t perf_event_attr::comm', at offset 54 (in bits) at perf_event.h:272:1
          'uint64_t perf_event_attr::mmap', at offset 55 (in bits) at perf_event.h:271:1
          'uint64_t perf_event_attr::exclude_idle', at offset 56 (in bits) at perf_event.h:270:1
          'uint64_t perf_event_attr::exclude_hv', at offset 57 (in bits) at perf_event.h:269:1
          'uint64_t perf_event_attr::exclude_kernel', at offset 58 (in bits) at perf_event.h:268:1
          'uint64_t perf_event_attr::exclude_user', at offset 59 (in bits) at perf_event.h:267:1
          'uint64_t perf_event_attr::exclusive', at offset 60 (in bits) at perf_event.h:266:1
          'uint64_t perf_event_attr::pinned', at offset 61 (in bits) at perf_event.h:265:1
          'uint64_t perf_event_attr::inherit', at offset 62 (in bits) at perf_event.h:264:1
          'uint64_t perf_event_attr::disabled', at offset 63 (in bits) at perf_event.h:263:1
          'uint64_t perf_event_attr::__reserved_1', at offset 320 (in bits) at perf_event.h:291:1

  [C] 'function int pfm_get_pmu_info(pfm_pmu_t, pfm_pmu_info_t*)' at pfmlib_common.c:2111:1 has some indirect sub-type changes:
    parameter 2 of type 'pfm_pmu_info_t*' has sub-type changes:
      in pointed to type 'typedef pfm_pmu_info_t' at pfmlib.h:662:1:
        underlying type 'struct {const char* name; const char* desc; size_t size; pfm_pmu_t pmu; pfm_pmu_type_t type; int nevents; int first_event; int max_encoding; int num_cntrs; int num_fixed_cntrs; struct {unsigned int is_present; unsigned int is_dfl; unsigned int reserved_bits;};}' at pfmlib.h:646:1 changed:
          type size hasn't changed
          1 data member change:
            anonymous data member at offset 416 (in bits) changed from:
              struct {unsigned int is_present; unsigned int is_dfl; unsigned int reserved_bits;}
            to:
              struct {static unsigned int is_present; static unsigned int is_dfl; static unsigned int reserved_bits;}
Comment 1 William Cohen 2020-09-30 18:19:35 UTC
Created attachment 12873 [details]
the -g -O2 debuginfo
Comment 2 William Cohen 2020-09-30 18:20:25 UTC
Created attachment 12874 [details]
the -gdwarf5 -O2 version of the binary
Comment 3 William Cohen 2020-09-30 18:21:02 UTC
Created attachment 12875 [details]
the -gdwarf5 -O2 debuginfo
Comment 4 William Cohen 2020-09-30 18:22:31 UTC
This experiment was using abidiff built from the upstream libabigail repo commit 59610d5572bf8a997b0f490cae20d42f73acf3e1
Comment 5 Mark Wielaard 2020-10-01 16:01:16 UTC
Attachment #2 [details] "the -g -O2 debuginfo" refers to libpfm-4.11.1-1.fc32_gcc_o2__g_.x86_64 as alt file. Could you also attach that one? That will make analyzing the difference slightly easier.
Comment 6 William Cohen 2020-10-01 17:00:11 UTC
Created attachment 12882 [details]
Files to reproduce the problem

Steps to reproduce the problem from the tarball file.

tar xvz bz26684.tar.gz
cd bz26684
abidiff g/libpfm.so.4.11.1 dwarf5/libpfm.so.4.11.1
Comment 7 Mark Wielaard 2020-10-01 20:50:22 UTC
There are a couple of issues in the libabigail code.

It has some code for die_unsigned_constant_attribute which relies on specific FORMs which doesn't handle DW_FORM_implicit_const (and technically DW_FORM_implicit_const represents a signed value). Ideally libabigail shouldn't have to deal with forms directly but should be able to rely on the dwarf_attr functions in libdw to get attribute values so it doesn't have to deal with any particular data representation. In particular handling of 
DW_AT_byte_size and DW_AT_bit_size should probably simply use and dwarf_attr plus dwarf_formudata.

Secondly in libabigail only handles DW_AT_data_member_location, but not DW_AT_data_bit_offset. Technically DW_AT_data_bit_offset is already in DWARF4, but gcc only outputs it for DWARF5 (from gcc/dwarf2out.c):

      /* While DW_AT_data_bit_offset has been added already in DWARF4,
         e.g. GDB only added support to it in November 2016.  For DWARF5
         we need newer debug info consumers anyway.  We might change this
         to dwarf_version >= 4 once most consumers catched up.  */

Note that if there is a DW_AT_data_bit_offset instead of a DW_AT_data_member_location then there will be no DW_AT_byte_size and no DW_AT_bit_offset.
Comment 8 Mark Wielaard 2020-10-01 20:54:11 UTC
Simpler example that shows the issue:

struct pea
{
  int type;
  long a:1, b:1, c:1;
};

struct pea p;

gcc -gdwarf-4:

 [    1d]    structure_type       abbrev: 2
             name                 (string) "pea"
             byte_size            (data1) 8
             decl_file            (data1) bf.c (1)
             decl_line            (data1) 1
             decl_column          (data1) 8
             sibling              (ref4) [    62]
 [    2a]      member               abbrev: 3
               name                 (strp) "type"
               decl_file            (data1) bf.c (1)
               decl_line            (data1) 3
               decl_column          (data1) 7
               type                 (ref4) [    62]
               data_member_location (data1) 0
 [    37]      member               abbrev: 4
               name                 (string) "a"
               decl_file            (data1) bf.c (1)
               decl_line            (data1) 4
               decl_column          (data1) 8
               type                 (ref4) [    69]
               byte_size            (data1) 8
               bit_size             (data1) 1
               bit_offset           (data1) 31
               data_member_location (data1) 0
 [    45]      member               abbrev: 4
               name                 (string) "b"
               decl_file            (data1) bf.c (1)
               decl_line            (data1) 4
               decl_column          (data1) 13
               type                 (ref4) [    69]
               byte_size            (data1) 8
               bit_size             (data1) 1
               bit_offset           (data1) 30
               data_member_location (data1) 0
 [    53]      member               abbrev: 4
               name                 (string) "c"
               decl_file            (data1) bf.c (1)
               decl_line            (data1) 4
               decl_column          (data1) 18
               type                 (ref4) [    69]
               byte_size            (data1) 8
               bit_size             (data1) 1
               bit_offset           (data1) 29
               data_member_location (data1) 0

gcc -gdwarf-5:

 [    1e]    structure_type       abbrev: 3
             name                 (string) "pea"
             byte_size            (data1) 8
             decl_file            (data1) bf.c (1)
             decl_line            (data1) 1
             decl_column          (data1) 8
             sibling              (ref4) [    54]
 [    2b]      member               abbrev: 4
               name                 (strp) "type"
               decl_file            (data1) bf.c (1)
               decl_line            (data1) 3
               decl_column          (data1) 7
               type                 (ref4) [    54]
               data_member_location (data1) 0
 [    38]      member               abbrev: 1
               name                 (string) "a"
               decl_file            (implicit_const) bf.c (1)
               decl_line            (implicit_const) 4
               decl_column          (data1) 8
               type                 (ref4) [    5b]
               bit_size             (implicit_const) 1
               data_bit_offset      (data1) 32
 [    41]      member               abbrev: 1
               name                 (string) "b"
               decl_file            (implicit_const) bf.c (1)
               decl_line            (implicit_const) 4
               decl_column          (data1) 13
               type                 (ref4) [    5b]
               bit_size             (implicit_const) 1
               data_bit_offset      (data1) 33
 [    4a]      member               abbrev: 1
               name                 (string) "c"
               decl_file            (implicit_const) bf.c (1)
               decl_line            (implicit_const) 4
               decl_column          (data1) 18
               type                 (ref4) [    5b]
               bit_size             (implicit_const) 1
               data_bit_offset      (data1) 34
Comment 9 Mark Wielaard 2020-10-01 21:55:39 UTC
P.S. Note that the actual bit offset for the different attributes is defined differently:

DW_AT_bit_offset: The bit offset attribute describes the offset in bits of the high order bit of a value of the given type from the high order bit of the storage unit used to contain that value.

DW_AT_data_bit_offset: the value is an integer constant that specifies the number of bits from the beginning of the containing entity to the beginning of the data member.

DWARF5 has some example for big and little endian in D.2.8 C/C++ Bit-Field Examples
Comment 10 Mark Wielaard 2020-10-02 14:42:11 UTC
Created attachment 12885 [details]
subrange_type bounds signedness is given by underlying type

I was wrong about this:

> It has some code for die_unsigned_constant_attribute which relies on
> specific FORMs which doesn't handle DW_FORM_implicit_const (and technically
> DW_FORM_implicit_const represents a signed value). Ideally libabigail
> shouldn't have to deal with forms directly but should be able to rely on the
> dwarf_attr functions in libdw to get attribute values so it doesn't have to
> deal with any particular data representation. In particular handling of 
> DW_AT_byte_size and DW_AT_bit_size should probably simply use and dwarf_attr
> plus dwarf_formudata.

It turns out the code that directly inspects the DW_FORMs and (signedness) is only used by the build_subrange_type code to properly initialize the bound_values. I believe that code is actually wrong. The signedness of the bound_values isn't determined by the specific form used, but by the signedness of the underlying type (if it has one). The attached patch determines the signedness by looking at the underlying type and removes the code that directly checks specific forms.
Comment 11 Mark Wielaard 2020-10-02 14:46:43 UTC
The actual change from DW_AT_data_member_location plus DW_AT_bit_offset to DW_AT_data_bit_offset is harder than it looks because the representation of the offset changes for little endian systems. And arguably currently libabigail handles DW_AT_bit_offset incorrect for little endian systems... tricky.
Comment 12 Mark Wielaard 2020-10-02 14:52:41 UTC
Another random observation, the handling of strings by interpreting the DW_FORM is also a little ewww IMHO. Note that at some point dwz in DWAR5 mode will generate DW_FORM_strp_sup instead of DW_FORM_GNU_strp_alt.
Comment 13 dodji 2020-10-20 10:26:27 UTC
I have put together a patch to support the DW_AT_data_bit_offset attribute and also to interpret the DW_AT_bit_offset attribute correctly on little endian machines.

It's available at 
https://sourceware.org/git/?p=libabigail.git;a=commit;h=4e9bfc1212c017cc18f0044b76c37ca3be9ba625, in the PR26684 branch which can be browsed at https://sourceware.org/git/?p=libabigail.git;a=shortlog;h=refs/heads/PR26684.

With this patch libabigail should be able to compare two binaries resulting from the same program, one being compiled with DWARF4 and the other one with DWARF5.  The result of the comparison should yield the empty set.

The problem now is that abixml files emitted using a previous version of Libabigail that was doing the wrong interpretation of DW_AT_bit_offset will now be incompatible with abixml files emitted with this patch.

So, I am thinking about introducing a new abixml version (2.0?) which would be incompatible with the current 1.0 one.  That way, I can add code to prevent comparing 1.0 abixml versions against 2.0 ones, as well as comparing 1.0 abixml files against binaries loaded with the new code.  I'll be working on that now.
Comment 14 dodji 2020-10-23 08:24:57 UTC
The PR26684 branch now has a patch which bumps the version number of the ABIXML format to "2.0", which is incompatible with the current "1.0" format.

The patch also modifies abidiff to make it avoid comparing input files with incompatible version numbers.  So abidiff will now refuse to compare an ABIXML file which has the "1.0" format against a corpus originating from analyzing a binary using the fixed interpretation of DW_AT_bit_offset.  Users will have to re-generate their ABIXML files using a current abidw, if they want to use this version of libabigail.

Of course, this is not yet merged into master.

I welcome your feebacks on this approach, as always.
Comment 15 Frank Ch. Eigler 2020-10-23 10:35:06 UTC
Just some vague opinion from the backward compatibility peanut galary:

Was the misinterpretation ambiguous / information-losing?  If not, then it does not seem necessary to do a major flag-day version bump, because abidiff could read a v1 xml file, convert from its buggy bit numbering scheme to the correct one.  It could treat the data (or even rewrite it) as though it were a v2 correct one.  Heck, if the buggy data is straightforward to correct in xml, could put the correct bit# into a new-name xml element/attribute nearby, rather than requiring a top level version bump?
Comment 16 dodji 2020-10-23 11:29:01 UTC
"fche at redhat dot com" <sourceware-bugzilla@sourceware.org> writes:

> --- Comment #15 from Frank Ch. Eigler <fche at redhat dot com> ---

Thank you Frank for taking the time to reply to this!

> Just some vague opinion from the backward compatibility peanut galary:
>
> Was the misinterpretation ambiguous / information-losing?

Sadly, I am afraid it was :-( (unless I am missing something).

For little endian machines, libabigail was wrongly interpreting
DW_AT_bit_offset as being the same value as for big endian machines.  So
the resulting data member offset we were getting is wrong.  And that
wrong number is what's saved in the abixml file.

At the abixml level, we don't save the other low level details of the
bitfield layout that would help us convert that buggy data member offset
into the correct one.  We just save the offset of the data member and
its type.  Now in hindsight, I am thinking maybe I should have save some
of that information ...

> If not, then it does not seem necessary to do a major flag-day version
> bump, because abidiff could read a v1 xml file, convert from its buggy
> bit numbering scheme to the correct one.  It could treat the data (or
> even rewrite it) as though it were a v2 correct one.

> Heck, if the buggy data is straightforward to correct in xml, could
> put the correct bit# into a new-name xml element/attribute nearby,
> rather than requiring a top level version bump?

Yes, it would have been super cool to be able to do that ...