Bug 21296

Summary: naive comparison of names of template instantiations leads to false postives
Product: libabigail Reporter: Jonathan Wakely <jwakely.gcc>
Component: defaultAssignee: Dodji Seketeli <dodji>
Status: NEW ---    
Severity: enhancement CC: libabigail, maennich, woodard
Priority: P2    
Version: unspecified   
Target Milestone: ---   
Host: Target:
Build: Last reconfirmed: 2017-07-17 00:00:00
Attachments: testcase

Description Jonathan Wakely 2017-03-23 19:37:58 UTC
Created attachment 9935 [details]
testcase

Using GCC 6 and Clang (any version) to compile the attached code into two shared libs produces libraries that crash abigail:

clang++ -std=c++14  -o libclang.so -shared -fPIC  clanggcc.cxx  -g
g++   -o libgcc.so -shared -fPIC  clanggcc.cxx  -g

abidiff libgcc.so libclang.so  
Functions changes summary: 0 Removed (1 filtered out), 2 Changed (9 filtered out), 0 Added functions
Variables changes summary: 0 Removed, 0 Changed, 0 Added variable

2 functions with some indirect sub-type change:

  [C]'method STR::~STR(int)' at clanggcc.cxx:6:1 has some indirect sub-type changes:
    linkage names of method STR::~STR(int)
    changed from '_ZN3STRD1Ev, _ZN3STRD2Ev' to '_ZN3STRD2Ev'

    name of symbol changed from _ZN3STRD1Ev to _ZN3STRD2Ev
    parameter 1 of type 'int' was removed


  [C]'function std::tuple<STR&&> my_forward_as_tuple<STR>(STR&&)' at clanggcc.cxx:225:1 has some indirect sub-type changes:
    return type changed:
      type name changed from 'std::tuple<STR&&>' to 'std::tuple<STR &&>'
      type size hasn't changed

      1 base class deletion:
        struct std::_Tuple_impl<0ul, STR&&> at clanggcc.cxx:119:1
      1 base class insertion:
        struct std::_Tuple_impl<0, STR &&> at clanggcc.cxx:119:1
abidiff: ../../src/abg-comparison.cc:8205: virtual void abigail::comparison::fn_parm_diff::report(std::ostream&, const string&) const: Assertion `get_type_diff() && get_type_diff()->to_be_reported()' failed.
Aborted (core dumped)


If GCC trunk is used instead of GCC 6 then abigail doesn't crash, but still reports the same diferences, which appear to be bogus.
Comment 1 Ben Woodard 2017-03-23 20:55:25 UTC
jwakely and I were discussing this, this morning.

We think that there are actually three problems:
1) the abidiff crash
2) the second changed function seems to only be different with regards to whitespace. We think that whitespace differences should be ignored. Can you make the evaluative function ignore the change in the whitespace.
3) libabigail seems to miss the true nature of the ABI change which is that one of the parameters is passed on the stack while the other is passed on a register. We believe that GCC implements the correct calling convention according to the Itanium C++ ABI.
See: https://bugs.llvm.org//show_bug.cgi?id=23034 I haven't looked at the DWARF yet but jwakely seemed to say that he wasn't surprised that libabigail missed detecting this ABI change because the DWARF was insufficient. If that is in fact the case, then we probably need to file a couple of BZ's to improve the DWARF on GCC and CLAMG so that libabigail has the information needed to address this.
Comment 2 Dodji Seketeli 2017-03-24 16:35:27 UTC
I have applied the patch
https://sourceware.org/git/gitweb.cgi?p=libabigail.git;a=commit;h=0c820488d4a161f1bd0d1fc3b70bb2518130512b
to the master branch to fix the abort.

Below are my comments for the other issues.

> the second changed function seems to only be different with regards to
> whitespace. We think that whitespace differences should be
> ignored. Can you make the evaluative function ignore the change in the
> whitespace.

[...]

>   [C]'function std::tuple<STR&&> my_forward_as_tuple<STR>(STR&&)' at
> clanggcc.cxx:225:1 has some indirect sub-type changes:
>     return type changed:
>       type name changed from 'std::tuple<STR&&>' to 'std::tuple<STR &&>'

It's true that here, the name of the type changed from 'tuple<STR&&>' to
'tuple<STR &&>', due to the white space addition.

Today, the type comparison engine considers two types with different
names as being different.

Ideally, clang and gcc should emit type names that are normalized
somehow, so that we can compare them.

I guess libabigail could indeed normalize the type names, but thas is
more involved than just stripping white spaces.

For instance:

>
>       1 base class deletion:
>         struct std::_Tuple_impl<0ul, STR&&> at clanggcc.cxx:119:1
>       1 base class insertion:
>         struct std::_Tuple_impl<0, STR &&> at clanggcc.cxx:119:1

Here, in the type name "_Tuple_impl<0ul, STR&&>", the '0ul' constant
literal changed to become the constant literal '0'.  Both are
equivalent.  But recognizing this is different from the whitespace case
above.

I think we can (and should) do better than what we are doing.

That is, we should try harder to normalize type names before comparing
them.  I guess that improvement could even use its own tracking bug.

[...]

> 3) libabigail seems to miss the true nature of the ABI change which is that one
> of the parameters is passed on the stack while the other is passed on a
> register. We believe that GCC implements the correct calling convention
> according to the Itanium C++ ABI.

To date, Libabigail doesn't look at the location of function parameters.
So changes in argument passing conventions are not detected at the
moment.

I think we could add that feature, but that would be a project in its
own.  Actually, there is an opened enhancement request for this
https://sourceware.org/bugzilla/show_bug.cgi?id=19949.


> See: https://bugs.llvm.org//show_bug.cgi?id=23034 I haven't looked at the DWARF
> yet but jwakely seemed to say that he wasn't surprised that libabigail missed
> detecting this ABI change because the DWARF was insufficient.

The reason why libabigail missed it is that it's not yet trying to see
it :-)

But Jonathan is right that just looking at the DWARF as it is today will
not necessarily be enough.

For instance, let's look at the DWARF of the function
"my_forward_as_tuple<STR>" by generated libclang.so.  I think it's that
one you are talking about.

It says:

 [   498]    subprogram
             low_pc               (addr) +0x00000000000010b0 <_Z19my_forward_as_tupleIJ3STREESt5tupleIJDpOT_EES4_>
             high_pc              (data4) 57 (+0x00000000000010e9)
             frame_base           (exprloc) 
              [   0] reg6
             linkage_name         (strp) "_Z19my_forward_as_tupleIJ3STREESt5tupleIJDpOT_EES4_"
             name                 (strp) "my_forward_as_tuple<STR>"
[...]
 [   4b5]      formal_parameter
               location             (exprloc) 
                [   0] fbreg -16
               name                 (strp) "__args"

If I read that correctly, it says that the parameter of the
my_forward_as_tuple<STR> function is to be found 16 bytes after the
"Frame Base" pointer.  That's the meaning of the "fbreg -16" value of
the DW_AT_location attribute of the formal parameter.

And the frame base register is the register number "6", if I believe the
value of the DW_AT_frame attribute which is "reg6".

The mapping of DWARF register for the x86_64 abi says that register 6 is
%rbp.

I think that means that clang++ is saying that the argument of the
formal_parameter is being passed on the stack.

I haven't yet looked at what the assembly code is doing.

Looking at the DWARF emitted by GCC, it's saying something similar.  The
argument is being passed on the stack.

I'll look at what we have when libclang.so is compiled with
optimizations, and I'll look at the assembly too, before we jump to
conclusions.
Comment 3 Dodji Seketeli 2017-07-17 08:19:33 UTC
*** Bug 21772 has been marked as a duplicate of this bug. ***
Comment 4 Matthias Maennich 2020-11-12 14:30:26 UTC
This is somewhat a duplicate of 21492, still I keep it open separately. I move it to be an enhancement, though.