Build libabigail with both gcc and clang then compare the resulting library with abidiff $ abidiff g5/lib/libabigail.so.0.0.0 l5/lib/libabigail.so.0.0.0 | wc -l 5080 The output is quite long however, there appear to be some low hanging fruit which can easily be resolved. They appear similar to other errors when doing abidw --abidiff and I expect they have a similar proximate cause. A bunch of functions have completely unspecified subtype changes. This make tracking down the root cause of the problem exceedingly difficult and usually when their are unspecified subtype changes, it means that something in the comparison is different but it doesn't end up actually being an ABI artifact that matters. Here are some examples: [C] 'method abigail::ir::class_decl::base_spec_sptr abigail::comparison::base_diff::first_base() const' at abg-comparison.h:1801:1 has some indirect sub-type changes: [C] 'method const abigail::comparison::class_diff_sptr abigail::comparison::base_diff::get_underlying_class_diff() const' at abg-comparison.h:1807:1 has some indirect sub-type changes: [C] 'method virtual bool abigail::comparison::base_diff::has_changes() const' at abg-comparison.h:1816:1 has some indirect sub-type changes: [C] 'method virtual abigail::ir::change_kind abigail::comparison::base_diff::has_local_changes() const' at abg-comparison.h:1819:1 has some indirect sub-type changes: [C] 'method virtual void abigail::comparison::base_diff::report(std::ostream&, const std::string&) const' at abg-comparison.h:1822:1 has some indirect sub-type changes: [C] 'method abigail::ir::class_decl::base_spec_sptr abigail::comparison::base_diff::second_base() const' at abg-comparison.h:1804:1 has some indirect sub-type changes: This is was done with the latest libabigail trunk.
Created attachment 13812 [details] clang version 13.0.0 (Fedora 13.0.0-3.fc35) built object
I was not able to attach the GCC built object due to size but I stuffed it here for easy access: http://ssh.bencoyote.net/~ben/libabigail.so.0.0.0.gcc.bz2
The request is to either specify what the subtype changes are or if they do not matter for ABI purposes, make the types evaluate as the same.
The only meaningful difference that I can see in one case is "const" excerpt from the ABIXMLs: GCC <member-function access='public' const='yes' vtable-offset='6'> <function-decl name='has_changes' mangled-name='_ZNK7abigail10comparison10array_diff11has_changesEv' filepath='../../../../Work/libabigail/src/abg-comparison.cc' line='3578' column='1' visibility='default' binding='global' size-in-bits='64' elf-symbol-id='_ZNK7abigail10comparison10array_diff11has_changesEv'> <parameter type-id='type-id-1051' is-artificial='yes'/> <return type-id='type-id-7'/> </function-decl> </member-function> clang: <member-function access='public' vtable-offset='6'> <function-decl name='has_changes' mangled-name='_ZNK7abigail10comparison10array_diff11has_changesEv' filepath='/home/ben/Shared/test/libabigail-test/l5/../../../Work/libabigail/include/abg-comparison.h' line='1374' column='1' visibility='default' binding='global' size-in-bits='64' elf-symbol-id='_ZNK7abigail10comparison10array_diff11has_changesEv'> <parameter type-id='type-id-828' is-artificial='yes'/> <return type-id='type-id-2'/> </function-decl> </member-function> Here is another example: gcc: <member-function access='public' const='yes' vtable-offset='7'> <function-decl name='has_local_changes' mangled-name='_ZNK7abigail10comparison14type_diff_base17has_local_changesEv' filepath='/home/ben/Shared/test/libabigail-test/g5/../../../Work/libabigail/include/abg-comparison.h' line='1080' column='1' visibility='default' binding='global' size-in-bits='64'> <parameter type-id='type-id-1232' is-artificial='yes'/> <return type-id='type-id-8859'/> </function-decl> </member-function> clang: <member-function access='public' vtable-offset='7'> <function-decl name='has_local_changes' mangled-name='_ZNK7abigail10comparison14type_diff_base17has_local_changesEv' filepath='/home/ben/Shared/test/libabigail-test/l5/../../../Work/libabigail/include/abg-comparison.h' line='1080' column='1' visibility='default' binding='global' size-in-bits='64'> <parameter type-id='type-id-1004' is-artificial='yes'/> <return type-id='type-id-8200'/> </function-decl> </member-function> So arguably this is a clang bug in that it doesn't seem to record the const-ness of the function in the DWARF (or we have a parsing bug where we don't pick it up -- haven't checked that yet). I believe that we had a discussion a few years ago about whether const-ness of a function actually amounted to an ABI difference. I can't recall the exact ultimate decision on that point. At the very least, when the difference between the two functions is "const" then libabigail should print that out rather than emitting nothing.
(In reply to Ben Woodard from comment #4) > I believe that we had a discussion a few years ago about whether const-ness > of a function actually amounted to an ABI difference. I can't recall the > exact ultimate decision on that point. It absolutely is an ABI property. > At the very least, when the difference between the two functions is "const" > then libabigail should print that out rather than emitting nothing. That case shouldn't really be possible. If one of them isn't const, then the mangled name would change, and it would be a completely different symbol, and you'd never compare them in the first place. So writing code to handle the case where a const function is not const *should* be unnecessary.
The functions that I pointed out in the previous comment are all _ZNK functions. You can overload on the const'ness of a function and so it has to be encoded in the mangling of the linkage name. struct X { void foo(); void foo() const; }; We should try to get this fixed in clang so that it correctly encodes the const'ness of the function in the DWARF. However, we are going to need to deal with all the versions of clang which don't get this right. So I propose a fixup: if the function has the K attribute in its name mangling even if the function's DWARF is not marked const then the function should be labeled as const. List of needed actions: 1) file a bug with clang 2) when the difference between two functions is const'ness print that out don't just leave people with "...has some indirect sub-type changes:" then print nothing. 3) if the function mangle name has the K in it, then mark it const in the IR.
Created attachment 13832 [details] simple reproducer
Attached a very simple reproducer: [ben@alien libabigail-test]$ cat llvm-constness.C int a=0; struct X { void foo(); void foo() const; }; void X::foo(){ a++; } void X::foo() const { a--; } [ben@alien libabigail-test]$ glto/bin/abidiff llvm-constness.o.gcc llvm-constness.o.llvm Functions changes summary: 0 Removed, 1 Changed, 0 Added function Variables changes summary: 0 Removed, 0 Changed, 0 Added variable 1 function with some indirect sub-type change: [C] 'method void X::foo() const' at llvm-constness.C:5:1 has some indirect sub-type changes: Here is the DWARF breakdown: gcc: [ 72] subprogram abbrev: 10 external (flag_present) yes name (string) "foo" decl_file (data1) llvm-constness.C (1) decl_line (data1) 5 decl_column (data1) 8 linkage_name (strp) "_ZNK1X3fooEv" declaration (flag_present) yes object_pointer (ref4) [ 82] [ 82] formal_parameter abbrev: 2 type (ref4) [ 98] artificial (flag_present) yes [ 98] pointer_type abbrev: 3 byte_size (implicit_const) 8 type (ref4) [ 89] [ 89] const_type abbrev: 1 type (ref4) [ 4d] [ 4d] structure_type abbrev: 8 name (string) "X" byte_size (data1) 1 decl_file (data1) llvm-constness.C (1) decl_line (data1) 3 decl_column (data1) 8 sibling (ref4) [ 89] clang++: [ 60] subprogram abbrev: 5 linkage_name (strp) "_ZNK1X3fooEv" name (strp) "foo" decl_file (data1) llvm-constness.C (1) decl_line (data1) 5 declaration (flag_present) yes external (flag_present) yes [ 6b] formal_parameter abbrev: 6 type (ref4) [ 77] artificial (flag_present) yes [ 77] pointer_type abbrev: 7 type (ref4) [ 7c] [ 7c] const_type abbrev: 8 type (ref4) [ 46] [ 46] structure_type abbrev: 4 calling_convention (data1) pass_by_value (5) name (strp) "X" byte_size (data1) 1 decl_file (data1) llvm-constness.C (1) decl_line (data1) 3 Which looks pretty much the same and basically correct. I think that the first time I looked at it I mistook the implicit_const on GCC's byte_size as the thing that was different so there may not be a clang bug just a libabigail bug where for some reason it doesn't pick up the fact that the this pointer is const which seems to be the thing that indicates that the member function is const.
Created attachment 13837 [details] attachment-364165-0.html This may already have been fixed in current Clang. We noticed that we were getting conflicting duplicate type definitions in ABI XML. it turned out that Clang emits debug information for functions at some call sites, but it was an incomplete type skeleton, missing things like qualifiers. This was the fix: https://reviews.llvm.org/rG85f612efeb35
It appears not to be that Clang bug. On the other hand, I think there is some confusion going on here. The difference is in the const-ness of the this pointer. It is *not* the const-ness of the function (i.e., the constness of *this). Given that this itself cannot be modified, I'd say GCC is doing a better job here. abidiff --harmless /tmp/llcc.o.clang /tmp/llcc.o.gcc Functions changes summary: 0 Removed, 2 Changed, 0 Added functions Variables changes summary: 0 Removed, 0 Changed, 0 Added variable 2 functions with some indirect sub-type change: [C] 'method void X::foo()' at llcc.cc:8:1 has some indirect sub-type changes: implicit parameter 0 of type 'X*' changed: entity changed from 'X*' to 'X* const' type size hasn't changed [C] 'method void X::foo()' at llcc.cc:12:1 has some indirect sub-type changes: implicit parameter 0 of type 'const X*' changed: entity changed from 'const X*' to 'const X* const' type size hasn't changed
Abigail should just ignore that completely. The implicit parameter is not observable or accessible to user code, and there is no way to modify it whether or not it's "declared" as X* const or X*. That implicit parameter is not the `this` pointer, it's the _value_ of the `this` pointer, but that's not the same thing. `this` is a keyword that produces a prvalue of type "pointer to cv X" so it's always non-const (but unmodifiable, because it's a prvalue of fundamental type). The implicit parameter is what the compiler uses to provide the value of that prvalue, but is not actually the prvalue itself.
(In reply to gprocida from comment #10) > The difference is in the const-ness of the this pointer. It is *not* the > const-ness of the function (i.e., the constness of *this). If that's true, this is a very confusing way for abixml to express that: gcc: <member-function access='public' const='yes' vtable-offset='7'> clang: <member-function access='public' vtable-offset='7'> If the "const" attribute here refers to an unobservable and irrelevant detail of the implicit object parameter, why is it an attribute of the <member-function> element, not its parameter?