Giuliano Procida [Thu, 27 May 2021 08:53:04 +0000 (09:53 +0100)]
abg-reader: Ensure corpus always has a symtab reader
In the presence of an empty abi-corpus element and with the following
change to always allocate a fresh corpus object, such objects can
sometimes be left without a symtab reader, instead of inheritng one
from the previous corpus.
The reader is called to obtain sorted lists of symbols during ABI
comparisons. The simplest way to avoid a crash is to maintain the
invariant that a reader object is always present.
With this change, if there is bad XML preventing symbols from being
read, no error is raised as before, but the logic has been tweaked so
that abi-instr parsing will nevertheless be attempted.
* src/abg-reader.cc (read_symbol_db_from_input): Fix
documentation for this function. Allow "successful parsing" to
include the case where no symbols were present in the input.
(read_corpus_from_input): Unconditionally set a symtab reader
on the corpus object. Unconditionally parse the abi-instr of a
corpus.
Giuliano Procida [Thu, 27 May 2021 08:53:03 +0000 (09:53 +0100)]
dwarf-reader: Create new corpus unconditionally
The DWARF reader appears to create a new corpus object only if one is
not already present. However, the only case where there can be
multiple corpora is when build_corpus_group_from_kernel_dist_under is
called and this function clears down the reader context, including the
current corpus, between reading ELF objects.
So it's clearer to just create a fresh corpus object unconditionally
in the DWARF reader.
* src/abg-dwarf-reader.cc (read_debug_info_into_corpus):
Create new corpus object unconditionally.
abixml reader: Fix recursive type definition handling
This patch basically adjusts build_array_type_def to build the array
type early without trying to create the array element type first. The
array type is then registered, and then the array element type is
created. That way, if the element type indirectly needs the array
type being created, then it's going to be used. Then the element type
is set to the array once it's created.
The patch adjusts the code of the array type to allow creating the
array without element types and then setting the element type later.
* include/abg-ir.h (array_type_def::update_size): Declare new
private member function.
(array_type_def::array_type_def): Declare ...
* src/abg-ir.cc (array_type_def::array_type_def): ... a new
constructor that takes no element type.
(array_type_def::update_size): Define this helper private member
function.
(array_type_def::get_subrange_representation): Adjust for this to
work when there is no element type setup yet.
(array_type_def::{set_element_type, append_subranges}): Update the
size and name of the array.
* src/abg-reader.cc (build_array_type_def): Create the array type
before the element type so that the later can re-use the former.
Dodji Seketeli [Mon, 7 Jun 2021 14:07:50 +0000 (16:07 +0200)]
rhbz1951526 - SELF CHECK FAILED for 'gimp-2.10'
This is a fix for bug https://bugzilla.redhat.com/show_bug.cgi?id=1951526.
Although it's a patch for one bug, it addresses several different
issues that cause the observed self comparison failure. As is often
the case on this kind of problems, the failure is difficult to
reproduce on a synthetic test case so I'll explain the root causes in
this commit log.
There are 4 different root causes to this problem. As I couldn't come
up with a reduced test case for each one of them I am adding the fixes
for those 4 issues in this commit, along with a new regression test
extracted from the initial bugzilla problem report.
So, overall, the symptom we are seeing here is that when we build an
IR for the input binary gimp-2.0, save that IR into abixml, and read
back that abixml into another IR, comparing the two IR shows changes;
it should show no change whatsoever. This is what we call in
libabigail jargon a self comparison (or self check) failure.
As alluded to in my introduction above, there appear to be 4 different
root causes for that self comparison failure.
1/ The first cause has to do with a situation about two anymous enums
that are (wrongly) considered different from an ABI point of view.
Using the debugging capabilities recently gained by libabigail, I
could notice that the two enums are:
Note how the second enum has a new enumerator named
'GIMP_INTERPOLATION_LANCZOS', but its value is '3', which is the exact
same value of as the one of the existing enumerator
GIMP_INTERPOLATION_NOHALO.
During type canonicalization of the IR from the input binary,
libabigail (wrongly) considers these two enums as being different.
This leads to the type 'Gimp*' (or anything type indirectly using any
one of the anonymous enums above) coming from one translation unit
being considered different from a type 'Gimp*' coming from another
translation unit, just because their are not using either one version
of the anonymous enum above or the other.
This leads to a *LOT* of spurious type changes from the first IR, that
are saved into abixml.
To fix this first problem, this patch introduces "two modes" of
comparing enums. There is a binary-only mode which only looks
enumerator values, not enumerator names. And then there is the
source-level mode which looks at both enumerator names and values when
comparing enums. The former mode is used during type
canonicalization. However, when a change is detected between two
enums, then the diff-IR built to describe the change is constructed
using the later mode. Using the later mode allows to describe
precisely things like enumerator insertion/removal by referring to the
names of the added/removed enumerators.
2/ The second root cause is that a struct, say, 'struct _GimpImage'
from a translation unit is considered different from a 'struct
_GimpImage' because the DWARF reader wrongly assign them different sizes.
Notice how the second 'struct _GimpImage' has a size of zero.
This is because when reading the DWARF, we first encounter the DIE for
the first' struct _GimpImage' and we properly build a type for it,
along with its declaration. Then when we encounter another DIE
defining 'struct _GimpImage' again, from a different translation unit,
the DWARF reader recognizes that it's a DIE for a declaration of
'struct _GimpImage' and fails to re-use the previous definition for
'struct _GimpImage'. So it wrongly builds declaration-only 'struct
_GimpImage' for it, hence the second struct _GimpImage with a zero
size.
Here again that creates spurious changes (after type canonicalization)
in types using struct _GimpImage. And that is a lot of types,
including things like 'Gimp*' and the like.
The fix for this root cause issue is to change
add_or_update_class_type in the DWARF reader to recognize that we are
seeing a type declaration for which there was already a definition and
return that definition instead of creating a new declaration.
3/ The third root cause is better explained with a "screen shot".
Consider these two 'versions' of the same struct _GdkDevice from two
different translation units:
GObject parent_instance;'
gchar* _g_sealed__name; // uses canonical type '@0x6892980'
GdkInputSource _g_sealed__source;'
GdkInputMode _g_sealed__mode;'
gboolean _g_sealed__has_cursor; // uses canonical type '@0x688dd00'
gint _g_sealed__num_axes; // uses canonical type '@0x688dd00'
GdkDeviceAxis* _g_sealed__axes;'
gint _g_sealed__num_keys; // uses canonical type '@0x688dd00'
GdkDeviceKey* _g_sealed__keys;'
};
$10 = (abigail::ir::type_base *) 0x7cd71e0
(gdb)
Notice how the name of the second data member 'name' was changed to
'_g_sealed_name'. A similar scheme happens to several other data
member names. The offsets and types of the struct _GdkDevice haven't
changed however. So from an ABI standpoint, the two versions of that
struct are equal. Libabigail consider them different however.
Because that type is used by tons of other types of the binary being
analyzed, this leads to lots of spurious canonical type difference
that shouldn't be there.
These three issues are magnified by the fact that the gimp binary is
compiled using "link time optimization". That brings in a lot more
opportunities to see these underlying issues that have been there for
a long time.
4/ The fourth and last root cause issue. When the abixml writer emits
a translation unit (TU), it keeps track of the 'non-emitted referred
to type' of the currently emitted translation unit and emits them at
the end of each TU. For instance, if the type 'Gimp*' (pointer to
Gimp) was emitted, and yet the referred-to type 'Gimp' wasn't emitted,
the TU writer makes sure to emit the referred-to 'Gimp' type at the
end of the TU. This has been going on for quite some time now.
The problem however is that although the non-emitted referred-to type
was referred to in this current TU, it might no have been *DEFINED* in
this TU. In that case, it should not be emitted in this TU.
Otherwise, the TU where that type is defined in the abixml might
appear different from where it is defined in the initial binary,
leading to self comparison failures down the road.
This patch ensures that a non-emitted referred-to type is always
emitted in the TU it belongs to.
5/ After doing all this, it appears that we were forgetting to emit
some function types that were defined in TUs emitted earlier and yet
were being referred-to later. Looking closer, I realized that we
should just emit function types seen in a given TU, regardless of the
referred-to relation.
The problem with that is that function types are special in libabigail
because there are two situation in which they are created.
Basically, a function type is created by the DWARF DIE
DW_TAG_subroutine_type. This is for instance how pointer to functions
are represented in DWARF, namely, by a DW_TAG_pointer_type that points
to a DW_TAG_subroutine_type. That is represented in the libabigail ir
by an instance of the abigail::ir::function_type type. This is
represented in abixml as a 'function-type' XML element.
But then, libabigail considers that all decls have a type. This
applies obviously for variables or data member. Right. But then,
libabigail considers that a function is also a decl, which has a type.
And the type of a function is a function type, represented by the same
abigail::ir::function_type. A practical difference with the former
situation is that function decls are *NOT* represented in abixml using
a 'function-type' element. Instead a 'function-decl' XML element uses
return type and parameter elements to represent the types involved
with a function decl.
Said otherwise, the former 'function type' concept used to represent
the type of functions in the libabigail IR is artificial. This
artificial-ness was not explicitly expressed in libabigail. This
patch now expresses that artificial-ness for function types. So the
abixml writer now just decide to not emit artificial function types,
and instead, emit all the non-artificial function types instead.
This addresses this last issues by being able to emit all
non-artificial function types defined in a given TU, without having to
bother with the fact that they are referred-to or not.
Together, fixing these 5 problems fixes this reported problem.
The changes to the reference test outputs are adjustments needed
because of the abixml output indeed changes.
* include/abg-ir.h
(environment::use_enum_binary_only_equality): Declare
accessors. (type_or_decl_base::{s,g}et_is_artificial):
Likewise. (decl_base::{s,g}et_is_artificial): Remove
accessors. * src/abg-ir.cc
(environment::priv::use_enum_binary_only_equality): Define new
data member.
(environment::priv::use_enum_binary_only_equality): Define
accessors. (type_or_decl_base::priv::is_artificial_): Define
new data member. It has actually moved here from
decl_base::priv::is_artificial_.
(type_or_decl_base::priv::priv): Initialize it.
(type_or_decl_base::{g,s}et_is_artificial): Define accessors.
(decl_base::is_artificial_): Move this to
type_or_decl_base::is_artificial_.
(maybe_adjust_canonical_type): In a given class of equivalence
of function types, if there is one non-artificial function
type, then the entire class of equivalence is considered
non-artificial; so flag the canonical function type as being
non-artificial. (is_enumerator_present_in_enum): Define new
static function. (equals): Re-arrange the overload for enums
so the order of the enumerators doesn't count in the
comparison. Also, two enums with different numbers of
enumerators can still be equal, with the right redundancy. In
the overload for var_decl, avoid taking into account the names
of data members in the comparison.
(enum_type_decl::enumerator::operator==): In the binary-level
comparison mode, only compare the value of enumerators, not
their name. * src/abg-comparison.cc (compute_diff): In the
overload for enum_type_decl, if the enums compare different
using binary-level comparison, then use source-level
comparison to build the diff-IR. * src/abg-dwarf-reader.cc
(read_context::compare_before_canonicalisation): Compare enums
using binary-level comparison. (add_or_update_class_type): If
we are looking at the definition of an existing declaration
that has been already defined then use the previous
definition, in case we are going to need to update the
definition. Also, update the size only if it's needed.
(build_function_type): By default, consider the newly built
function type as artificial. (build_ir_node_from_die): When
looking at a DW_TAG_subroutine_type DIE, consider the built
function type as non-artificial. * src/abg-reader.cc
(read_context::maybe_check_abixml_canonical_type_stability):
Don't consider declaration-only classes in an ODR context
because they don't have canonical types.
(build_function_decl): Flag the function type of the function
as artificial. (build_class_decl): Make sure to reuse class
types that were already created. * src/abg-writer.cc
(write_translation_unit): Allow emitting empty classes. Make
sure referenced types are emitting in the translation unit
where they belong. Avoid emitting artificial function types.
*
tests/data/test-alt-dwarf-file/rhbz1951526/rhbz1951526-report-0.txt:
New test reference output. *
tests/data/test-alt-dwarf-file/rhbz1951526/usr/bin/gimp-2.10:
New reference test binary input. *
tests/data/test-alt-dwarf-file/rhbz1951526/usr/lib/debug/.dwz/gimp-2.10.22-2.el9.1.aarch64:
Likewise. *
tests/data/test-alt-dwarf-file/rhbz1951526/usr/lib/debug/usr/bin/gimp-2.10-2.10.22-2.el9.1.aarch64.debug:
Likewise. * tests/data/Makefile.am: Add the new test files to
source directory. * tests/test-alt-dwarf-file.cc: Add the new
test inputs to this test harness. *
tests/data/test-abidiff/test-PR18791-report0.txt: Adjust. *
tests/data/test-abidiff/test-enum0-report.txt: Likewise. *
tests/data/test-annotate/libtest23.so.abi: Likewise. *
tests/data/test-annotate/libtest24-drop-fns-2.so.abi:
Likewise. *
tests/data/test-annotate/libtest24-drop-fns.so.abi: Likewise.
* tests/data/test-annotate/test-anonymous-members-0.o.abi:
Likewise. * tests/data/test-annotate/test13-pr18894.so.abi:
Likewise. * tests/data/test-annotate/test14-pr18893.so.abi:
Likewise. * tests/data/test-annotate/test15-pr18892.so.abi:
Likewise. * tests/data/test-annotate/test17-pr19027.so.abi:
Likewise. *
tests/data/test-annotate/test18-pr19037-libvtkRenderingLIC-6.1.so.abi:
Likewise. *
tests/data/test-annotate/test19-pr19023-libtcmalloc_and_profiler.so.abi:
Likewise. *
tests/data/test-annotate/test20-pr19025-libvtkParallelCore-6.1.so.abi:
Likewise. * tests/data/test-annotate/test21-pr19092.so.abi:
Likewise. *
tests/data/test-diff-dwarf-abixml/test0-pr19026-libvtkIOSQL-6.1.so.1.abi:
Likewise. *
tests/data/test-diff-dwarf/PR25058-liblttng-ctl-report-1.txt:
Likewise. * tests/data/test-diff-dwarf/test6-report.txt:
Likewise. *
tests/data/test-diff-filter/test31-pr18535-libstdc++-report-0.txt:
Likewise. *
tests/data/test-diff-filter/test31-pr18535-libstdc++-report-1.txt:
Likewise. * tests/data/test-diff-filter/test8-report.txt:
Likewise. *
tests/data/test-diff-pkg/spice-server-0.12.4-19.el7.x86_64-0.12.8-1.el7.x86_64-report-2.txt:
Likewise. *
tests/data/test-diff-pkg/spice-server-0.12.4-19.el7.x86_64-0.12.8-1.el7.x86_64-report-3.txt:
Likewise. *
tests/data/test-diff-pkg/tbb-4.1-9.20130314.fc22.x86_64--tbb-4.3-3.20141204.fc23.x86_64-report-0.txt:
Likewise. *
tests/data/test-diff-pkg/tbb-4.1-9.20130314.fc22.x86_64--tbb-4.3-3.20141204.fc23.x86_64-report-1.txt:
Likewise. *
tests/data/test-read-dwarf/PR22015-libboost_iostreams.so.abi:
Likewise. *
tests/data/test-read-dwarf/PR22122-libftdc.so.abi: Likewise.
* tests/data/test-read-dwarf/PR25007-sdhci.ko.abi: Likewise.
*
tests/data/test-read-dwarf/PR25042-libgdbm-clang-dwarf5.so.6.0.0.abi:
Likewise. *
tests/data/test-read-dwarf/PR26261/PR26261-exe.abi: Likewise.
* tests/data/test-read-dwarf/libtest23.so.abi: Likewise. *
tests/data/test-read-dwarf/libtest24-drop-fns-2.so.abi:
Likewise. *
tests/data/test-read-dwarf/libtest24-drop-fns.so.abi:
Likewise. *
tests/data/test-read-dwarf/test-libandroid.so.abi: Likewise.
* tests/data/test-read-dwarf/test10-pr18818-gcc.so.abi:
Likewise. * tests/data/test-read-dwarf/test12-pr18844.so.abi:
Likewise. * tests/data/test-read-dwarf/test13-pr18894.so.abi:
Likewise. * tests/data/test-read-dwarf/test14-pr18893.so.abi:
Likewise. * tests/data/test-read-dwarf/test15-pr18892.so.abi:
Likewise. * tests/data/test-read-dwarf/test16-pr18904.so.abi:
Likewise. * tests/data/test-read-dwarf/test17-pr19027.so.abi:
Likewise. *
tests/data/test-read-dwarf/test18-pr19037-libvtkRenderingLIC-6.1.so.abi:
Likewise. *
tests/data/test-read-dwarf/test19-pr19023-libtcmalloc_and_profiler.so.abi:
Likewise. *
tests/data/test-read-dwarf/test20-pr19025-libvtkParallelCore-6.1.so.abi:
Likewise. * tests/data/test-read-dwarf/test21-pr19092.so.abi:
Likewise. *
tests/data/test-read-dwarf/test22-pr19097-libstdc++.so.6.0.17.so.abi:
Likewise. *
tests/data/test-read-dwarf/test9-pr18818-clang.so.abi:
Likewise. *
tests/data/test-read-write/test28-without-std-fns-ref.xml:
Likewise. *
tests/data/test-read-write/test28-without-std-vars-ref.xml:
Likewise.
Dodji Seketeli [Mon, 7 Jun 2021 12:33:53 +0000 (14:33 +0200)]
reader: Canonicalizing a type once is enough
While looking at something else, I noticed that the abixml reader was
trying to canonicalize each type twice. Once should be enough.
* src/abg-reader.cc (build_type): Don't try to canonicalize the
type here because all the sub-routines of this function (which
actually build the type) already try to canonicalize it.
Dodji Seketeli [Mon, 7 Jun 2021 12:29:10 +0000 (14:29 +0200)]
ir: make 'debug(artefact)' support showing enums
While debugging something else, I realized that 'debug(artifact)'
couldn't show the enumerators of an enum. I also realized that we were
not showing the 'declaration-only-ness' of the artefact either. This
patch fixes that.
* src/abg-ir.cc (get_debug_representation): Add support for
showing details for enums. Also show declaration-only-ness for
class or unions.
When debugging self comparison issues, once the abixml file is read
back into memory, I often want to get the type-id of an artifact that
was read from abixml or get the canonical type of an artifact which
type-id is known.
Part of that information is indirectly present in the data member
abigail::reader::reader_context::m_pointer_type_id_map after the
.typeid file is loaded from file into memory. The problem is that the
instance of abigail::reader::reader_context is transient as it's
destroyed quickly after the abixml file is read. We want it to stay
alive longer. So this patch moves that data member into
abigail::environment instead, along with its accessors. The patch
then adds the new member functions
environment::{get_type_id_from_pointer,get_canonical_type_from_type_id}
to get the type-id of an artifact de-serialized from abixml and the
canonical type of an artifact for which we now the type-id string.
* include/abg-ir.h (environment::{get_pointer_type_id_map,
get_type_id_from_pointer, get_canonical_type_from_type_id}):
Declare new member functions.
* src/abg-ir.cc (environment::{get_pointer_type_id_map,
get_type_id_from_pointer, get_canonical_type_from_type_id}):
Define member functions.
(environment::priv::pointer_type_id_map_): Move
this data member here from ...
* src/abg-reader.cc (read_context::m_pointer_type_id_map):
... here.
(read_context::get_pointer_type_id_map): Remove this as it's now
defined in environment::get_pointer_type_id_map.
(read_context::maybe_check_abixml_canonical_type_stability):
Adjust.
(build_type): Likewise.
Dodji Seketeli [Mon, 7 Jun 2021 07:05:05 +0000 (09:05 +0200)]
ir: Enable setting breakpoint on first type inequality
When debugging type canonicalization in
type_base::get_canonical_type_for, I more often than not want to know
why a type compares different to another. Until now, I've been doing
that by stepping in the debugger. I figure a much efficient way of
doing that is to be able to set a breakpoint on the first occurrence
of type inequality.
To do that, I am adding a few macros to use in the 'equals' functions
to return their value: ABG_RETURN(value), ABG_RETURN_REQUAL(l,r) and
ABG_RETURN_FALSE. Those invoke a new function called
'notify_equality_failed' when the result of the comparison is false.
This allows to just set a debugger breakpoint on
'notify_equality_failed' to know when and why the type comparison
fails.
These macros invoke notify_equality_failed only if the
WITH_DEBUG_SELF_COMPARISON macro is defined. Otherwise, they do what
the code was doing previously. Said otherwise, this whole shebang is
enabled only when the code is configured with
--enable-debug-self-comparison.
This patch incurs no functional change.
* src/abg-ir.cc (notify_equality_failed): Define new static
function if WITH_DEBUG_SELF_COMPARISON is defined.
(ABG_RETURN_EQUAL, ABG_RETURN_FALSE, ABG_RETURN): Define new macros.
(try_canonical_compare): Use ABG_RETURN_EQUAL rather than just
returning the result of a comparison.
(equals): In all the overloads, use the new ABG_RETURN* macros,
rather than just returning boolean values.
abixml reader: Fix recursive type definition handling
After that patch, I noticed that qualified and reference types also
need to be able to handle the case where their underlying/pointed-to
type recursively refers to the type being created. Just like typedef
and pointer types in that patch.
This patch thus adjusts build_qualified_type_decl and
build_reference_type_def to support that. It also adjusts the
qualified_type_def and reference_type_def classes to support being
created without underlying/pointed-to type initially.
* include/abg-ir.h (qualified_type_def::qualified_type_def):
Declare a constructor with no underlying type.
(reference_type_def::reference_type_def): Declare a constructor
with no pointed-to type.
(reference_type_def::set_pointed_to_type): Declare new method.
* src/abg-ir.cc (qualified_type_def::priv::priv): Define a
constructor that takes no underlying type.
(qualified_type_def::build_name): Make this work even on
incomplete types with no underlying type. In that case, this
behaves like if the underlying type is "void".
(qualified_type_def::qualified_type_def): Define a constructor
that takes no underlying type.
(qualified_type_def::get_size_in_bits): Make this work on
incomplete types with no underlying type.
(qualified_type_def::set_underlying_type): Adjust to properly
update this type when a new underlying type is set. Particularly,
its name and the lookup maps from the type scope.
(reference_type_def::reference_type_def): Define a constructor
that takes no pointed-to type.
(reference_type_def::set_pointed_to_type): Define new function.
* src/abg-reader.cc (build_qualified_type_decl): Construct the
qualified type early before we try to construct its underlying
type. Associate this incomplete type with the type-id. Then try
to construct the underlying type. During its construction, if
this incomplete qualified type is needed due to recursion then it
can be used, leading to just one qualified type being used as it
should be.
(build_reference_type_def): Likewise for building incomplete
reference type first before its pointed-to type.
Dodji Seketeli [Fri, 21 May 2021 21:55:44 +0000 (23:55 +0200)]
abixml reader: Fix recursive type definition handling
This should fix self comparison bug https://bugzilla.redhat.com/show_bug.cgi?id=1944088
This arose from a self comparison check failing on the library
libgvpr.so.2 from the graphviz-2.44.0-17.el9.aarch64.rpm package.
Now that we have facilities to see what type (instantiated from the
abixml representation of the libgvpr.so library) exactly the
canonicalization process is failing for, I decided to use it ;-)
I extracted the package and its associating debug info into a
directory named 'extract' and ran abidw --debug-abidiff on it:
error: problem detected with type 'typedef Vmalloc_t' from second corpus
error: canonical type for type 'typedef Vmalloc_t' of type-id 'type-id-170' changed from 'd72618' to '14a7448'
error: problem detected with type 'Vmalloc_t*' from second corpus
error: canonical type for type 'Vmalloc_t*' of type-id 'type-id-188' changed from 'd72ba8' to '14a7968'
[...]
This tells me that "typedef Vmalloc_t", created from the abixml
compares different from its originating peer that was created from the
binary directly. The same goes for the pointer type "Vmalloc_t*", etc.
Using the new debugging/logging functionalities from the command line
of the debugger, I could see that in the abixml reader,
build_typedef_decl can fail subtly when the underlying type of the
typedef refers to the typedef itself. In that case, we need to ensure
that the typedef created by build_typedef_decl is the same one that is
used by the underlying type. which is not the case at the moment. At
the moment, the underlying type would create a new typedef beside the
one currently being created by build_typedef_decl. That leads to more
than one typedef in the system to designate "typedef Vmalloc_t". And
that wreaks havoc later down the road.
This patch arranges so that build_typedef_decl creates the typedef
"early" before the underlying type is created. That typedef
temporarily has no underlying type. It's registered as being the
typedef for the type-id string that identifies it in the abixml. And
then the function goes to create the underlying type. This
arrangement ensures that if the underlying type refers to the typedef
being created (via its type-id string), then the typedef that was
created early is effectively re-used. This ensures that a typedef
which recursively refer to itself is properly represented. It's only
when the underlying type is fully created that it's added to the
typedef.
Something similar is done for pointer types, in
build_pointer_type_def.
Note that to do this, the patch adjusts the typedef_decl and
pointer_type_def classes so that they can be created with no
underlying/pointed-to types. The underlying/pointed-to type can thus
be added later.
I believe this patch is the minimal patch necessary to fix this issue.
The graphviz RPM is added to the regression test suite for good
measure.
After visual inspection, I realized that there are other types besides
typedef and pointer types that exhibit the same class of problem even
if they are not involved in this issue on this particular binary. A
subsequent patch is going to address the problem for those types,
namely, qualified and reference types.
* include/abg-ir.h (pointer_type_def::pointer_type_def): Declare a
constructor with no pointed-to type.
(pointer_type_def::set_pointed_to_type): Declare new method.
(typedef_decl::typedef_decl): Declare a constructor with no
underlying type.
* src/abg-ir.cc (pointer_type_def::pointer_type_def): Define a
constructor with no pointed-to type. The pointed-to type can thus
later be set when it becomes available.
(pointer_type_def::set_pointed_to_type): Define new method.
(pointer_type_def::get_qualified_name): Make this work on a
pointer type that (momentarily) has no pointed-to type.
(typedef_decl::typedef_decl): Define a constructor with no
underlying type.
(typedef_decl::get_size_in_bits): Make this work on a typedef that
has (momentarily) no underlying type.
(typedef_decl::set_underlying_type): Update the size and alignment
of the typedef from its new underlying type.
* src/abg-reader.cc (build_pointer_type_def): Construct the
pointer type early /BEFORE/ we even try to construct its
pointed-to type. Associate this incomplete type with the type-id.
Then try to construct the pointed-to type. During the
construction of the pointed-to type, if this pointer is needed
(due to recursion) then the incomplete pointer type can be used,
leading to just one pointer type used (recursively) as it should
be.
(build_typedef_decl): Likewise for building typedef type early
without its underlying type so that it can used by the underlying
type if needed.
* tests/data/test-diff-pkg/graphviz-2.44.0-18.el9.aarch64-self-check-report-0.txt:
New test reference output.
* tests/data/test-diff-pkg/graphviz-2.44.0-18.el9.aarch64.rpm: New
binary test input.
* tests/data/test-diff-pkg/graphviz-debuginfo-2.44.0-18.el9.aarch64.rpm: Likewise.
* tests/data/Makefile.am: Add the new test material above to
source distribution.
* tests/test-diff-pkg.cc (in_out_specs): Add the test inputs above
to this test harness.
Dodji Seketeli [Thu, 20 May 2021 10:56:27 +0000 (12:56 +0200)]
Introduce artificial locations
When an abixml file is "read in" and the resulting in-memory internal
representation is saved back into abixml, the saved result can often
differ from the initial input in a non deterministic manner. That
read-write instability is non-desirable because it generates
unnecessary changes that cloud our ability to build reliable
regression tests, among other things. Also, that unnecessarily
increases the changes to the existing regression test reference
outputs leading to a lot more churn than necessary.
This patch tries to minimize that abixml read-write instability in
preparation of patches that would otherwise cause too much churn in
reference output files of the regression test suite.
The main reason why this read-write instability occurs is that a lot
of type definitions don't have source location.
For instance, all the types that are not user defined fall into that
category. Those types can't be topologically sorted by using their
location as a sorting criteria. Instead, we are currently using the
order in which those location-less types are processed by the reader
as the output (i.e, write time) order. The problem with that approach
is that the processing order can be dependant on the other of which
OTHER TYPES likes class types are processed. And that order can be
changed by patches in the future. That in and of itself shouldn't
change the write order of these types.
For instance, if a class Foo has data members and member functions
whose types are non-user-defined types, then, the order in which those
data members are processed can possibly determine the order in which
those non-user-defined are processed.
This patch thus introduces the concept of artificial location.
A *NON-ARTIFICIAL* location is a source location that was emitted by
the original emitter of the type meta-data. In the case of DWARF type
meta-data, the compiler originally emitted source location. That
location when read is considered non-artificial, or natural, if you
prefer.
In the case of abixml however, an artificial location would be the
source location at which an XML element is encountered.
For instance, consider the abixml file below "path/to/exmaple.abi" below:
At line 3 of that file, the non-user defined type name "bool" is
defined using the XML element "type-decl". Note how that element
lacks the "filepath", "line" and "column" attributes that would collectively
define the source location of that type. So this type "bool" don't
carry any natural location.
The abixml reader can however generate an artificial location for it.
That the filepath of that artificial location would thus be the path
to that ABI corpus, i.e, "path/to/example.abi". The line number would
be 3. The column would be left to zero.
That artificial location will never be explicitly be written down as
an XML attribute as it can always be implicitly retrieved by
construction.
The patch changes the internal representation so that each ABI
artifact of the internal representation can now carry both an
artificial and a natural location.
When two artifacts have an artificial location, then its used to
topologically sort them. The one that is defined topologically
"earlier" obviously comes first.
When two artifacts have a natural location then its used to
topologically sort them.
Otherwise, they are sorted lexicographically.
This makes the output of abilint a lot more read-write stable.
* include/abg-fwd.h (get_artificial_or_natural_location): Declare
new function.
* include/abg-ir.h (location::location): Initialize & copy ...
(location::is_artificial_): ... a new data member.
(location::{g,s}et_is_artificial): New accessors.
(location::{operator=}): Adjust.
(type_or_decl_base::{set,get,has}_artificial_location): Declare
new member functions.
* src/abg-ir.cc (decl_topo_comp::operator()): In the overload for
decl_base*, use artificial location for topological sort in
priority. Otherwise, use natural location. Otherwise, sort
lexicographically.
(type_topo_comp::operator()): In the overload for type_base*, use
lexicographical sort only for types that don't have location at
all.
(type_or_decl_base::priv::artificial_location_): Define new data
member.
(type_or_decl_base::{set,get,has}_artificial_location): Define new
member functions.
(decl_base::priv): Allow a constructor without location. That one
sets no natural location to the artifact.
(decl_base::decl_base): Use decl_base::set_location in the
constructor now.
(decl_base::set_location): Adjust this to support setting a
natural or an artificial location.
(get_debug_representation): Emit debugging log showing the
location of an artifact, using its artificial location in
priority.
(get_natural_or_artificial_location): Define new function.
* src/abg-reader.cc (read_artificial_location)
(maybe_set_artificial_location): Define new static functions.
(read_location): Read artificial location when no natural location
was found.
(build_namespace_decl, build_function_decl, build_type_decl)
(build_qualified_type_decl, build_pointer_type_def)
(build_reference_type_def, build_subrange_type)
(build_array_type_def, build_enum_type_decl, build_typedef_decl)
(build_class_decl, build_union_decl, build_function_tdecl)
(build_class_tdecl, build_type_tparameter)
(build_non_type_tparameter, build_template_tparameter): Read and
set artificial location.
* src/abg-writer.cc (write_location): Don't serialize artificial
locations.
(write_namespace_decl): Topologically sort member declarations
before serializing them.
* tests/data/test-read-write/test28-without-std-fns-ref.xml:
Adjust.
* tests/data/test-read-write/test28-without-std-vars-ref.xml:
Likewise.
* tests/data/test-annotate/libtest23.so.abi: Likewise.
* tests/data/test-annotate/libtest24-drop-fns-2.so.abi: Likewise.
* tests/data/test-annotate/libtest24-drop-fns.so.abi: Likewise.
* tests/data/test-annotate/test0.abi: Likewise.
* tests/data/test-annotate/test13-pr18894.so.abi: Likewise.
* tests/data/test-annotate/test14-pr18893.so.abi: Likewise.
* tests/data/test-annotate/test15-pr18892.so.abi: Likewise.
* tests/data/test-annotate/test17-pr19027.so.abi: Likewise.
* tests/data/test-annotate/test18-pr19037-libvtkRenderingLIC-6.1.so.abi:
Likewise.
* tests/data/test-annotate/test19-pr19023-libtcmalloc_and_profiler.so.abi:
Likewise.
* tests/data/test-annotate/test20-pr19025-libvtkParallelCore-6.1.so.abi:
Likewise.
* tests/data/test-annotate/test21-pr19092.so.abi: Likewise.
* tests/data/test-read-dwarf/PR22015-libboost_iostreams.so.abi:
Likewise.
* tests/data/test-read-dwarf/PR22122-libftdc.so.abi: Likewise.
* tests/data/test-read-dwarf/PR25007-sdhci.ko.abi: Likewise.
* tests/data/test-read-dwarf/PR25042-libgdbm-clang-dwarf5.so.6.0.0.abi:
Likewise.
* tests/data/test-read-dwarf/PR26261/PR26261-exe.abi: Likewise.
* tests/data/test-read-dwarf/libtest23.so.abi: Likewise.
* tests/data/test-read-dwarf/libtest24-drop-fns-2.so.abi: Likewise.
* tests/data/test-read-dwarf/libtest24-drop-fns.so.abi: Likewise.
* tests/data/test-read-dwarf/test-libandroid.so.abi: Likewise.
* tests/data/test-read-dwarf/test-suppressed-alias.o.abi: Likewise.
* tests/data/test-read-dwarf/test0.abi: Likewise.
* tests/data/test-read-dwarf/test0.hash.abi: Likewise.
* tests/data/test-read-dwarf/test10-pr18818-gcc.so.abi: Likewise.
* tests/data/test-read-dwarf/test11-pr18828.so.abi: Likewise.
* tests/data/test-read-dwarf/test12-pr18844.so.abi: Likewise.
* tests/data/test-read-dwarf/test13-pr18894.so.abi: Likewise.
* tests/data/test-read-dwarf/test14-pr18893.so.abi: Likewise.
* tests/data/test-read-dwarf/test15-pr18892.so.abi: Likewise.
* tests/data/test-read-dwarf/test16-pr18904.so.abi: Likewise.
* tests/data/test-read-dwarf/test17-pr19027.so.abi: Likewise.
* tests/data/test-read-dwarf/test18-pr19037-libvtkRenderingLIC-6.1.so.abi:
Likewise.
* tests/data/test-read-dwarf/test19-pr19023-libtcmalloc_and_profiler.so.abi:
Likewise.
* tests/data/test-read-dwarf/test20-pr19025-libvtkParallelCore-6.1.so.abi:
Likewise.
* tests/data/test-read-dwarf/test21-pr19092.so.abi: Likewise.
* tests/data/test-read-dwarf/test22-pr19097-libstdc++.so.6.0.17.so.abi:
Likewise.
* tests/data/test-read-dwarf/test9-pr18818-clang.so.abi: Likewise.
* tests/data/test-read-write/test28-without-std-fns-ref.xml:
Likewise.
* tests/data/test-read-write/test28-without-std-vars-ref.xml:
Likewise.
Dodji Seketeli [Wed, 19 May 2021 16:22:18 +0000 (18:22 +0200)]
Detect abixml canonical type instability during abidw --debug-abidiff
In the debugging mode of self comparison induced by the invocation of
"abidw --debug-abidiff <binary>", it's useful to be able to ensure the
following invariant:
The pointer value of the canonical type of a type T that is
serialized into abixml with the id string "type-id-12" (for
instance) must keep the same canonical type pointer value when
that abixml file is de-serialized back into memory. This is
possible mainly because libabigail stays loaded in memory all the
time during both serialization and de-serialization.
This patch adds support for detecting when that invariant is not
respected.
In other words it detects when the type build from de-serializing the
type which id is "type-id-12" (for instance) has a canonical type
which pointer value is different from the pointer value of the
canonical type (of the type) that was serialized as having the type id
"type-id-12".
This is done in three phases.
The first phase happens in the code of abidw itself; after the abixml
is written on disk, another file called the "typeid file" is written
on disk as well. That later file contains a set of records; each
record associates a "type id string" (like the type IDs that appear in
the abixml file) to the pointer value of the canonical type that
matches that type id string. That file is thus now available for
manual inspection during a later debugger session. This is done by
invoking the new function write_canonical_type_ids.
The second phase appears right before abixml loading time. The typeid
file is read back and the association "type-id string" <-> is stored
in a hash map that is returned by
environment::get_type_id_canonical_type_map(). This is done by
invoking the new function load_canonical_type_ids.
The third phase happens right after the canonicalization (triggered in
the abixml reader) of a type coming from abixml, corresponding to a
given type id. It checks if the pointer value of the canonicalization
type just computed is the same as the one associated to that type id
in the map returned by environment::get_type_id_canonical_type_map.
This is a way of verifying the "stability" of a canonical type during
its serialization and de-serialization to and from abixml and it's
done as part of "abidw --debug-abidiff <binary>".
Just as an example, here is the kind of error output that I am getting
on a real life debugging session on a binary that exhibits self
comparison error:
$ abidw --debug-abidiff -d <some-binary>
error: problem detected with type 'typedef Vmalloc_t' from second corpus
error: canonical type for type 'typedef Vmalloc_t' of type-id 'type-id-179' changed from '1a083e8' to '21369b8'
[...]
$
From this output, I see that the first type for which libabigail
exhibits an instability on the pointer value of the canonical type is
the type 'typedef Vmalloc_t'. In other words, when that type is saved
to abixml, the type we read back is different. This needs further
debugging but at least it pinpoints exactly what type we are seeing
the core issue on first. This is of a tremendous help in the root
cause analysis needed to understand why the self comparison is
failing.
* include/abg-ir.h (environment::get_type_id_canonical_type_map):
Declare new data member.
* src/abg-ir.cc (environment::priv::type_id_canonical_type_map_):
Define new data member.
(environment::get_type_id_canonical_type_map): Define new method.
* include/abg-reader.h (load_canonical_type_ids): Declare new
function.
* src/abg-reader.cc (read_context::m_pointer_type_id_map):
Define new data member.
(read_context::{get_pointer_type_id_map,
maybe_check_abixml_canonical_type_stability}): Define new methods.
(read_context::{maybe_canonicalize_type,
perform_late_type_canonicalizing}): Invoke
maybe_perform_self_comparison_canonical_type_check after
canonicalization to perform canonicalization type stability
checking.
(build_type): Associate the pointer value for the newly built type
with the type id string identifying it in the abixml. Once the
abixml representation is dropped from memory and we are about to
perform type canonicalization, we can still know what the type id
of a given type coming from abixml was; it's thus possible to
verify that the canonical type associated to that type id is the
same as the one stored in the typeid file.
(read_type_id_string): Define new static function.
(load_canonical_type_ids): Define new function.
* include/abg-writer.h (write_canonical_type_ids): Likewise.
* src/abg-writer.cc (write_canonical_type_ids): Define new
function overloads.
* tools/abidw.cc (options::type_id_file_path): New data member.
(load_corpus_and_write_abixml): Write and read back the typeid
file.
Dodji Seketeli [Tue, 18 May 2021 08:29:38 +0000 (10:29 +0200)]
Detect failed self comparison in type canonicalization of abixml
During the self comparison triggered by "abidw --abidiff <binary>",
some comparison errors can happen when canonicalizing types that are
"de-serialized" from the abixml that was serialized from the input
binary.
This patch adds some debugging checks and messaging to emit a message
when a type from the abixml appears to not "match" the original type
from the initial corpus it originated from.
This is the more detailed description:
Let's consider a type T coming from the corpus of the input binary.
That input corpus is serialized into abixml and de-serialized again
into a second corpus that we shall name the abixml corpus. From that
second corpus, let's consider the type T' that is the result of
serializing T into abixml and de-serializing it again. T is said to
be the original type of T'. If T is a canonical type, then T' should
equal T. Otherwise, if T is not a canonical type, its canonical type
should equal the canonical type of T'.
For the sake of simplicity, let's consider that T is a canonical
type. During the canonicalization of T', T' should equal T. Each and
every canonical type coming from the abixml corpus should be equal to its
original type from the binary corpus.
If a T' is different from its original type T, then there is an
"equality problem" between T and T'. In other words, there is a
mismatch between T and T'. We want to be notified of that problem so
that we can debug it further and fix it.
So this patch introduces the option "abidw --debug-abidiff <binary>"
to trigger the "debug self comparison mode". At canonicalization
time, we detect that we are in that debug self comparison mode and
during canonicalization of types from the abixml corpus, it detects
when they compare different from their counterpart from the original
corpus.
This debugging capability can be enabled at configure time with a new
--enable-debug-self-comparison configure option. That option defines
a new WITH_DEBUG_SELF_COMPARISON compile time macro that is used to
conditionally compile the implementation of this debugging feature.
So, one example of this might look like this:
abidw --debug-abidiff bin:
error: problem detected with type 'typedef Vmalloc_t' from second corpus
error: problem detected with type 'Vmalloc_t*' from second corpus
[...]
So that means the "typedef Vmalloc_t" read from the abixml compares
different from its original type where it should not.
So armed with this new insight, I know I need to debug that comparison
in particular to see why it wrongly results in two different types.
* doc/manuals/abidw.rst: Add documentation for the --debug-abidiff
option.
* include/abg-ir.h (environment::{set_self_comparison_debug_input,
get_self_comparison_debug_inputs, self_comparison_debug_is_on}):
Declare new methods.
* configure.ac: Define a new --enable-debug-self-comparison option
that is disabled by default. That option defines a new
WITH_DEBUG_SELF_COMPARISON preprocessor macro.
* src/abg-ir.cc
(environment::priv::{first_self_comparison_corpus_,
second_self_comparison_corpus_, self_comparison_debug_on_}): New
data members. Also, re-indent the data members.
(environment::{set_self_comparison_debug_input,
get_self_comparison_debug_inputs, self_comparison_debug_is_on}):
Define new method.
(type_base::get_canonical_type_for): In the "debug self comparison
mode", if a type coming from the second corpus compares different
from its counterpart coming from the first corpus then log a debug
message.
* src/abg-dwarf-reader.cc (read_debug_info_into_corpus): When
loading the first corpus, if the debug self comparison mode is on,
then save that corpus on the side in the environment.
* src/abg-reader.cc (read_corpus_from_input): When loading the
second corpus, if the debug self comparison mode is on, then save
that corpus on the side in the environment.
* tools/abidw.cc: Include the config.h file for preprocessor
macros defined at configure
(options::debug_abidiff): New data member.
(parse_command_line): Parse the --debug-abidiff option.
(load_corpus_and_write_abixml): Switch the self debug mode on when
the --debug-abidiff option is provided. Use a read_context for
the abixml loading. That is going to be useful for subsequent
patches.
Dodji Seketeli [Mon, 17 May 2021 13:23:02 +0000 (15:23 +0200)]
Add primitives callable from the command line of the debugger
During debugging it can be extremely useful to be able to visualize
the data members of a class type, instance of
abigail::ir::class_decl*. It's actually useful to visualize the
pretty representation (type name and kind) of all types and decls that
inherit abigail::ir::type_or_decl_base, basically.
Today, in the debugger, if we have a variable defined as
"abigail::ir::type_or_decl_base* t", we can type:
$ p t->get_pretty_representation(true, true);
This would display something like:
$ typedef foo_t
However, if 't' is declared as:
"abigail::ir::class_decl* t", then if we type:
(gdb) p t->get_pretty_representation(true, true);
We'll get something like:
class foo_klass
(gdb)
So we get the kind and the name of the ABI artifact; but in case of a
class, we don't get the details of its data members.
This patch introduces a function named "debug" which, would be invoked
on the 't' above like this:
int tm_sec; // uses canonical type '@0x538270'
int tm_min; // uses canonical type '@0x538270'
int tm_hour; // uses canonical type '@0x538270'
int tm_mday; // uses canonical type '@0x538270'
int tm_mon; // uses canonical type '@0x538270'
int tm_year; // uses canonical type '@0x538270'
int tm_wday; // uses canonical type '@0x538270'
int tm_yday; // uses canonical type '@0x538270'
int tm_isdst; // uses canonical type '@0x538270'
long int tm_gmtoff; // uses canonical type '@0x461200'
const char* tm_zone; // uses canonical type '@0x544528'
};
(gdb)
This gives much more information to understand what 't' designates.
The patch also provides functions to retrieve one data member from a
given type that happens to designate a class type. For instance:
(gdb) p debug(get_data_member(t, "tm_sec")._M_ptr)
int tm::tm_sec
(gdb)
The patch also provides a new 'debug_equals' function that allow us to
easily perform an artifact comparison from the command line of the
debugger, as well as methods to the environment type to poke at the
canonical types available in the environment.
These new debugging primitives already proved priceless while
debugging issues that are fixed by subsequent patches to come.
* include/abg-fwd.h (get_debug_representation, get_data_member)
(debug, debug_equals): Declare new functions.
* include/abg-ir.h (environment{get_canonical_types,
get_canonical_type}): Declare new member functions.
* src/abg-ir.cc (environment::{get_canonical_types,
get_canonical_type}): Define new member functions.
(get_debug_representation, get_data_member)
(debug, debug_equals): Define new functions.
Dodji Seketeli [Mon, 24 May 2021 14:29:36 +0000 (16:29 +0200)]
Peel array types when peeling pointers from a type
In peel_typedef_pointer_or_reference_type, we want to peel typedefs
and pointer types (in general) from a given type. We need to peel
array types as well, as those are conceptually a pointer-like type as
well.
This patch does that.
* src/abg-ir.cc (peel_typedef_pointer_or_reference_type): In the
overloads for type_base_sptr and type_base*, peel array type off
as well.
Dodji Seketeli [Mon, 24 May 2021 14:23:18 +0000 (16:23 +0200)]
Fix DWARF type DIE canonicalization
While looking at something else, I noticed that the DWARF type DIE
canonicalization code wasn't taking the type of array elements into
account when comparing arrays.
This patch fixes that.
* src/abg-dwarf-reader.cc (compare_dies): When comparing array
type DIEs, take into account the type of the elements of the
arrays.
reader: Use xmlFirstElementChild/xmlNextElementSibling to iterate over children elements
Use xmlFirstElementChild/xmlNextElementSibling to iterate over element
children nodes rather than doing it by hand in the various for loops.
* src/abg-reader.cc (walk_xml_node_to_map_type_ids)
(read_translation_unit, read_translation_unit_from_input)
(read_symbol_db_from_input, build_needed)
(read_elf_needed_from_input, read_corpus_group_from_input)
(build_namespace_decl, build_elf_symbol_db, build_function_decl)
(build_function_type, build_array_type_def, build_enum_type_decl)
(build_class_decl, build_union_decl, build_function_tdecl)
(build_class_tdecl, build_type_composition)
(build_template_tparameter): Use
xmlFirstElementChild/xmlNextElementSibling rather than poking at
xmlNode::children and looping over xmlNode::next by hand.
reader: Use xmlFirstElementChild and xmlNextElementSibling rather than xml::advance_to_next_sibling_element
The xml::advance_to_next_sibling_element is redundant with the
xmlNextElementSibling API of libxml. Similarly, xmlFirstElementChild
is redundant with using xml::advance_to_next_sibling_element on the
xmlNode::children data member. Let's use the libxml API instead.
* include/abg-libxml-utils.h (advance_to_next_sibling_element):
Remove the declaration of this function.
* src/abg-libxml-utils.cc (go_to_next_sibling_element_or_stay)
(advance_to_next_sibling_element): Remove definitions of these functions.
* src/abg-reader.cc (read_translation_unit_from_input)
(read_elf_needed_from_input, read_corpus_group_from_input): Use xmlNextElementSibling instead
of xml::advance_to_next_sibling_element.
(read_corpus_from_input): Likewise. Also, use
xmlFirstElementChild instead of
xml::advance_to_next_sibling_element on the xmlNode::children data
member.
(read_corpus_group_from_input): use xmlFirstElementChild instead
of xml::advance_to_next_sibling_element on the xmlNode::children
data member.
reader: Handle 'abi-corpus' element being possibly empty
This problem was reported at https://sourceware.org/bugzilla/show_bug.cgi?id=27616.
The abixml reader wrongly assumes that the 'abi-corpus' element is
always non-empty. Note that until now, the only emitter of abixml
consumed in practice was abg-writer.cc and it only emits non-empty
'abi-corpus' elements. So the issue wasn't exposed.
So, the reader assumes that an 'abi-corpus' element has at least a
text node.
For instance, consider this minimal input file named test-v0.abi:
if (!node->children) // <---- we get out early here and we
return nil; // forget about the properties of
// the current empty corpus element node
So, at its core, fixing the issue at hand involves avoiding the early
return there.
But then, it turns out that's not enough.
In the current setting, the different abixml processing entry points
are designed to be used in a semi "streaming" mode.
So for instance, read_translation_unit_from_input can be invoked
repeatedly to "stream in" the next translation unit at each
invocation.
Alternatively, the lower level xmlTextReaderNext can be used to
iterate over XML node until we reach the translation unit XML element
we are interested in. At that point xmlTextReaderExpand can be used
to expand the XML node, then we let the context know that this is
the current node of the corpus that needs to be processed, using
read_context::get_corpus_node. Once we've done that,
read_translation_unit_from_input can be called to process that
particular corpus node. Note that the corpus node at hand, that needs
to be processed will be retrieved by read_context::get_corpus_node.
These two modes of operation are also available for
read_corpus_from_input, read_symbol_db_from_input,
read_elf_needed_from_input etc.
Today, these functions all assume that the current node returned by
read_context::get_corpus_node is the node /before/ the node of the
corpus to be processed. So they all start looking at the /next sibling/
of the node returned by read_context::get_corpus_node. So the code
was implicitly assuming that read_context::get_corpus_node was
pointing to a text node that was before the node of the corpus that we
want to process.
This is wrong. read_context::get_corpus_node should just return the
current node of the corpus that needs to be processed and voila.
And so read_context::set_corpus_node should be used to set the current
node of the corpus to the current element node that needs to be processed.
That's the spirit of the change done by this patch.
As its name suggests, the existing
xml::advance_to_next_sibling_element is used to skip non element xml
nodes (including text nodes) and move to the next element node to
process, which is set to the context using
read_context::set_corpus_node.
Then the actual processing functions like read_corpus_from_input get
the node to process, using read_context::get_corpus_node and process
it rather than processing the sibling node that comes after it.
The other changes are either to prevent related crashes that I noticed
while doing various tests, update the abilint tool used to read and
debug abixml input files and add better documentation.
* src/abg-reader.cc (read_context::get_corpus_node): Add comment
to this member function.
(read_translation_unit_from_input, read_symbol_db_from_input)
(read_elf_needed_from_input): Start processing the current node of
the corpus that needs to be processed rather than its next
sibling. Once the processing is done, set the new "current node
of the corpus to be processed" properly by skipping to the next
element node to be processed.
(read_corpus_from_input): Don't get out early when the
'abi-corpus' element is empty. If, however, it has children node,
skip to the first child element and flag it -- using
read_context::set_corpus_node -- as being the element node to be
processed by the processing facilities of the reader. If we are
in a mode where we called xmlTextReaderExpand ourselves to get the
node to process, then it means we need to free that node
indirectly by calling xmlTextReaderNext. In that case, that node
should not be flagged by read_context::set_corpus_node. Add more
comments.
* src/abg-corpus.cc (corpus::is_empty): Do not crash when no
symtab is around.
* src/abg-libxml-utils.cc (go_to_next_sibling_element_or_stay):
Fix typo in comment.
(advance_to_next_sibling_element): Don't crash when given a nil
node.
* tests/data/test-abidiff/test-PR27616-squished-v0.abi: Add new
test input.
* tests/data/test-abidiff/test-PR27616-squished-v1.abi: Likewise.
* tests/data/test-abidiff/test-PR27616-v0.xml: Likewise.
* tests/data/test-abidiff/test-PR27616-v1.xml: Likewise.
* tests/data/Makefile.am: Add the new test inputs above to source
distribution.
* tests/test-abidiff.cc (specs): Add the new tests inputs above to
this harness.
* tools/abilint.cc (main): Support writing corpus groups.
dwarf-reader: properly set artificial-ness in opaque types
get_opaque_version_of_type forgets to set the "is-artificial" property
according to the initial type the opaque type is derived from. This
can lead to some instability in the abixml output.
Fixed thus.
* src/abg-dwarf-reader.cc (get_opaque_version_of_type): Propagate
the artificial-ness of the original type here.
* tests/data/test-read-dwarf/PR27700/test-PR27700.abi: Adjust.
dwarf-reader: Canonicalize opaque enums and classes
This issue was reported in bug https://sourceware.org/bugzilla/show_bug.cgi?id=27700.
When we construct an opaque type (triggered by the use of
--drop-private-types along with the --headers-dir option on abidw, for
instance) with get_opaque_version_of_type we forget to canonicalize
the resulting type.
Later, at abixml emitting time (for instance)
hash_as_canonical_type_or_constant would rightfully abort because the
type wasn't canonicalized. We want all types (okay, modulo one
exception) in the system to be canonicalized.
This patch fixes the problem by canonicalizing opaque types.
* src/abg-dwarf-reader.cc (build_ir_node_from_die): Canonicalize
opaque enums and classes.
* tests/data/test-read-dwarf/PR27700/include-dir/priv.h: New test
header file.
* tests/data/test-read-dwarf/PR27700/include-dir/pub.h: Likewise
* tests/data/test-read-dwarf/PR27700/pub-incdir/inc.h: Likewise.
* tests/data/test-read-dwarf/PR27700/test-PR27700.o: New binary
input file.
* tests/data/test-read-dwarf/PR27700/test-PR27700.abi: Reference
abi file of the binary above.
* tests/data/test-read-dwarf/PR27700/test-PR27700.c: Source file
of the binary above.
* tests/data/Makefile.am: Add the test material above to source
distribution.
* tests/test-read-dwarf.cc (InOutSpec::in_public_headers_path):
Add new data member.
(in_out_specs): Adjust to reflect the new data member in the
InOutSpec type. Add a new test input.
(set_suppressions_from_headers): Define new static function.
(test_task::perform): Use the content of the new
InOutSpec::in_public_headers_path to construct and add
"--headers-dir <headers-dir> --drop-private-types" to the options
of the abidw program run.
symtab: Add support for MODVERSIONS (CRC checksums)
The Linux Kernel has a mechanism (MODVERSIONS) to checksum symbols based
on their type. In a way similar to what libabigail does, but different.
The CRC values for symbols can be extracted from the symtab either by
following the __kcrctab_<symbol> entry or by using the __crc_<symbol>
value directly.
This patch adds support for extracting those CRC values and storing them
as a property of elf_symbol. Subsequently, 'crc' gets emitted as an
attribute of 'elf-symbol' in the XML representation.
CRC comparisons are also added to the abidiff machinery such that if
both representations of a comparison contain a CRC value, they will be
compared and if any of the values is unset (i.e. == 0), equality is
assumed. Differences will be reported in the format that the Kernel
presents them e.g. via Module.symvers. It is likely, but not necessary,
that a CRC difference comes along with an ABI difference reported by
libabigail. Not everything that leads to a change of the CRC value an
ABI breakage in the libabigail sense.
In case a function or variable symbol changes in a harmless way, we
would previously also consider a CRC change harmless (or more precise:
not harmful). Explicitly testing for CRC changes when analyzing whether
diff nodes need to be considered harmful, allows to still classify them
harmful.
Also add some test cases to ensure reading CRC values from kernel
binaries works as expected. The empty-report files have been
consolidated to one file: empty-report.txt. That also clarifies the
expected outcome for the affected tests.
* include/abg-ir.h (elf_symbol::elf_symbol): Add CRC parameter.
(elf_symbol::create): Likewise.
(elf_symbol::get_crc): New member method.
(elf_symbol::set_crc): New member method.
* src/abg-comp-filter.cc (crc_changed): New function.
(categorize_harmful_diff_node): Also test for CRC changes.
* src/abg-ir.cc (elf_symbol::priv::crc_): New data member.
* src/abg-ir.cc (elf_symbol::priv::priv): Add CRC parameter.
(elf_symbol::elf_symbol): Likewise.
(elf_symbol::create): Likewise.
(elf_symbol::textually_equals): Add CRC support.
(elf_symbol::get_crc): New member method.
(elf_symbol::set_crc): New member method.
* src/abg-reader.cc (build_elf_symbol): Add CRC support.
* src/abg-reporter-priv.cc (maybe_report_diff_for_symbol): Likewise.
* src/abg-symtab-reader.cc (symtab::load): Likewise.
* src/abg-writer.cc (write_elf_symbol): Likewise.
* tests/data/Makefile.am: Add new test data files.
* tests/data/test-abidiff-exit/test-crc-report.txt: New test file.
* tests/data/test-abidiff-exit/test-crc-v0.abi: Likewise.
* tests/data/test-abidiff-exit/test-crc-v1.abi: Likewise.
* tests/data/test-abidiff/empty-report.txt: New file.
* tests/data/test-abidiff/test-PR18166-libtirpc.so.report.txt: Deleted.
* tests/data/test-abidiff/test-PR24552-report0.txt: Deleted.
* tests/data/test-abidiff/test-crc-0.xml: New test file.
* tests/data/test-abidiff/test-crc-1.xml: Likewise.
* tests/data/test-abidiff/test-crc-2.xml: Likewise.
* tests/data/test-abidiff/test-crc-report.txt: Likewise.
* tests/data/test-abidiff/test-empty-corpus-report.txt: Deleted.
* tests/data/test-read-dwarf/PR25007-sdhci.ko.abi: Add CRC values.
* tests/data/test-read-write/test-crc.xml: New test data file.
* tests/data/test-symtab/kernel-modversions/Makefile: New test source.
* tests/data/test-symtab/kernel-modversions/one_of_each.c: Likewise.
* tests/data/test-symtab/kernel-modversions/one_of_each.ko: Likewise.
* tests/test-abidiff-exit.cc: Add new test case.
* tests/test-abidiff.cc: Add new test case.
* tests/test-read-write.cc: Likewise.
* tests/test-symtab.cc (Symtab::KernelSymtabsWithCRC): New test case.
The function write_elf_symbol_reference iterates through aliases,
looking for an unsuppressed alias to use. The existing code went wrong
in the case when aliases are present. In the case of all symbols
suppressed, it would also have selected the last alias, rather than
the first, if the data structure invariants had matched the code's
expectations.
The main symbol is always distinguished. When aliases are absent, the
sole symbol's next pointer is null, but when aliases are present, they
form a circular list. This makes traversal of aliases a bit tricky.
The code now picks the first symbol from the following:
- the given symbol, if unsuppressed
- the main symbol, if unsuppressed
- the unsuppressed aliases in the remainder of the alias chain
- the main symbol (suppressed)
The given symbol, which need not be the same as the main symbol, will
be tested twice, if suppressed, but addressing this would make the
code even more elaborate and fragile.
The last case may be unreachable if symbol suppression triggers when
all aliases are suppressed.
I left this change stand-alone for easier review and to credit Giuliano for his
work on it, though it fixes a previous commit.
* src/abg-writer.cc (write_elf_symbol_reference): Check main
symbol and aliases with more care.
dwarf-reader/writer: consider aliases when dealing with suppressions
When the symbol of a decl is suppressed and it happens to be the main
symbol of a group of aliased symbols where another symbol is not
suppressed, the dwarf reader discards the decl from the internal
representation altogether upon reading and thus the writer will not be
able to connect that decl to the non-suppressed aliased elf symbol.
In order to address this, ensure we are not suppressing decls for which
an alias is not suppressed. We therefore keep the decl in the IR when at
least one its underlying aliased symbols is non-suppressed.
Likewise, when the abg-writer is having to attach an elf-symbol-id to
the decl, instead of omitting the symbol altogether, rather make use of
the property of aliases and connect the dwarf information to an alias
instead. This way the function dwarf information stays connected to the
elf symbol that we want to track.
When reading from XML with a symbol whitelist that leads to suppression
of aliased symbols, abidiff would hit an assertion and crash when
looking up the aliased symbol due to it being suppressed. In the new
symtab reader we can still suppress a symbol without removing it from
the lookup. Make use of that property to fix this bug.
A test has been added for this as well.
* src/abg-dwarf-reader.cc(function_is_suppressed): Do not suppress
a function for which there is an alias that is not suppressed.
(variable_is_suppressed): Likewise for variables.
* src/abg-reader.cc (build_elf_symbol): Improve handling of
suppressed aliased symbols when reading from XML.
* src/abg-symtab-reader.cc (load): Likewise.
* src/abg-writer.cc(write_elf_symbol_reference): Fall back to
any aliased symbol if the main symbol is suppressed.
* tests/data/Makefile.am: Add new test files.
* tests/data/test-abidiff-exit/test-missing-alias-report.txt: New test file.
* tests/data/test-abidiff-exit/test-missing-alias.abi: Likewise.
* tests/data/test-abidiff-exit/test-missing-alias.suppr: Likewise.
* tests/test-abidiff-exit.cc: Add support for whitelists and add
new testcase.
* tests/data/test-read-dwarf/test-suppressed-alias.c: New test file.
* tests/data/test-read-dwarf/test-suppressed-alias.o: Likewise.
* tests/data/test-read-dwarf/test-suppressed-alias.o.abi: Likewise.
* tests/data/test-read-dwarf/test-suppressed-alias.suppr: Likewise.
* tests/data/test-read-dwarf/test3-alias-1.so.hash.abi: Likewise.
* tests/data/test-read-dwarf/test3-alias-1.suppr: Likewise.
* tests/data/test-read-dwarf/test3-alias-2.so.hash.abi: Likewise.
* tests/data/test-read-dwarf/test3-alias-2.suppr: Likewise.
* tests/data/test-read-dwarf/test3-alias-3.so.hash.abi: Likewise.
* tests/data/test-read-dwarf/test3-alias-3.suppr: Likewise.
* tests/data/test-read-dwarf/test3-alias-4.so.hash.abi: Likewise.
* tests/data/test-read-dwarf/test3-alias-4.suppr: Likewise.
* tests/test-read-dwarf.cc: Add new test cases.
symtab/dwarf-reader: allow hinting of main symbols for aliases
In case of aliased symbols, the "main symbol" cannot be deduced from
the symtab as this solely contains a name->addr mapping and aliases
are represented by multiple names resolving to the same address.
Therefore the main symbol can only be picked rather randomly and
unpredictable.
Unlike DWARF, which contains a single symbol entry for only the main
symbol. Hence we can (late) detect the main symbol. Exploiting that
property allows to correct the addr->symbol lookup in the symtab to
return the correct main symbol and it also allows to correct the aliased
symbols to maintain the correct main symbol.
This patch adds the `update_main_symbol` functionality to `elf_symbol`
to update the main symbol by name (ELF symbols need unique names) and
adds `update_main_symbol` to `symtab` that makes use of said new method.
When we discover a main symbol during DWARF reading, we instruct the
symtab to update the mapping.
This creates consistent representations across different builds of the
same binary with the same ABI by pinning down the main symbol to the
defined one. Knowing the main symbol also helps to keep the correct
dwarf information in the representation in the presence of symbol
suppressions. A later patch will address that.
Some test cases in tests/data need adjustment and they have all been
verified to be valid changes.
- main symbol changed for various elf symbols
- test-annotate/test15-pr18892.so.abi
- test-annotate/test19-pr19023-libtcmalloc_and_profiler.so.abi
- test-annotate/test3.so.abi
- test-read-dwarf/test15-pr18892.so.abi
- test-read-dwarf/test19-pr19023-libtcmalloc_and_profiler.so.abi
- test-read-dwarf/test3.so.abi
- test-read-dwarf/test3.so.hash.abi
- due to main symbol changes, the symbol diff needs to be corrected
- test-diff-dwarf/test12-report.txt
- test-diff-pkg/tbb-4.1-9.20130314.fc22.x86_64--tbb-4.3-3.20141204.fc23.x86_64-report-0.txt
- test-diff-pkg/tbb-4.1-9.20130314.fc22.x86_64--tbb-4.3-3.20141204.fc23.x86_64-report-1.txt
- the test scenario needed adjustments as the main symbol changed
- test-diff-suppr/test23-alias-filter-4.suppr
- test-diff-suppr/test23-alias-filter-report-0.txt
- test-diff-suppr/test23-alias-filter-report-2.txt
As usual, the complete changelog follows.
* include/abg-ir.h (elf_symbol::update_main_symbol): New method.
* include/abg-symtab-reader.h (symtab::update_main_symbol): New method.
* src/abg-dwarf-reader.cc
(build_var_decl): Hint symtab about main symbol discovered in DWARF.
(build_function_decl): Likewise.
* src/abg-ir.cc (elf_symbol::get_main_symbol): Lock the weak_ptr
on access in both overloads.
(update_main_symbol): New method to allow updating the main symbol.
* src/abg-symtab-reader.cc (symtab::update_main_symbol): New method.
* data/Makefile.am: Add new test data files.
* tests/data/test-annotate/test15-pr18892.so.abi: Updated test file.
* tests/data/test-annotate/test19-pr19023-libtcmalloc_and_profiler.so.abi: Likewise.
* tests/data/test-annotate/test2.so.abi: Likewise.
* tests/data/test-annotate/test3.so.abi: Likewise.
* tests/data/test-diff-dwarf/test12-report.txt: Likewise.
* tests/data/test-diff-dwarf/test42-PR21296-clanggcc-report0.txt: Likewise.
* tests/data/test-diff-filter/test31-pr18535-libstdc++-report-0.txt: Likewise.
* tests/data/test-diff-filter/test31-pr18535-libstdc++-report-0.txt: Likewise.
* tests/data/test-diff-pkg/tbb-4.1-9.20130314.fc22.x86_64--tbb-4.3-3.20141204.fc23.x86_64-report-0.txt: Likewise.
* tests/data/test-diff-pkg/tbb-4.1-9.20130314.fc22.x86_64--tbb-4.3-3.20141204.fc23.x86_64-report-1.txt: Likewise.
* tests/data/test-diff-suppr/test23-alias-filter-4.suppr: Likewise.
* tests/data/test-diff-suppr/test23-alias-filter-report-0.txt: Likewise.
* tests/data/test-diff-suppr/test23-alias-filter-report-2.txt: Likewise.
* tests/data/test-read-dwarf/PR22015-libboost_iostreams.so.abi: Likewise.
* tests/data/test-read-dwarf/PR22122-libftdc.so.abi: Likewise.
* tests/data/test-read-dwarf/test10-pr18818-gcc.so.abi: Likewise.
* tests/data/test-read-dwarf/test11-pr18828.so.abi: Likewise.
* tests/data/test-read-dwarf/test12-pr18844.so.abi: Likewise.
* tests/data/test-read-dwarf/test15-pr18892.so.abi: Likewise.
* tests/data/test-read-dwarf/test16-pr18904.so.abi: Likewise.
* tests/data/test-read-dwarf/test19-pr19023-libtcmalloc_and_profiler.so.abi: Likewise.
* tests/data/test-read-dwarf/test2.so.abi: Likewise.
* tests/data/test-read-dwarf/test2.so.hash.abi: Likewise.
* tests/data/test-read-dwarf/test22-pr19097-libstdc++.so.6.0.17.so.abi: Likewise.
* tests/data/test-read-dwarf/test3.so.abi: Likewise.
* tests/data/test-read-dwarf/test3.so.hash.abi: Likewise.
* tests/data/test-symtab/basic/aliases.c: New test source file.
* tests/data/test-symtab/basic/aliases.so: Likewise.
* tests/test-symtab.cc (Symtab::AliasedFunctionSymbols): New test case.
(Symtab::AliasedVariableSymbols): Likewise.
Extend the test functionality in test-symtab to allow processing of KMI
whitelists and add additional test cases for whitelist handling.
* tests/data/Makefile.am: add new test files
* tests/data/test-symtab/basic/one_function_one_variable_all.whitelist: New test file,
* tests/data/test-symtab/basic/one_function_one_variable_function.whitelist: Likewise.
* tests/data/test-symtab/basic/one_function_one_variable_irrelevant.whitelist: Likewise.
* tests/data/test-symtab/basic/one_function_one_variable_variable.whitelist: Likewise.
* tests/test-symtab.cc (read_corpus): Add support for whitelists.
(assert_symbol_count): Likewise.
(Symtab::SymtabWithWhitelist): New testcase.
dwarf reader: drop (now) unused code related to symbol table reading
The introduction of the new symtab reader incorporated much of the
existing functionality. Now that the most code parts are migrated to the
new symtab reader, we can safely remove the old code paths.
Ignoring the symbol table is not a thing anymore. The new symtab reader
does read the symtab unconditionally for consistency reasons. Hence also
remove all functionality around conditional symtab reading.
With the prework in previous commits, we are now able to drop the
public symbols maps in corpus::priv and replace them by private members
with access through getters. The getters use the new symtab
implementation to generate the maps on the fly. Setters are not required
anymore and are removed. Also remove redundant getters.
We could also remove the getters for the symbol maps and the local
caching variable and leave it all to lookup_symbol, but this is left for
a later change.
* include/abg-corpus.h (corpus::set_fun_symbol_map): Remove
method declaration.
(corpus::set_undefined_fun_symbol_map): Likewise.
(corpus::set_var_symbol_map): Likewise.
(corpus::set_undefined_var_symbol_map): Likewise.
(corpus::get_fun_symbol_map_sptr): Likewise.
(corpus::get_undefined_fun_symbol_map_sptr): Likewise.
(corpus::get_var_symbol_map_sptr): Likewise.
(corpus::get_undefined_var_symbol_map_sptr): Likewise.
* src/abg-corpus-priv.h (corpus::priv::var_symbol_map): make
private and mutable
(corpus::priv::undefined_var_symbol_map): Likewise.
(corpus::priv::fun_symbol_map): Likewise.
(corpus::priv::undefined_fun_symbol_map): Likewise.
(corpus::priv::get_fun_symbol_map): New method declaration.
(corpus::priv::get_undefined_fun_symbol_map): Likewise.
(corpus::priv::get_var_symbol_map): Likewise.
(corpus::priv::get_undefined_var_symbol_map): Likewise.
* src/abg-corpus.cc (corpus::priv::get_fun_symbol_map): New
method implementation.
(corpus::priv::get_undefined_fun_symbol_map): Likewise.
(corpus::priv::get_var_symbol_map): Likewise.
(corpus::priv::get_undefined_var_symbol_map): Likewise.
(corpus::is_empty): depend on symtab only.
(corpus::set_fun_symbol_map): Remove method.
(corpus::set_undefined_fun_symbol_map): Likewise.
(corpus::set_var_symbol_map): Likewise.
(corpus::set_undefined_var_symbol_map): Likewise.
(corpus::get_fun_symbol_map_sptr): Likewise.
(corpus::get_undefined_fun_symbol_map_sptr): Likewise.
(corpus::get_var_symbol_map_sptr): Likewise.
(corpus::get_undefined_var_symbol_map_sptr): Likewise.
(corpus::get_fun_symbol_map): Use corpus::priv proxy method.
(corpus::get_undefined_fun_symbol_map): Likewise.
(corpus::get_var_symbol_map): Likewise.
(corpus::get_undefined_var_symbol_map): Likewise.
* src/abg-dwarf-reader.cc (read_debug_info_into_corpus): Do not
set corpus symbol maps anymore.
* src/abg-reader.cc (read_corpus_from_input): Likewise.
* tests/test-symtab.cc (assert_symbol_count): Do not access the
corpus symbol maps through sptr anymore.
* tests/data/test-read-dwarf/PR25007-sdhci.ko.abi: Adjust
expected test output.
symtab_reader: add support for ppc64 ELFv1 binaries
When loading the symtab from an ppc64 binary, also keep track of the
function entry addresses as a key for the symbol lookup. That
accommodates the differences in DWARF pointing to the function entry
address while the symbol table points to the function pointer.
The implementation is mostly copied and adopted from abg-dwarf-reader's
read_context to add this functionality also to the new symtab reader.
* src/abg-symtab-reader.cc (symtab::lookup_symbol): fall back to
lookup the address in entry_addr_symbol_map_.
(symtab::load): update the function entry address map for
ppc64 targets.
(symtab::update_function_entry_address_symbol_map): New
function implementation.
* src/abg-symtab-reader.h
(symtab::entry_addr_symbol_map_): New data member.
(symtab::update_function_entry_address_symbol_map): New
function declaration.
This migrates more helpers to abg-elf-helpers:
lookup_ppc64_elf_fn_entry_point_address with dependencies
read_uint64_from_array_of_bytes
read_int_from_array_of_bytes
address_is_in_opd_section with dependency
address_is_in_section
read_context::find_opd_section and read_context::opd_section_ are obsolete.
Switch kernel stuff over to new symtab and drop unused code
Now that the new symtab implementation is capable of reading the
ksymtab, we can switch over the implementation to gather information
from there and delete all now-obsolete code.
dwarf-reader: read_context: use new symtab in *_symbols_is_exported
Testing whether a symbol is exported can be simplified using the new
symtab implementation. The same holds true for whether a symbol is
exported via ksymtab in case of linux kernel binaries. So, do that.
* src/abg-dwarf-reader.cc (function_symbol_is_exported): Use new
symtab implementation.
(variable_symbol_is_exported): Likewise.
abg-reader: avoid using the (var|function)_symbol_map
Instead of using the corpus var|function_symbol_maps for symbol lookups,
let build_elf_symbol_from_reference use the symtab::lookup_symbol
method. That leads to a shorter implementation and we can drop the
indicative parameter.
* src/abg-reader.cc (build_elf_symbol_from_reference): drop
last parameter indicating the lookup type and use corpus
symtab for the lookup
(build_function_decl): Adjust accordingly.
(build_var_decl): Likewise.
corpus: make get_unreferenced_(function|variable)_symbols use the new symtab
Make the corresponding members an implementation detail of corpus::priv.
They get computed based on the new symtab whenever they are accessed
first with an atomic instantiation. That simplifies the implementation
and homogenizes the access to functions and variables. Sorting does not
need to be done as the symtab already gives a guarantee for that.
* src/abg-corpus-priv.h (corpus::priv::unrefed_var_symbols): make
private, mutable and optional.
(corpus::unrefed_fun_symbols): Likewise.
(corpus::priv::get_unreferenced_function_symbols): New method declaration.
(corpus::priv::get_unreferenced_variable_symbols): Likewise.
* src/abg-corpus.cc
(corpus::priv::build_unreferenced_symbols_tables): Delete method.
(corpus::priv::get_unreferenced_function_symbols): New method implementation.
(corpus::priv::get_unreferenced_variable_symbols): Likewise.
(corpus::get_unreferenced_function_symbols): Proxy call to corpus::priv.
(corpus::get_unreferenced_variable_symbols): Likewise.
corpus: make get_(undefined_)?_(var|fun)_symbols use the new symtab
Make the corresponding members an implementation detail of corpus::priv.
They get computed based on the new symtab whenever they are accessed
first with an atomic instantiation. That simplifies the implementation
and homogenizes the access to functions and variables. Sorting does not
need to be done as the symtab already gives a guarantee for that.
Due to improved alias detection in the new symtab reader, ensure we only
write symbol aliases to ksymtab symbols if having a ksymtab main symbol.
Test data needed to be adjusted as the new symtab reader is stricter in
regards to symbols listed in ksymtab. I.e. init_module is not an
exported symbol in the ksymtab of a kernel module.
* src/abg-corpus-priv.h (corpus::priv::sorted_var_symbols): make
private, mutable and optional.
(corpus::sorted_undefined_var_symbols): Likewise.
(corpus::sorted_fun_symbols): Likewise.
(corpus::sorted_undefined_fun_symbols): Likewise.
(corpus::priv::get_sorted_fun_symbols): New method declaration.
(corpus::priv::get_sorted_undefined_fun_symbols): Likewise.
(corpus::priv::get_sorted_var_symbols): Likewise.
(corpus::priv::get_sorted_undefined_var_symbols): Likewise.
* src/abg-corpus.cc
(corpus::elf_symbol_comp_functor): Delete struct.
(corpus::priv::get_sorted_fun_symbols): New method implementation.
(corpus::priv::get_sorted_undefined_fun_symbols): Likewise.
(corpus::priv::get_sorted_var_symbols): Likewise.
(corpus::priv::get_sorted_undefined_var_symbols): Likewise.
(corpus::get_sorted_fun_symbols): Proxy call to corpus::priv.
(corpus::get_sorted_undefined_fun_symbols): Likewise.
(corpus::get_sorted_var_symbols): Likewise.
(corpus::get_sorted_undefined_var_symbols): Likewise.
* src/abg-writer.cc (write_elf_symbol_aliases): When emitting
aliases for a kernel symbol, ensure to only emit exported,
public aliases.
* tests/data/test-read-dwarf/PR25007-sdhci.ko.abi: update test
data.
Integrate new symtab reader into corpus and read_context
While reading the corpus in the read_context, also load the new type
symtab object side-by-side and set it accordingly in the resulting
corpus. This is still side by side and passive code that gets active in
the following changes. This is applicable for the dwarf reader as well
as for the reader that consumes XML.
* include/abg-corpus.h (corpus::set_symtab): New method declaration.
(corpus::get_symtab): New method declaration.
* include/abg-fwd.h (symtab_reader::symtab_sptr): New forward
declaration.
* src/abg-corpus-priv.h (corpus::priv::symtab_): New data member.
* src/abg-corpus.cc
(corpus::set_symtab): Likewise.
(corpus::get_symtab): Likewise.
* src/abg-dwarf-reader.cc (read_context::symtab_): New data member.
(read_context::initialize): reset symtab_ as well
(read_context::symtab): new method that loads a symtab on
first access and returns it.
(read_debug_info_into_corpus): also set the new symtab object
on the current corpus.
(read_corpus_from_elf): Also determine (i.e. load) the new
symtab object and contribute to the load status.
* src/abg-reader.cc (read_corpus_from_input): also set the new
type symtab when reading from xml.
* tests/test-symtab.cc: Add test assertions.
Refactor ELF symbol table reading by adding a new symtab reader
Based on existing functionality, implement the reading of ELF symbol
tables as a separate component. This reduces the complexity of
abg-dwarf-reader's read_context by separating and delegating the
functionality. This also allows dedicated testing.
The new namespace symtab_reader contains a couple of new components that
work loosely coupled together. Together they allow for a consistent view
on a symbol table. With filter criteria those views can be restricted,
iterated and consistent lookup maps can be built on top of them. While
this implementation tries to address some shortcomings of the previous
model, it still provides the high level interfaces to the symbol table
contents through sorted iterating and name/address mapped access.
symtab_reader::symtab
While the other classes in the same namespace are merely helpers, this
is the main implementation of symtab reading and storage.
Symtab objects are factory created to ensure a consistent construction
and valid invariants. Thus a symtab will be loaded by either passing
an ELF handle (when reading from binary) or by passing a set of
function/variable symbol maps (when reading from XML).
When constructed they are considered const and are not writable
anymore. As such, all public methods are const.
The load reuses the existing implementation for loading symtab
sections, but since the new implementation does not distinguish
between functions and variables, the code could be simplified. The
support for ppc64 function entry addresses has been deferred to a
later commit.
Linux Kernel symbol tables are now directly loaded by name when
encountering symbols prefixed with the __ksymtab_ as per convention.
This has been tricky in the past due to various different binary
layouts (relocations, position relative relocations, symbol
namespaces, CFI indirections, differences between vmlinux and kernel
modules). Thus the new implementation is much simpler and is less
vulnerable to future ksymtab changes. As we are also not looking up
the Kernel symbols by addresses, we could resolve shortcomings with
symbol aliasing: Previously a symbol and its alias were
indistinguishable as they are having the same symbol address. We could
not identify the one that is actually exported via ksymtab.
One major architectural difference of this implementation is that we
do not early discard suppressed symbols. While we keep them out of the
vector of exported symbols, we still make them available for lookup.
That helps addressing issues when looking up a symbol by address (e.g.
from the ksymtab read implementation) that is suppressed. That would
fail in the existing implementation.
Still, we intend to only instantiate each symbol once and pass around
shared_ptr instances to refer to it from the vector as well as from
the lookup maps.
For reading, there are two access paths that serve the existing
patterns:
1) lookup_symbol: either via a name or an address
2) filtered iteration with begin(), end()
The former is used for direct access with a clue in hand (like a name
or an address), the latter is used for iteration (e.g. when emitting
the XML).
symtab_reader::symtab_iterator
The symtab_iterator is an STL compatible iterator that is returned
from begin() and end() of the symtab. It allows usual forward iterator
operations and can optionally take a filter predicate to skip non
matching elements.
symtab_reader::symtab_filter
The symtab_filter serves as a predicate for the symtab_iterator by
providing a matches(const elf_symbol_sptr&) function. The predicate
is built by ANDing together several conditions on attributes a symbol
can have. The filter conditions are implemented in terms of
std::optional<bool> members to allow a tristate: "needs to have the
condition set", "must not have it set" and "don't care".
symtab_reader::filtered_symtab
The filtered_symtab is a convenience zero cost abstraction that allows
prepopulating the symtab_filter (call it a capture) such that begin()
and end() are now accessible without the need to pass the filter
again. Argumentless begin() and end() are a requirement for range-for
loops and other STL based algorithms.
Dodji Seketeli [Wed, 31 Mar 2021 12:56:52 +0000 (14:56 +0200)]
Bug 27598 - abidiff mishandles union member functions
abidiff segfaults when a union member function is involved in the
comparison. This patch fixes that.
* src/abg-default-reporter.cc (default_reporter::report): Assume
the parent type of the method can be either a class or a union.
* tests/data/test-diff-filter/test-PR27598-report-0.txt: New
reference output for the test.
* tests/data/test-diff-filter/test-PR27598-v{0,1}.cc: New source
code for the input binaries below.
* tests/data/test-diff-filter/test-PR27598-v{0,1}.o: New input
test binaries.
* tests/data/Makefile.am: Add the new test material to source
distribution.
* tests/test-diff-filter.cc (in_out_specs): Add the test inputs
above to this test harness.
Dodji Seketeli [Thu, 25 Mar 2021 10:45:57 +0000 (11:45 +0100)]
Bug 27569 - abidiff misses a function parameter addition
Adding or deleting function parameters are changes that have not yet
been categorized in the comparison engine. As a result, those changes
can be considered harmless and thus be filtered out. Oops.
This patch categorizes function addition and removal as
FN_PARM_ADD_REMOVE_CHANGE_CATEGORY, which is a new category being
introduced. Changes in this category are considered harmful and are
thus always reported by default.
* include/abg-comparison.h (enum diff_category): Add a new
FN_PARM_ADD_REMOVE_CHANGE_CATEGORY enumerator and adjust the
following enumerator values. Update the EVERYTHING_CATEGORY
accordingly.
(function_type_diff::{sorted_deleted_parms, sorted_added_parms}):
Add new member functions.
* src/abg-comparison.cc
(function_type_diff::{sorted_deleted_parms, sorted_added_parms}):
Define new accessors.
(get_default_harmful_categories_bitmap):
Consider changes of the category
FN_PARM_ADD_REMOVE_CHANGE_CATEGORY as harmful.
(operator<<(ostream& o, diff_category c)): Support the new
FN_PARM_ADD_REMOVE_CHANGE_CATEGORY while serializing a category
bitmap.
* src/abg-comp-filter.cc
(has_added_or_removed_function_parameters): Define new static
function.
(categorize_harmful_diff_node): Categorize diff nodes representing
function parameter addition or removal as
FN_PARM_ADD_REMOVE_CHANGE_CATEGORY.
* tests/data/test-diff-filter/test-PR27569-report-0.txt: New test
reference output.
* tests/data/test-diff-filter/test-PR27569-v{0,1}.abi: New test inputs.
* tests/data/Makefile.am: Add the new test inputs to the source
distribution.
* tests/test-diff-filter.cc (in_out_specs): Add the new test
inputs to this test harness.
* tests/data/test-diff-filter/test31-pr18535-libstdc++-report-0.txt:
Adjust.
* tests/data/test-diff-filter/test31-pr18535-libstdc++-report-1.txt:
Likewise.
* tests/data/test-diff-pkg/libsigc++-2.0-0c2a_2.4.0-1_amd64--libsigc++-2.0-0v5_2.4.1-1ubuntu2_amd64-report-0.txt:
Likewise.
* tests/data/test-diff-pkg/tbb-4.1-9.20130314.fc22.x86_64--tbb-4.3-3.20141204.fc23.x86_64-report-0.txt:
Likewise.
* tests/data/test-diff-pkg/tbb-4.1-9.20130314.fc22.x86_64--tbb-4.3-3.20141204.fc23.x86_64-report-1.txt:
Likewise.
The optional zip archive feature was broken when the concept of
environment was introduced by commit b2e5366d3 back in 2015. Since it
wouldn't even compile and nobody noticed, we are fairly sure nobody
uses that feature. Therefore, we decided to remove it rather than fix
it.
* configure.ac: remove --enable-zip-archive option and logic
associated with it.
* include/abg-libzip-utils.h: Remove.
* src/abg-libzip-utils.cc: Likewise.
* include/Makefile.am: remove reference to include/abg-libzip-utils.h
that no longer exists.
* src/Makefile.am: remove reference to src/abg-libzip-utils.cc that no
longer exists.
* relicensing-scripts/file-licenses.orig.txt: remove references to
files that no longer exist.
* relicensing-scripts/files-with-lgplv3.txt: remove references to
files that no longer exist.
* tests/test-write-read-archive.cc: Remove because it tests code
that no longer exists.
* tests/Makefile.am: remove reference to tests that no longer exist.
* include/abg-reader.h: remove conditionally compiled code for zip
archives.
* include/abg-tools-utils.h: remove conditionally compiled code for
zip archives.
* src/abg-corpus.cc: remove conditionally compiled code for zip
archives.
* src/abg-reader.cc: remove conditionally compiled code for zip
archives.
* src/abg-tools-utils.cc: remove conditionally compiled code for zip
archives.
* src/abg-writer.cc: remove conditionally compiled code for zip
archives.
* tools/abidiff.cc: remove conditionally compiled code for zip
archives.
* tools/abilint.cc: remove conditionally compiled code for zip
archives.
* tools/abiar.c: Remove.
* tools/Makefile.am: remove references to abiar a utility that no
longer has a reason for existing.
Signed-off-by: Ben Woodard <woodard@redhat.com> Signed-off-by: Dodji Seketeli <dodji@redhat.com>
Dodji Seketeli [Thu, 18 Mar 2021 10:58:44 +0000 (11:58 +0100)]
dwarf-reader: Support more DWARF-5 type DIEs
When analyzing DWARF-5 some binaries, is_type_tag chokes on the new
DWARF-5 type DIEs it doesn't know about.
This patch teaches it about them.
* src/abg-dwarf-reader.cc (is_type_tag): Support
DW_TAG_coarray_type, DW_TAG_atomic_type and DW_TAG_immutable_type.
* tests/data/test-diff-pkg/elfutils-debuginfo-0.183-1.el9.x86_64.rpm:
Add new binary test input.
* tests/data/test-diff-pkg/elfutils-libs-debuginfo-0.183-1.el9.x86_64.rpm: Likewise.
* tests/data/test-diff-pkg/elfutils-libs-0.183-1.el9.x86_64.rpm: Likewise.
* tests/data/test-diff-pkg/elfutils-libs-debuginfo-0.183-1.el9.x86_64-self-check-report-0.txt:
Add new reference test output.
* tests/test-diff-pkg.cc (in_out_specs): Add the test inputs above
to the harness.
Bug 27552 - libabigail needs to interpret ARM32 symbol addresses specially
The previous commit omitted any code commentary. This adds a link to
the relevant reference. The code is shortened slightly as well.
* src/abg-dwarf-reader.cc
(read_context::load_symbol_maps_from_symtab_section): Add
descriptive comment to ARM32 address handling; shorten
the assignment using &=.
dwarf-reader split: create abg-symtab-reader.{h,cc} and test case
abg-symtab-reader.{h,cc} shall contain the refactored symtab reader.
Create the stub files, an empty unit test and hook everything up in the
make system.
* src/abg-symtab-reader.h: New header file.
* src/abg-symtab-reader.cc: New source file.
* src/Makefile.am: Add new source files.
* tests/Makefile.am: Add new test case runtestsymtabreader.
* tests/test-symtab-reader.cc: New test source file.
Bug 27552 - libabigail needs to interpret ARM32 symbol addresses specially
The ARM32 ELF specification specifies that bit 0 of an ELF function
address is a flag specifying whether the instructions are Thumb or
ARM. So clear this bit before using the addresses for symbol mapping.
* src/abg-dwarf-reader.cc
(read_context::load_symbol_maps_from_symtab_section): Clear
bit zero of ARM32 function addresses.
* src/abg-elf-helpers.cc (architecture_is_arm32): Add new
function.
* src/abg-elf-helpers.h (architecture_is_arm32): Likewise.
* tests/data/test-read-dwarf/test-libandroid.so.abi: Update.
In the context of libabigail and a single library run (when reading from
dwarf or from xml), a symbol is either suppressed or it is not. While
one could argue that this is a property of the read_context, the
read_context might not be around anymore when the symbol still is.
Hence, persist the 'is_suppressed' state along with the symbol itself.
* include/abg-ir.h (elf_symbol::elf_symbol): Add is_suppressed
parameter.
(elf_symbol::create): Likewise.
(elf_symbol::is_suppressed): New getter declaration.
(elf_symbol::set_is_suppressed): New setter declaration.
* src/abg-ir.cc (elf_symbol::priv::priv): Add is_suppressed
parameter.
(elf_symbol::priv::is_suppressed_): New field.
(elf_symbol::elf_symbol): Add is_suppressed parameter.
(elf_symbol::create): Likewise.
(elf_symbol::is_suppressed): New getter implementation.
(elf_symbol::set_is_suppressed): New setter implementation.
Being exported through a ksymtab (in case of Linux Kernel binaries) is
actually a property of the Elf symbol itself and we can therefore track
it along with the symbol that we collect from symtab. While tracking is
currently done by keeping separate symbol lists and maps for symtab and
ksymtab symbols, they can be consolidated having a property to indicate
whether this symbol also appeared as a ksymtab entry.
Hence, and for future changes in this area, add this property and update
all references. The flag is false initially unless otherwise specified.
* include/abg-ir.h (elf_symbol::elf_symbol): Add is_in_ksymtab
parameter.
(elf_symbol::create): Likewise.
(elf_symbol::is_in_ksymtab): New getter declaration.
(elf_symbol::set_is_in_ksymtab): New setter declaration.
* src/abg-ir.cc (elf_symbol::priv::priv): Add is_in_ksymtab
parameter.
(elf_symbol::priv::is_in_ksymtab_): New field.
(elf_symbol::elf_symbol): Add is_in_ksymtab parameter.
(elf_symbol::create): Likewise.
(elf_symbol::is_in_ksymtab): New getter implementation.
(elf_symbol::set_is_in_ksymtab): New setter implementation.
abg-cxx-compat: add simplified version of std::optional
In the absence (but desire) of std::optional<T>, add a simplified
version of it to abg_compat:: in case we are compiling with a pre-C++17
standard. Otherwise use std::optional from <optional> directly.
This is being used by a later patch and serves as a prerequisite.
It only serves the purpose of being a compatibility implementation and
does not claim to be complete at all. Just enough for the project's
needs.
* include/abg-cxx-compat.h (abg_compat::optional): Add new class.
* tests/tests-cxx-compat.cc: Add new test cases.
Fix declaratons of conditionally defined functions
Functions relating to zip archives are declared but are never compiled
when --enable-zip-archive=no, the default.
This makes sure that they are not declared when they won't be defined
due to conditional compilation.
* include/abg-reader.h (read_corpus_from_file): Guard the
declaration of these overloads with #ifdef WITH_ZIP_ARCHIVE.
* tools/abilint.cc: Guard the use of
abigail::xml_reader::read_corpus_from_file with #ifdef
WITH_ZIP_ARCHIVE.
Signed-off-by: Ben Woodard <woodard@redhat.com> Signed-off-by: Dodji Seketeli <dodji@redhat.com>
Dodji Seketeli [Mon, 8 Mar 2021 12:04:13 +0000 (13:04 +0100)]
Revert "Fix declaratons of conditionally defined functions"
I forgot to edit the commit message of this commit to make it comply
with the rules in https://sourceware.org/git/?p=libabigail.git;a=blob;f=COMMIT-LOG-GUIDELINES.
Dodji Seketeli [Thu, 25 Feb 2021 11:44:13 +0000 (12:44 +0100)]
dwarf-reader: Keep stable order when de-duplicating class definitions
During de-duplication of class definition while resolving decl-only
classes to their definition, the order in which classes of the same
name are compared is not always the same. That results in an
instability of the particular class being kept. This can have an
impact when some classes have member types because member types are
not meaningful during comparison; so in the end that can lead to
spurious order instability during ABIXML serialization.
* src/abg-dwarf-reader.cc
(read_context::resolve_declaration_only_classes): Compare the
classes that have the same name across several TU, always in the
same order.
* tests/data/test-annotate/test15-pr18892.so.abi: Adjust.
* tests/data/test-read-dwarf/test15-pr18892.so.abi: Likewise.
Dodji Seketeli [Tue, 23 Feb 2021 16:00:19 +0000 (17:00 +0100)]
tests: Update to catch.hpp v2.13.4 and fix #2178
This patch is about fixing a compilation error that we are seeing on
Fedora Rawhide with glibc 2.33.9000, which makes MINSIGSTKSZ not be a
constant value anymore. Thus, the sigStackSize variable that is
declared constexpr cannot use that MINSIGSTKSZ as initializer anymore.
So as suggested in the issue
https://github.com/catchorg/Catch2/issues/2178 filed against
'catchorg' for this purpose, I am hardwiring the initialization value
of sigStackSize for now.
* tests/lib/catch.hpp: Update to v2.13.4 and initialize
sigStackSize to 32768 for now, as suggested by
https://github.com/catchorg/Catch2/issues/2178.
The definition of those two types can be slightly different and yet be
equivalent.
For instance, the data member of a struct S might be defined once as
having a type which is a typedef Foo of, say, "long int" and that
struct S might be defined again elsewhere with a data member of type
typedef Bar of "long int" as well.
With the current code, because Foo and Bar have different names, they
are are going to compare different; so the two struct S are doing to
compare different even though they are equivalent.
Down the road, this is likely going to imply that the several 'struct
S' that are declaration-only will not be resolved as being
declarations of the definition of "struct S", because they is more
than one such definition, and those definitions are different.
This is going to lead to spurious (and hard to debug) type differences
that are going to be detected and reported by libabigail later down
the road.
This patch addresses the problem by not taking typedef names into
account when comparing typedefs before canonicalization. That allows
the comparison of classes and structs that happens during the
resolution of declaration-only classes to correctly deduce their
equivalence even in cases like the one exposed above. It thus reduces
the amount of declaration-only classes that are unnecessarily
considered to be different from the definition they ought to equal.
* include/abg-ir.h (maybe_compare_as_member_decls): Declare new
function. Make it a friend of class decl_base.
* src/abg-dwarf-reader.cc (maybe_canonicalize_type): Don't early
canonicalize typedefs because they can be "part" of a type that is
not yet completed, especially considering that class declaration
resolution is part of type building, stricto sensu.
* src/abg-ir.cc (maybe_compare_as_member_decls): Factorize this
out of ...
(equals): ... the overload for decl_base. Use it in the overload
for typedef_decl.
* tests/data/test-diff-pkg/nmap-7.70-5.el8_testjcc.x86_64-self-check-report-0.txt:
New test reference output.
* tests/data/test-diff-pkg/nmap-7.70-5.el8_testjcc.x86_64.rpm: New
binary input.
* tests/data/test-diff-pkg/nmap-debuginfo-7.70-5.el8_testjcc.x86_64.rpm: Likewise.
* tests/data/Makefile.am: Add these new testing material to source
distribution.
* tests/test-diff-pkg.cc (in_out_specs): Add the new test input to
the harness.
* tests/data/test-diff-pkg/spice-server-0.12.4-19.el7.x86_64-0.12.8-1.el7.x86_64-report-0.txt:
Adjust.
* tests/data/test-diff-suppr/test39-opaque-type-report-0.txt:
Adjust.
Dodji Seketeli [Wed, 10 Feb 2021 17:43:52 +0000 (18:43 +0100)]
Use generic internal type name to canonicalize anonymous enums
This is from the problem report in Red Hat bugzilla
https://bugzilla.redhat.com/show_bug.cgi?id=1924624
"comparing 'libpinyin.so.13.0.0' to itself wrongly yielded result"
During the canonicalization of an anonymous enum, the algorithm uses
its internal pretty representation to limit the number of types to
compare it to. That internal pretty representation is based on its
type name.
For anonymous types, the type name is not unique; it's constructed for
internal purposes that are different from the purpose of
canonicalization. So using that in the pretty representation might
negatively impact the accuracy of the canonicalization; it might make
it so that two anonymous in the same namespace types might wrongly be
considered canonically different.
To fix that, this change makes the internal pretty representation of
anonymous enum types essentially be "enum
<namespace-name>::__anonymous_enum__".
This is on part with what is done for unions and classes in commit: 005ab5c9 Use flat representation to canonicalize anonymous classes and unions
* src/abg-ir.cc (has_generic_anonymous_internal_type_name) :
Define new static function.
(get_generic_anonymous_internal_type_name): Use it here.
(decl_base::get_pretty_representation): For internal purposes,
build an anonymous name that is stable.
* tests/data/test-annotate/test21-pr19092.so.abi: Adjust.
* tests/data/test-diff-pkg/PR24690/PR24690-report-0.txt: Adjust.
* tests/data/test-diff-pkg/nss-3.23.0-1.0.fc23.x86_64-report-0.txt: Adjust.
* tests/data/test-read-dwarf/PR22122-libftdc.so.abi: Adjust.
* tests/data/test-read-dwarf/test-libandroid.so.abi: Adjust.
* tests/data/test-read-dwarf/test21-pr19092.so.abi: Adjust.
Dodji Seketeli [Mon, 8 Feb 2021 11:03:19 +0000 (12:03 +0100)]
dwarf-reader: Use DW_FORM_line_strp only if it's present
* configure.ac: Define if HAS_DW_FORM_line_strp if the
DW_FORM_line_strp enumerator is present.
* src/abg-dwarf-reader.cc (form_is_DW_FORM_line_strp): Define new
static function.
(compare_dies_string_attribute_value): Use it.
Dodji Seketeli [Thu, 4 Feb 2021 08:26:05 +0000 (09:26 +0100)]
Bug 27267 - Better support for opaque enum types
Upon a request to build the IR for a opaque enum type,
get_opaque_version_of_type wrongly returns a nil type even though the
code is in there to construct an opaque variant of the enum.
This patch cleans up that code to let it build an opaque enum type.
It also ensures the opaque enum type is properly added to its lexical
scope to ensure proper life cycle management.
* src/abg-dwarf-reader.cc (get_opaque_version_of_type): Do not
quit early for enum types, because the code that comes a bit later
can handle enums. Add the newly built enum to its scope for
proper life cycle management.
* tests/data/test-diff-suppr/PR27267/include-dir-v{0,1}/include.h: New
include files for the input test library.
* tests/data/test-diff-suppr/PR27267/libtestpr27267-v{0,1}.so: New
input test library.
* tests/data/test-diff-suppr/PR27267/report-1.txt: New reference
output for the comparison.
* tests/data/test-diff-suppr/PR27267/v{0,1}.c: Source code for the
new input test library.
* tests/data/Makefile.am: Add the new test material above to
source distribution.
* tests/test-diff-suppr.cc (in_out_specs): Add the new test input
above to the test harness.
Dodji Seketeli [Wed, 3 Feb 2021 12:47:46 +0000 (13:47 +0100)]
Bug 27331 - Data member offset change not considered local
The comparison code fails to consider that a data member which offset
changed (and which type didn't change) constitutes a local change of
the enclosing class type.
Fixed thus.
* src/abg-ir.cc (equals): In the overload of class_or_union, when
a data member changes without having its type change, then
consider the data change as being local.
* tests/data/test-diff-pkg/spice-server-0.12.4-19.el7.x86_64-0.12.8-1.el7.x86_64-report-3.txt:
Adjust.
* tests/data/test-diff-filter/test-PR27331-report-0.txt: New
reference output.
* tests/data/test-diff-filter/test-PR27331-v{0,1}.c: New test
source files.
* tests/data/test-diff-filter/test-PR27331-v{0,1}.o: New test
binary inputs.
* tests/data/Makefile.am: Add these new test material to source
distribution.
* tests/test-diff-filter.cc (in_out_specs): Add the tests above to
the harness.
Dodji Seketeli [Tue, 2 Feb 2021 11:35:54 +0000 (12:35 +0100)]
Bug 27165 - Better support multi-language binaries
In some binaries, a DIE named I originating from a compilation unit
written in the C++ language can be an implementation of a DIE named S
(linked through the DW_AT_specification attribute on I) originating
from a compilation unit written in, say, the C language.
In that case, when are we looking at I and try to get the scope of S
(which the DWARF reader considers as the logical scope of I)
get_scope_for_die needs to look at the DW_AT_language attribute
carried by the compilation unit DIE of S. At the moment, we wrongly
look at the DW_AT_language carried by the compilation unit DIE of I
and we deduce the wrong language (C++ instead of C).
This patch fixes that.
* src/abg-dwarf-reader.cc (get_scope_for_die): Get the language of
the DIE from the compilation unit of the DIE itself.
* tests/data/test-types-stability/PR27165-libzmq.so.5.2.3: New
test input.
* tests/data/test-types-stability/PR27165-libzmq.so.5.2.3.debug:
Debug information for the new test input.
* tests/data/Makefile.am: Add the test inputs above to the source
distribution.
* tests/test-types-stability.cc (elf_paths): Add the new test
inputs to this test harness.
Dodji Seketeli [Thu, 22 Oct 2020 14:04:08 +0000 (16:04 +0200)]
Bump ABIXML format version to 2.0
After fixing the interpretation of the DW_AT_bit_offset attribute for
offset of bit field data members, serialized abixml might now be
incompatible with versions of Libabigail that use the previous
interpretation.
That means that comparing an abixml file generated with previous
versions of Libabigail against a corpus resulting from an analysis
performed with the current version of libabigail might yield spurious
changes due to the differences in the way we now interpret the
DW_AT_bit_offset.
Hence, this patch bumps the version of abixml files emitted from now
on to "2.0". This version is deemed incompatible with the previous
"1.0" version.
Subsequently, an abixml file of the "1.0" format cannot be compared
against an abixml file of the "2.0" format, or against a binary
analyzed with a current version of Libabigail.
It's thus advised that abixml files of the "1.0" format version should
be re-generated with a current version of Libabigail, bumping their
format version number to the new "2.0".
* include/abg-corpus.h (corpus::init_format_version): Declare new
private method.
(corpus::set_environment): Make this non-const.
(corpus::{get,set}_format_{major,minor}_version_number): Declare
new accessors.
* src/abg-corpus.cc (corpus::init_format_version): Define new
method.
(corpus::set_environment): By default, initialize the format
version number of the corpus to the one supported by Libabigail.
(corpus::{get,set}_format_{major,minor}_version_number): Define
new accessors.
* include/abg-ir.h: Include abg-config.h to use the
abigail::config.
(environment::get_config): Declare new accessor.
* src/abg-ir.cc (environment::priv::config_): Add new data member.
(environment::get_config): Define new accessor.
* src/abg-config.cc (config::config): Bump the format
version number to "2.0".
* src/abg-corpus-priv.h
(corpus::priv::format_{major,minor}_version_number_): Add new data members.
* src/abg-reader.cc (handle_version_attribute): Define new static
function.
(read_corpus_from_input, read_corpus_group_from_input): Use it to
read the value of the "version" attribute and set the format
version number of the corpus and corpus group accordingly.
* src/abg-writer.cc (write_context::m_config): Remove the config
object because we can now get it from the environment.
(write_context::get_config): Get the config object from the
environment.
(write_translation_unit): Do not emit the version attribute on the
translation unit element anymore.
(write_version_info): Define static function.
(write_corpus, write_corpus_group): Use it to emit version
attribute on both the corpus and corpus group elements.
* tools/abidiff.cc
(emit_incomptatible_format_version_error_message): Define new
static function.
(main): Ensure that corpora and corpus groups being compared have
the same major version number.
* tests/update-test-output.py: Adjust syntax for python3.
* tests/data/test-annotate/libtest23.so.abi: Adjust.
* tests/data/test-annotate/libtest24-drop-fns-2.so.abi: Likewise.
* tests/data/test-annotate/libtest24-drop-fns.so.abi: Likewise.
* tests/data/test-annotate/test-anonymous-members-0.o.abi:
Likewise.
* tests/data/test-annotate/test0.abi: Likewise.
* tests/data/test-annotate/test1.abi: Likewise.
* tests/data/test-annotate/test13-pr18894.so.abi: Likewise.
* tests/data/test-annotate/test14-pr18893.so.abi: Likewise.
* tests/data/test-annotate/test15-pr18892.so.abi: Likewise.
* tests/data/test-annotate/test17-pr19027.so.abi: Likewise.
* tests/data/test-annotate/test18-pr19037-libvtkRenderingLIC-6.1.so.abi:
Likewise.
* tests/data/test-annotate/test19-pr19023-libtcmalloc_and_profiler.so.abi:
Likewise.
* tests/data/test-annotate/test2.so.abi: Likewise.
* tests/data/test-annotate/test20-pr19025-libvtkParallelCore-6.1.so.abi:
Likewise.
* tests/data/test-annotate/test21-pr19092.so.abi: Likewise.
* tests/data/test-annotate/test3.so.abi: Likewise.
* tests/data/test-annotate/test4.so.abi: Likewise.
* tests/data/test-annotate/test5.o.abi: Likewise.
* tests/data/test-annotate/test6.so.abi: Likewise.
* tests/data/test-annotate/test7.so.abi: Likewise.
* tests/data/test-annotate/test8-qualified-this-pointer.so.abi:
Likewise.
* tests/data/test-diff-dwarf-abixml/PR25409-librte_bus_dpaa.so.20.0.abi:
Likewise.
* tests/data/test-diff-dwarf-abixml/test0-pr19026-libvtkIOSQL-6.1.so.1-report-0.txt:
Likewise.
* tests/data/test-diff-dwarf-abixml/test0-pr19026-libvtkIOSQL-6.1.so.1.abi:
Likewise.
* tests/data/test-diff-suppr/libtest48-soname-abixml-v0.so.abi:
Likewise.
* tests/data/test-diff-suppr/libtest48-soname-abixml-v1.so.abi:
Likewise.
* tests/data/test-read-dwarf/PR22015-libboost_iostreams.so.abi:
Likewise.
* tests/data/test-read-dwarf/PR22122-libftdc.so.abi: Likewise.
* tests/data/test-read-dwarf/PR24378-fn-is-not-scope.abi:
Likewise.
* tests/data/test-read-dwarf/PR25007-sdhci.ko.abi: Likewise.
* tests/data/test-read-dwarf/PR25042-libgdbm-clang-dwarf5.so.6.0.0.abi:
Likewise.
* tests/data/test-read-dwarf/PR26261/PR26261-exe.abi: Likewise.
* tests/data/test-read-dwarf/libtest23.so.abi: Likewise.
* tests/data/test-read-dwarf/libtest24-drop-fns-2.so.abi:
Likewise.
* tests/data/test-read-dwarf/libtest24-drop-fns.so.abi: Likewise.
* tests/data/test-read-dwarf/test-PR26568-1.o.abi: Likewise.
* tests/data/test-read-dwarf/test-PR26568-2.o.abi: Likewise.
* tests/data/test-read-dwarf/test0.abi: Likewise.
* tests/data/test-read-dwarf/test0.hash.abi: Likewise.
* tests/data/test-read-dwarf/test1.abi: Likewise.
* tests/data/test-read-dwarf/test1.hash.abi: Likewise.
* tests/data/test-read-dwarf/test10-pr18818-gcc.so.abi: Likewise.
* tests/data/test-read-dwarf/test11-pr18828.so.abi: Likewise.
* tests/data/test-read-dwarf/test12-pr18844.so.abi: Likewise.
* tests/data/test-read-dwarf/test13-pr18894.so.abi: Likewise.
* tests/data/test-read-dwarf/test14-pr18893.so.abi: Likewise.
* tests/data/test-read-dwarf/test15-pr18892.so.abi: Likewise.
* tests/data/test-read-dwarf/test16-pr18904.so.abi: Likewise.
* tests/data/test-read-dwarf/test17-pr19027.so.abi: Likewise.
* tests/data/test-read-dwarf/test18-pr19037-libvtkRenderingLIC-6.1.so.abi:
Likewise.
* tests/data/test-read-dwarf/test19-pr19023-libtcmalloc_and_profiler.so.abi:
Likewise.
* tests/data/test-read-dwarf/test2.so.abi: Likewise.
* tests/data/test-read-dwarf/test2.so.hash.abi: Likewise.
* tests/data/test-read-dwarf/test20-pr19025-libvtkParallelCore-6.1.so.abi:
Likewise.
* tests/data/test-read-dwarf/test21-pr19092.so.abi: Likewise.
* tests/data/test-read-dwarf/test22-pr19097-libstdc++.so.6.0.17.so.abi:
Likewise.
* tests/data/test-read-dwarf/test3.so.abi: Likewise.
* tests/data/test-read-dwarf/test3.so.hash.abi: Likewise.
* tests/data/test-read-dwarf/test4.so.abi: Likewise.
* tests/data/test-read-dwarf/test4.so.hash.abi: Likewise.
* tests/data/test-read-dwarf/test5.o.abi: Likewise.
* tests/data/test-read-dwarf/test5.o.hash.abi: Likewise.
* tests/data/test-read-dwarf/test6.so.abi: Likewise.
* tests/data/test-read-dwarf/test6.so.hash.abi: Likewise.
* tests/data/test-read-dwarf/test7.so.abi: Likewise.
* tests/data/test-read-dwarf/test7.so.hash.abi: Likewise.
* tests/data/test-read-dwarf/test8-qualified-this-pointer.so.abi:
Likewise.
* tests/data/test-read-dwarf/test8-qualified-this-pointer.so.hash.abi:
Likewise.
* tests/data/test-read-dwarf/test9-pr18818-clang.so.abi: Likewise.
* tests/data/test-read-write/test0.xml: Likewise.
* tests/data/test-read-write/test1.xml: Likewise.
* tests/data/test-read-write/test10.xml: Likewise.
* tests/data/test-read-write/test11.xml: Likewise.
* tests/data/test-read-write/test12.xml: Likewise.
* tests/data/test-read-write/test13.xml: Likewise.
* tests/data/test-read-write/test14.xml: Likewise.
* tests/data/test-read-write/test15.xml: Likewise.
* tests/data/test-read-write/test16.xml: Likewise.
* tests/data/test-read-write/test17.xml: Likewise.
* tests/data/test-read-write/test18.xml: Likewise.
* tests/data/test-read-write/test19.xml: Likewise.
* tests/data/test-read-write/test2.xml: Likewise.
* tests/data/test-read-write/test20.xml: Likewise.
* tests/data/test-read-write/test21.xml: Likewise.
* tests/data/test-read-write/test22.xml: Likewise.
* tests/data/test-read-write/test23.xml: Likewise.
* tests/data/test-read-write/test24.xml: Likewise.
* tests/data/test-read-write/test25.xml: Likewise.
* tests/data/test-read-write/test26.xml: Likewise.
* tests/data/test-read-write/test27.xml: Likewise.
* tests/data/test-read-write/test28-without-std-fns-ref.xml:
Likewise.
* tests/data/test-read-write/test28-without-std-vars-ref.xml:
Likewise.
* tests/data/test-read-write/test3.xml: Likewise.
* tests/data/test-read-write/test4.xml: Likewise.
* tests/data/test-read-write/test5.xml: Likewise.
* tests/data/test-read-write/test6.xml: Likewise.
* tests/data/test-read-write/test7.xml: Likewise.
* tests/data/test-read-write/test8.xml: Likewise.
* tests/data/test-read-write/test9.xml: Likewise.
Dodji Seketeli [Tue, 20 Oct 2020 09:36:42 +0000 (11:36 +0200)]
Bug 26684 - Support DW_AT_data_bit_offset attribute
This patch adds support for the DW_AT_data_bit_offset DWARF 5
attribute. Note that this attribute has been introduced in prior
versions of DWARF, but since the version 5, it supersedes the
DW_AT_bit_offset attribute.
Note that Libabigail was wrongly interpreting the DW_AT_bit_offset
attribute. It was considering it as the offset of the bit field data
member, starting from the least significant bit of the containing
structure. That is not the case on little endian machines,
unfortunately.
So this patch fixes that mistake.
So with this patch, we can now compare a binary using DW_AT_bit_offset
against a binary using DW_AT_data_bit_offset and expect things to work
correctly.
The problem is that abixml files generated with Libabigail versions
prior to this patch contain bit field data member offset values that
are not compatible with those contained in abixml files generated with
this version, onward.
So I guess a subsequent patch is needed to introduce a new abixml
version string (maybe "2.0") that would be deemed incompatible with
the previous "1.0" one. That way, we can prevent comparing 2.0 abixml
files against 1.0 ones in abidiff, for instance.
For now, the patch adjusts the various abixml files to make them
reflect the proper representation of DW_AT_bit_offset on little endian
machines.
* src/abg-dwarf-reader.cc (read_and_convert_DW_at_bit_offset):
Define new static function.
(die_member_offset): Primarily use DW_AT_data_bit_offset if its
present. Otherwise, look for DW_AT_bit_offset. Use the new
read_and_convert_DW_at_bit_offset function to properly interpret
DW_AT_bit_offset if its present. Update comment.
* tests/data/test-diff-filter/test-PR26684-dwarf{4,5}.o: New
binary test inputs.
* tests/data/test-diff-filter/test-PR26684.c: Source code of the
new binary test inputs above.
* tests/data/test-diff-filter/test-PR26684-report-0.txt: New
reference test output.
* tests/data/Makefile.am: Add the new test material above to
source distribution.
* tests/test-diff-filter.cc (in_out_specs): Add the test inputs
above to this test harness.
* tests/data/test-annotate/test13-pr18894.so.abi: Adjust.
* tests/data/test-annotate/test15-pr18892.so.abi: Adjust.
* tests/data/test-annotate/test17-pr19027.so.abi: Adjust.
* tests/data/test-annotate/test19-pr19023-libtcmalloc_and_profiler.so.abi:
Adjust.
* tests/data/test-annotate/test21-pr19092.so.abi: Adjust.
* tests/data/test-diff-dwarf-abixml/PR25409-librte_bus_dpaa.so.20.0.abi:
Adjust.
* tests/data/test-diff-pkg/libcdio-0.94-1.fc26.x86_64--libcdio-0.94-2.fc26.x86_64-report.1.txt:
Adjust.
* tests/data/test-read-dwarf/PR25007-sdhci.ko.abi: Adjust.
* tests/data/test-read-dwarf/PR25042-libgdbm-clang-dwarf5.so.6.0.0.abi:
Adjust.
* tests/data/test-read-dwarf/test13-pr18894.so.abi: Adjust.
* tests/data/test-read-dwarf/test15-pr18892.so.abi: Adjust.
* tests/data/test-read-dwarf/test16-pr18904.so.abi: Adjust.
* tests/data/test-read-dwarf/test17-pr19027.so.abi: Adjust.
* tests/data/test-read-dwarf/test19-pr19023-libtcmalloc_and_profiler.so.abi:
Adjust.
* tests/data/test-read-dwarf/test21-pr19092.so.abi: Adjust.
* tests/data/test-read-dwarf/test22-pr19097-libstdc++.so.6.0.17.so.abi:
Adjust.
Dodji Seketeli [Wed, 27 Jan 2021 10:20:48 +0000 (11:20 +0100)]
Bug 27255 - fedabipkgdiff fails on nfs-utils on Fedora 33
When running fedabipkgdiff as:
fedabipkgdiff --self-compare -a --from fc33 nfs-utils
I am getting:
Error encountered while running fedabipkgdiff with error message:
Running it with the --verbose option yields more clue, though.
It turns out that fedabipkgdiff runs abipkgdiff on an RPM and gives it
the wrong associated -debuginfo RPM.
This is because the member function
RPMCollection.get_sibling_debuginfo() doesn't returns the "first"
debuginfo package that comes with a given RPM. In the case of the
package nfs-utils-2.5.2-1.rc4.fc33.aarch64.rpm, it was using the
package nfs-utils-coreos-debuginfo-2.5.2-1.rc4.fc33.aarch64.rpm
instead of the package nfs-utils-debuginfo-2.5.2-1.rc4.fc33.aarch64.rpm.
So, of course, abipkgdiff could not find the proper debuginfo for the
binaries carried by nfs-utils-2.5.2-1.rc4.fc33.aarch64.rpm.
This happens only in cases where there a several debuginfo packages
for a given RPM. In that case, we need to be more careful to select
the right debuginfo package and not just a random one.
This patch adds a RPMCollection.get_matching_debuginfo() member function
that does the right thing. It thus teaches
generate_comparison_halves() to use the new function.
* tools/fedabipkgdiff (RPMCollection::get_sibling_debuginfo):
Update comment.
(RPMCollection::get_matching_debuginfo): Define new function.
(generate_comparison_halves): Use
RPMCollection::get_matching_debuginfo instead of
RPMCollection::get_sibling_debuginfo.
Dodji Seketeli [Tue, 26 Jan 2021 18:19:01 +0000 (19:19 +0100)]
dwarf-reader: Support fast DW_FORM_line_strp string comparison
When running Libabigail on Fedora 34 it appeared that we don't support
DW_FORM_line_strp in the fast string comparison scheme, during DIE
de-duplication.
This patch fixes that.
* src/abg-dwarf-reader.cc (compare_dies_string_attribute_value):
Support DW_FORM_line_strp.
* tests/data/test-diff-pkg/sshpass-1.07-1.fc34.x86_64-self-check-report-0.txt:
New reference test output.
* tests/data/test-diff-pkg/sshpass-1.07-1.fc34.x86_64.rpm: New
test input.
* tests/data/test-diff-pkg/sshpass-debuginfo-1.07-1.fc34.x86_64.rpm:
Likewise.
* tests/data/Makefile.am: Add the new testing material above to
source distribution.
* tests/test-diff-pkg.cc (in_out_specs): Add the test input above
to this harness.
Dodji Seketeli [Tue, 26 Jan 2021 05:35:29 +0000 (06:35 +0100)]
Bug 27232 - fedabipkgdiff fails on gawk from Fedora 33
When running fedabipkgdiff on the gawk-5.1.0-2.fc33.aarch64.rpm
package we get this error message:
Error encountered while running fedabipkgdiff with error message:
That is not a very useful error message to say the least.
The issue is that abipkgdiff returns with an "unknown error" which is
due to the fact that the gawk package contains a directory that is
owned by root. As abipkgdiff tries to write temporary files into that
directory, it fails to do so.
The patch now writes the temporary ABIXML file into a sub-directory
that is not owned the package where we should have write privileges.
It also improves error reporting.
* tools/abipkgdiff.cc (options::pkg{1,2}): Add new data members to
store the packages to compare and have them available for the
various functions that may need them down the road.
(package::create_abi_file_path): Add new function.
(compare_to_self): Use the new package::create_abi_file_path to
create the path to the ABI file in a directory not owned by the
package. That should increase our chances of having the rights to
write that one. Make sure to emit error message when the
comparison against self fails.
({compare_task, self_compare_task}::perform): During the process
of comparison if an internal error happens, report it. Cleanup
the existing reporting a little bit.
(pkg_extraction_task::perform): Fix comment.
* tests/data/test-diff-pkg/libxfce4ui-devel-4.12.1-8.fc27.ppc64-self-report-0.txt:
Adjust.
When compiling with clang, it (rightfully) complains about an operator
precedence issue:
abipkgdiff.cc:1646:7: error: operator '?:' has lower precedence than '<<'; '<<' will be evaluated first [-Wparentheses]
? string("Comparison against self SUCCEEDED\n")
^
Fix that by properly placing the parentheses. Also, drop the superfluous
string conversion.
Dodji Seketeli [Tue, 26 Jan 2021 13:05:19 +0000 (14:05 +0100)]
Bug 27233 - fedabipkgdiff fails on package gnupg2 from Fedora 33
At the core of this issue, Libabigail is failing to canonicalize some
types in some cases. And that is triggering the assertion at the end
of hash_as_canonical_type_or_constant when emitting ABIXML.
It turns out read_context::canonicalize_types_scheduled in the dwarf
reader sometimes fails to canonicalize the types contained in
read_context::extra_types_to_canonicalize().
This patch fixes that.
Incidentally, this patch also fixes a previous issue where
hash_as_canonical_type_or_constant would hit that assert at its end
because of non-canonicalized function types. I am now removing the
band-aid I put in place at the time by loosening the assertion there.
* src/abg-dwarf-reader.cc
(read_context::canonicalize_types_scheduled): Don't forget to
canonicalize types stored in extra_types_to_canonicalize_.
* src/abg-ir.cc (type_base::get_canonical_type_for): Add better
comment.
(hash_as_canonical_type_or_constant): Remove crutch that is
useless now that we canonicalize almost all types in the system.
Dodji Seketeli [Mon, 25 Jan 2021 17:40:13 +0000 (18:40 +0100)]
Bug 27236 - Pointer comparison wrongly fails because of typedef change
Support we have a type struct S and a type T defined as:
typedef struct S T;
Today, Libabigail considers that the two pointers "struct S*" and "T*"
are different.
The problem is that this can cause spurious change reports (as
reported by (abidw --abidiff) in binaries where we have the same type
defined more than once, each time using a different various around
this theme.
This patch make libabigail to now consider that "struct S*" and "T*"
are equal. It does the same for references and arrays.
* src/abg-ir.cc (equals): In the overloads for pointer_type_def,
reference_type_def and array_type_def, compare the pointed-to-type
modulo typedefs.
* tests/data/test-diff-dwarf/test42-PR21296-clanggcc-report0.txt: Adjust.
* tests/data/test-diff-pkg/libICE-1.0.6-1.el6.x86_64.rpm--libICE-1.0.9-2.el7.x86_64.rpm-report-0.txt:
Adjust.
* tests/data/test-diff-pkg/spice-server-0.12.4-19.el7.x86_64-0.12.8-1.el7.x86_64-report-2.txt:
Adjust.
Dodji Seketeli [Fri, 22 Jan 2021 10:24:43 +0000 (11:24 +0100)]
Ignore duplicated functions and those not associated with ELF symbols
While looking at several ABIXML files I noticed that several (member)
functions didn't have any associated ELF symbol and that in some cases,
there were even duplicated member functions. Those are the source of
some spurious changes sometimes reported by abidiff.
In DWARF the same function F can be represented several times. One
representation of F has some properties and other representations of F
might have properties that complement the properties carried by the
former representations. It's the unions of the properties of all
representations of F that constitute the properties of F.
An example of that "linked" nature is how DWARF represents inlined
functions. A function F can be inlined somewhere else. That inlined
version is called F', for instance. The DWARF representation of F'
will carry a DW_AT_abstract_origin attribute that points back to F to
signify that F' is the concrete inlined version of the abstract F.
F' will carry properties that are specific to its "inlined nature"
whereas F will carry properties that are more generic and independent
from all its various potential inlined forms.
So when Libabigail sees the DWARF representation of F, if it's
associated with an ELF symbol, then it must wait to encounter an F'
representation that is associated with an ELF symbol before adding an
internal representation (IR) of F into the final IR graph. Otherwise
the IR of F can be unnecessarily duplicated, with some instances
having an associated ELF symbol and others not.
This is what this patch does, in essence. While working on this, I
encountered some tangential issues that needed to be fixed altogether
for the whole to function. A lot of regression tests output had to be
adjusted.
In the end, a number of spurious change reports could be fixed;
notably reports about removal of destructors like STR::~STR(int).
Note how that destructor has a parameter; it's a GCC-specific
implementation detail that should not appear at this level, I believe.
* include/abg-ir.h (class_or_union::string_mem_fn_sptr_map_type):
Add a typedef for unordered_map<string, method_decl_sptr>.
(class_or_union::find_member_function_sptr): Declare new function.
* src/abg-ir.cc (class_or_union::priv::mem_fns_map_): Change the
type of this to string_mem_fn_sptr_map_type, so that shared
pointers to functions can be stored in the map, instead of bare
pointers to functions. This is useful to implement
class_or_union::find_member_function_sptr which returns a shared
pointer to function.
(class_or_union::find_member_function_sptr): Define new function.
(class_or_union::find_member_function): Adjust.
(method_decl::set_linkage_name): Use a non-deleting shared pointer
to store the current instance of member function into
class_or_union::priv::mem_fns_map_.
(hash_as_canonical_type_or_constant): As we are seeing more
function types, it appears that some function types are not
canonicalized. I am not sure why exactly, but let's loosen the
assert here for now, I'll chase the root of this later.
* src/abg-dwarf-reader.cc (finish_member_function_reading):
Improve detection of member function 'static-ness' by handling
cases where a this pointer can be const. Also support
DW_AT_object_pointer when it's present. This fixes the occurrence
of spurious change reports about loss of 'static-ness' of member
functions.
(potential_member_fn_should_be_dropped): Define new static
function and ...
(build_ir_node_from_die): ... use it here. When a function DIE
has the same linkage name as an existing function IR, do not
create a new IR for it. Rather, re-use the existing one to
complete it with the properties found on the function DIE. If a
new function doesn't seem to have an associated ELF symbol and is
not meant to be a virtual member function then drop its IR on the
floor as well.
* tests/data/test-annotate/libtest23.so.abi: Adjust.
* tests/data/test-annotate/libtest24-drop-fns-2.so.abi: Likewise.
* tests/data/test-annotate/libtest24-drop-fns.so.abi: Likewise.
* tests/data/test-annotate/test0.abi: Likewise.
* tests/data/test-annotate/test1.abi: Likewise.
* tests/data/test-annotate/test14-pr18893.so.abi: Likewise.
* tests/data/test-annotate/test15-pr18892.so.abi: Likewise.
* tests/data/test-annotate/test17-pr19027.so.abi: Likewise.
* tests/data/test-annotate/test18-pr19037-libvtkRenderingLIC-6.1.so.abi: Likewise.
* tests/data/test-annotate/test19-pr19023-libtcmalloc_and_profiler.so.abi: Likewise.
* tests/data/test-annotate/test20-pr19025-libvtkParallelCore-6.1.so.abi: Likewise.
* tests/data/test-annotate/test21-pr19092.so.abi: Likewise.
* tests/data/test-annotate/test6.so.abi: Likewise.
* tests/data/test-annotate/test8-qualified-this-pointer.so.abi: Likewise.
* tests/data/test-diff-dwarf-abixml/PR25409-librte_bus_dpaa.so.20.0.abi: Likewise.
* tests/data/test-diff-dwarf-abixml/test0-pr19026-libvtkIOSQL-6.1.so.1.abi: Likewise.
* tests/data/test-diff-dwarf/test0-report.txt: Likewise.
* tests/data/test-diff-dwarf/test28-vtable-changes-report-0.txt: Likewise.
* tests/data/test-diff-dwarf/test42-PR21296-clanggcc-report0.txt: Likewise.
* tests/data/test-diff-filter/test0-report.txt: Likewise.
* tests/data/test-diff-filter/test01-report.txt: Likewise.
* tests/data/test-diff-filter/test10-report.txt: Likewise.
* tests/data/test-diff-filter/test30-pr18904-rvalueref-report0.txt: Likewise.
* tests/data/test-diff-filter/test30-pr18904-rvalueref-report1.txt: Likewise.
* tests/data/test-diff-filter/test30-pr18904-rvalueref-report2.txt: Likewise.
* tests/data/test-diff-filter/test31-pr18535-libstdc++-report-0.txt: Likewise.
* tests/data/test-diff-filter/test31-pr18535-libstdc++-report-1.txt: Likewise.
* tests/data/test-diff-filter/test35-pr18754-no-added-syms-report-0.txt: Likewise.
* tests/data/test-diff-filter/test35-pr18754-no-added-syms-report-1.txt: Likewise.
* tests/data/test-diff-filter/test41-report-0.txt: Likewise.
* tests/data/test-diff-filter/test9-report.txt: Likewise.
* tests/data/test-diff-pkg/libsigc++-2.0-0c2a_2.4.0-1_amd64--libsigc++-2.0-0v5_2.4.1-1ubuntu2_amd64-report-0.txt: Likewise.
* tests/data/test-diff-pkg/tbb-4.1-9.20130314.fc22.x86_64--tbb-4.3-3.20141204.fc23.x86_64-report-0.txt: Likewise.
* tests/data/test-diff-pkg/tbb-4.1-9.20130314.fc22.x86_64--tbb-4.3-3.20141204.fc23.x86_64-report-1.txt: Likewise.
* tests/data/test-diff-suppr/test24-soname-report-1.txt: Likewise.
* tests/data/test-diff-suppr/test24-soname-report-10.txt: Likewise.
* tests/data/test-diff-suppr/test24-soname-report-12.txt: Likewise.
* tests/data/test-diff-suppr/test24-soname-report-14.txt: Likewise.
* tests/data/test-diff-suppr/test24-soname-report-16.txt: Likewise.
* tests/data/test-diff-suppr/test24-soname-report-4.txt: Likewise.
* tests/data/test-diff-suppr/test31-report-1.txt: Likewise.
* tests/data/test-read-dwarf/PR22015-libboost_iostreams.so.abi: Likewise.
* tests/data/test-read-dwarf/PR22122-libftdc.so.abi: Likewise.
* tests/data/test-read-dwarf/libtest23.so.abi: Likewise.
* tests/data/test-read-dwarf/libtest24-drop-fns-2.so.abi: Likewise.
* tests/data/test-read-dwarf/libtest24-drop-fns.so.abi: Likewise.
* tests/data/test-read-dwarf/test-libandroid.so.abi: Likewise.
* tests/data/test-read-dwarf/test0.abi: Likewise.
* tests/data/test-read-dwarf/test0.hash.abi: Likewise.
* tests/data/test-read-dwarf/test1.abi: Likewise.
* tests/data/test-read-dwarf/test1.hash.abi: Likewise.
* tests/data/test-read-dwarf/test10-pr18818-gcc.so.abi: Likewise.
* tests/data/test-read-dwarf/test11-pr18828.so.abi: Likewise.
* tests/data/test-read-dwarf/test12-pr18844.so.abi: Likewise.
* tests/data/test-read-dwarf/test14-pr18893.so.abi: Likewise.
* tests/data/test-read-dwarf/test15-pr18892.so.abi: Likewise.
* tests/data/test-read-dwarf/test16-pr18904.so.abi: Likewise.
* tests/data/test-read-dwarf/test17-pr19027.so.abi: Likewise.
* tests/data/test-read-dwarf/test18-pr19037-libvtkRenderingLIC-6.1.so.abi: Likewise.
* tests/data/test-read-dwarf/test19-pr19023-libtcmalloc_and_profiler.so.abi: Likewise.
* tests/data/test-read-dwarf/test20-pr19025-libvtkParallelCore-6.1.so.abi: Likewise.
* tests/data/test-read-dwarf/test21-pr19092.so.abi: Likewise.
* tests/data/test-read-dwarf/test22-pr19097-libstdc++.so.6.0.17.so.abi: Likewise.
* tests/data/test-read-dwarf/test6.so.abi: Likewise.
* tests/data/test-read-dwarf/test6.so.hash.abi: Likewise.
* tests/data/test-read-dwarf/test8-qualified-this-pointer.so.abi: Likewise.
* tests/data/test-read-dwarf/test8-qualified-this-pointer.so.hash.abi: Likewise.
* tests/data/test-read-dwarf/test9-pr18818-clang.so.abi: Likewise.
Dodji Seketeli [Wed, 20 Jan 2021 05:03:59 +0000 (06:03 +0100)]
Bug 27204 - potential loss of some aliased ELF function symbols
Sometimes when a symbol S' of a function F' aliases the symbol S of a function
F, the ABIXML reader might not add F' back into the set of exported
functions of the ABI corpus (as the DWARF reader has done initially).
That results in the apparent 'loss' of F' (and S') from the corpus.
This is due to the way F' is identified, using function_decl::get_id.
In the case where the symbol S' of F' has aliases,
function_decl::get_id (wrongly) uses the linkage name of F' as the
identifier. If F' and F happen to have the same linkage name and if F
is already in the set of exported functions of the corpus then F'
won't be added into that set.
To solve that problem, this patch makes function_decl::get_id
construct an ID that ensures that F and F' always have different IDs.
* src/abg-ir.cc (function_decl::get_id): If the elf symbol has
aliases, make the function name be part of the ID so that this ID
differs from the one of the other functions that share a symbol
alias with this one.
* tests/data/test-abidiff/test-PR18791-report0.txt: Adjust.
* tests/data/test-diff-dwarf/test42-PR21296-clanggcc-report0.txt: Likewise.
* tests/data/test-diff-filter/test31-pr18535-libstdc++-report-0.txt: Likewise.
* tests/data/test-diff-filter/test31-pr18535-libstdc++-report-1.txt: Likewise.
* tests/data/test-diff-filter/test41-report-0.txt: Likewise.
* tests/data/test-read-dwarf/test10-pr18818-gcc.so.abi: Likewise.
* tests/data/test-read-dwarf/test16-pr18904.so.abi: Likewise.
* tests/data/test-diff-pkg/glibc-2.32-3.fc33.aarch64-self-check-report-0.txt:
New test reference output.
* tests/data/test-diff-pkg/glibc-2.32-3.fc33.aarch64.rpm: New test
input RPM.
* tests/data/test-diff-pkg/glibc-debuginfo-2.32-3.fc33.aarch64.rpm:
Likewise.
* tests/data/Makefile.am: Add the new test material to source
distribution.
* tests/test-diff-pkg.cc (in_out_specs): Add the new test input
RPMs to this test harness.
abg-ir: Optimize calls to std::string::find() for a single char.
This is a common micro optimization suggested by clang-tidy to improve
string::find performance. I have not done any measurements as to how it
impacts performance for that particular piece of code, but generally
this overload is to prefer here.
* src/abg-ir.cc (elf_symbol::get_name_and_version_from_id):
use character literal overload for single character string::find.
(parse_integral_type): Likewise.
Suggested-by: Chris Kennelly <ckennelly@google.com> Signed-off-by: Matthias Maennich <maennich@google.com>
Dodji Seketeli [Fri, 15 Jan 2021 14:38:44 +0000 (15:38 +0100)]
Bug 26992 - Try harder to resolve declaration-only classes
When a declaration of a class, named H, matches more than one
definition (let's call them D) of H (in several other translation
units of the abi corpus) then H is left unresolved; that is, H is
considered as being a declaration-only class. Note that down the
road, H will compare different to all those Ds.
However when those Ds are all equal, it turns out that this can lead
to issues down the road. This is because conceptually, H equals D.
But then by not resolving H to D (and there are several Ds), we
artificially create a situation where H is different from D. We can
even create situations where those Ds are different among themselves.
So doing comparisons inevitably leads to spurious changes.
This is the root cause of the issue described in this bug at
https://sourceware.org/bugzilla/show_bug.cgi?id=26992.
To fix the issue, this patch thus resolves H to D when the different
Ds are all equal.
Note that a similar thing should be done for the process of resolving
declaration-only enums as well. But I don't have an issue reproducer
at hand involving enums at the moment, so I am adding a comment to
read_context::resolve_declaration_only_enums for now.
I have also filled the enhancement request
https://sourceware.org/bugzilla/show_bug.cgi?id=27189 to track a task
that would do away with read_context::resolve_declaration_only_enums
altogether by factorizing out the resolution of declaration-only
abigail::ir::decl_base.
* src/abg-dwarf-reader.cc
(read_context::compare_before_canonicalisation): Define new member
function.
(read_context::resolve_declaration_only_classes): When there are
more than one definition that can resolve a given declaration, if
all those definitions are equal, then resolve the declaration to
those definitions.
(read_context::resolve_declaration_only_enums): Add a comment to
update similarly update this function (or do away with it
completely) later.
* tests/data/test-annotate/test15-pr18892.so.abi: Adjust.
* tests/data/test-read-dwarf/test9-pr18818-clang.so.abi: Likewise.
* tests/data/test-read-dwarf/test15-pr18892.so.abi: Likewise.
* tests/data/test-diff-pkg/cogl-1.22.8-2.fc33.x86_64.rpm: Add new
test input.
* tests/data/test-diff-pkg/cogl-debuginfo-1.22.8-2.fc33.x86_64.rpm:
Likewise.
* tests/data/test-diff-pkg/cogl-1.22.8-2.fc33.x86_64.self-check-report-0.txt:
Likewise.
* tests/test-diff-pkg.cc (in_out_specs): Add the new test inputs
to the test harness.
* tests/data/Makefile.am: Add the new test input files to source
distribution.
We can now use the latest upstream stable version since we bumped up our
minimum C++ standard version.
* tests/lib/catch.hpp: update to v2.13.3
* tests/test-symtab.cc (TEST_CASE("Symtab::SimpleSymtabs")): Use
the corpus variable to avoid unused variable warnings.
Giuliano Procida [Fri, 11 Dec 2020 15:03:24 +0000 (15:03 +0000)]
Refresh ABI cross check test files
The test file test0-pr19026-libvtkIOSQL-6.1.so.1 is intended to be
used to check that diffing a binary entity against its ABI
representation results in an empty diff. In this case, the ABI of the
library is also under revision control and so the test also functions
to a certain extent as check on whether the generated ABI is stable
between revisions of libabigail.
Recent changes have affected attributes and ordering of elements. The
result is that there is now a non-empty diff between the library and
the saved ABI, albeit all "harmless" changes.
This commit refreshes the revision-controlled ABI, eliminating the
differences.
* tests/data/test-diff-dwarf-abixml/test0-pr19026-libvtkIOSQL-6.1.so.1.abi:
Refreshed ABI.
* tests/data/test-diff-dwarf-abixml/test0-pr19026-libvtkIOSQL-6.1.so.1-report-0.txt:
File now empty.
std::unordered_set is now provided through the language standard, hence
remove the compatibility code for <unordered_set> and adjust all users
accordingly.
* include/abg-cxx-compat.h: Drop compatibility for <unordered_set>.
* include/abg-comparison.h: migrate abg_compat use to std.
* include/abg-interned-str.h: Likewise.
* include/abg-suppression.h: Likewise.
* src/abg-comparison-priv.h: Likewise.
* src/abg-dwarf-reader.cc: Likewise.
* tests/test-diff-suppr.cc: Likewise.
* tools/abipkgdiff.cc: Likewise.
std::unordered_map is now provided through the language standard, hence
remove the compatibility code for <unordered_map> and adjust all users
accordingly.
* include/abg-cxx-compat.h: Drop compatibility layer for <unordered_map>.
* include/abg-comparison.h: migrate abg_compat use to std.
* include/abg-cxx-compat.h: Likewise.
* include/abg-fwd.h: Likewise.
* include/abg-ir.h: Likewise.
* src/abg-corpus.cc: Likewise.
* src/abg-dwarf-reader.cc: Likewise.
* src/abg-ir.cc: Likewise.
* src/abg-reader.cc: Likewise.
* src/abg-writer.cc: Likewise.
std::shared_ptr, std::weak_ptr, std::dynamic_pointer_cast,
std::static_pointer_cast are now provided through the language standard,
hence remove the compatibility code for <memory> and adjust all users
accordingly.
Now with C++11 as minimum standard, we can drop the facilities required
to support earlier standards. Hence purge the use of std::tr1 from the
sources.
* include/abg-cxx-compat.h: remove compatibility with pre C++11.
* include/abg-ir.h: Remove mention of std::tr1 from comments.
* include/abg-sptr-utils.h: Likewise.
This change adds a test which exercises libabigail's handling of
qualified typedefs of arrays. The base type in the test case is an
array of pointers (chosen so we can also use restrict).
Various typedefs and (indirect) qualifications of this type are
created. In all cases, the resulting type should be an array of
qualified pointers.
However, abidiff reports things like
'const volatile void* const'
changed to
'restrict const volatile volatile void* const'
I've not attempted to check whether DWARF and ABI XML faithfully
reflect the source types. There may be trouble there as well.
For the record, these are the expected v0 types:
A = void *[7]
B = void *[7]
C = void *const[7]
D = void *const[7]
E = void *const volatile[7]
F = void *const volatile[7]
G = void *const volatile restrict[7]
H = void *const volatile restrict[7]
The v1 types should be these plus others with extra pointer
qualifiers.
* tests/data/Makefile.am: Add new test files
* tests/data/test-abidiff-exit/qualifier-typedef-array-v0.c:
New test file.
* tests/data/test-abidiff-exit/qualifier-typedef-array-v0.o:
New test file.
* tests/data/test-abidiff-exit/qualifier-typedef-array-v1.c:
New test file.
* tests/data/test-abidiff-exit/qualifier-typedef-array-v1.o:
New test file.
* tests/data/test-abidiff-exit/qualifier-typedef-array-report-0.txt:
Plain diff report.
* tests/data/test-abidiff-exit/qualifier-typedef-array-report-1.txt:
Harmless diff report.
* tests/data/test-abidiff-exit/qualifier-typedef-array-report-2.txt:
Leaf changes report.
* tests/data/test-abidiff-exit/qualifier-typedef-array-report-3.txt:
Harmless leaf changes report.
* tests/test-abidiff-exit.cc: Run new test.
ir: Arrays are indirect types for type structure similarity purposes
As described in the comments of types_have_similar_structure:
"Two indirect types have similar structure if their underlying
types are of the same kind and have the same name. [...] The size
of their underlying type does not matter"
Yet, the size of array elements (a.k.a the underlying type of an array
type) is wrongly considered to matter when assessing the "type
structure similarity" relationship for arrays.
This patch fixes that.
* src/abg-ir.cc (types_have_similar_structure): When examining
array types, always treat element types as being underlying types
of an indirect type.
* tests/data/Makefile.am: Add new test case files.
* tests/data/test-abidiff-exit/test-non-leaf-array-report.txt:
New test case showing correct --leaf-changes-only reporting.
* tests/data/test-abidiff-exit/test-non-leaf-array-v0.c:
Likewise.
* tests/data/test-abidiff-exit/test-non-leaf-array-v0.o:
Likewise.
* tests/data/test-abidiff-exit/test-non-leaf-array-v1.c:
Likewise.
* tests/data/test-abidiff-exit/test-non-leaf-array-v1.o:
Likewise.
* tests/test-abidiff-exit.cc: Run new test case.
Dodji Seketeli [Fri, 4 Dec 2020 10:27:00 +0000 (11:27 +0100)]
ir: Add better comments to types_have_similar_structure
* src/abg-ir.cc (types_have_similar_structure): Arrays are also
indirect types, just like pointers and references, for the purpose
of the concept of "type similarity". Add that to the introductory
comment of the function. Add some more misc comments throughout
the code base.
Dodji Seketeli [Thu, 3 Dec 2020 09:53:08 +0000 (10:53 +0100)]
Use C++11 for the code base
As the Enterprise Linux 6 platform has now essentially reached it's
end of life for what it's worth (the Fedora EPEL6 distribution is not
maintained anymore) nothing ties us to using C++03 only anymore.
So, I think it makes sense to move the code base to the C++11
standard.
Why C++11 and not, say, C++14 or more? Well, the more direct reason I
see is that we need to support long life cycle platforms, the older
one being Enterprise Linux 7 currently. This is the Fedora EPEL7
distribution, in concrete terms. And in that distribution, the
compiler is GCC 4.8.x. And it supports C++11.
In practise, nothing changes in the code that is already there.
The new code however can use C++11 constructs just fine.
I have updated the CONTRIBUTING file to write down some of the
unwritten cultural biases of the current code base. Hopefully these
few lines will help to shed some light on the choices made so far.
The update to that file also enacts the use of C++11 and sets some
limits to what we expects in terms of what the code base would look
like.
configure.ac is modified to unconditionally pass -std=c++11 to the
compiler and express that in the configuration text displayed at the
end of the configuration stage.
Some Makefile.am files are updated accordingly.
* CONTRIBUTING: Enact use of c++11. Also, we favor those who
read/debug/maintain the code as opposed to those who write it ;-)
* configure.ac: Switch to c++11 unconditionally.
* src/Makefile.am: Adjust.
* tests/Makefile.am: Adjust.