There are several issues with the abixml writer in how it handles the
process of emitting referenced types that are not directly reachable
by just walking the scopes (namespaces) of the translation units;
think, for instance, about member types of a class A that are not
necessarily present in all the declarations of A across all
translation units. This patch addresses them all at once because they
are intertwined.
* Use of canonical pointers in the hash map of referenced types
The abixml writer was using the pointer values of canonical types to
hash referenced types in a map. It was doing so "by hand", and was
thus messing things up for types without canonical types (like some
class declarations).
This patch changes that by using the generic solution of
abigail::ir::hash_type_or_decl(), which also uses the same canonical
type pointer values. For types with no canonical type, that function
knows how to gracefully fall back. At worst, it will just make
things slower, not wrong.
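
The idea of hashing by canonical pointer with a graceful fallback can
be sketched as follows. This is only an illustration, not the code of
abigail::ir::hash_type_or_decl(): toy_type and hash_toy_type are
hypothetical names, and falling back to hashing the type name is an
assumption made for the sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>

// Hypothetical stand-in for an IR type; not libabigail's real class.
struct toy_type
{
  std::string name;
  const toy_type* canonical; // null when the type has no canonical type
};

// Hash by canonical pointer when there is one.  Otherwise fall back
// to hashing the name (an assumption of this sketch): slower, but two
// equivalent types still get the same hash value.
inline std::size_t
hash_toy_type(const toy_type& t)
{
  if (t.canonical)
    return std::hash<const toy_type*>()(t.canonical);
  assert(!t.name.empty()); // the fallback needs something to hash
  return std::hash<std::string>()(t.name);
}
```

Two types sharing a canonical type hash alike, and so do two
declaration-only types of the same name that have no canonical type.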
* Sorting of referenced types
The patch also changes the sorting function used for the hash map of
referenced types. The previous solution sorted on the pretty
representation of types; but when two types have the same pretty
representation (think typedefs, for instance), their relative
position in the sorted result was random. This causes stability
issues: emitting the abixml for the same binary several times can
lead to some types being sorted differently -- they have the same
name, but not necessarily the same type *IDs*, as they are different
types.
The new sorting code handles this better; it also uses the pretty
representations of types, but when they are equal, it uses the type
IDs to tell the types apart. At least this brings stability to the
abixml output for a given binary.
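
A minimal sketch of such a two-level comparison is shown below. It is
not the actual write_context::type_ptr_cmp; ref_type, ref_type_cmp and
sort_ref_types are hypothetical names, and type IDs are assumed to be
plain strings.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical record for a referenced type.
struct ref_type
{
  std::string pretty_repr; // e.g. "typedef T"
  std::string type_id;     // e.g. "type-id-12"
};

// Order by pretty representation first; when two representations are
// equal, use the type ID as a tie-breaker so the result is
// deterministic.  IDs are compared as strings here, so the order is
// lexicographic, not numeric.
struct ref_type_cmp
{
  bool
  operator()(const ref_type& l, const ref_type& r) const
  {
    if (l.pretty_repr != r.pretty_repr)
      return l.pretty_repr < r.pretty_repr;
    return l.type_id < r.type_id;
  }
};

// Return a sorted copy of the referenced types.
inline std::vector<ref_type>
sort_ref_types(std::vector<ref_type> types)
{
  std::sort(types.begin(), types.end(), ref_type_cmp());
  assert(std::is_sorted(types.begin(), types.end(), ref_type_cmp()));
  return types;
}
```

Because the comparison never reports two distinct (representation, ID)
pairs as equivalent, the sorted order does not depend on the iteration
order of the underlying hash map.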
* Avoiding duplicating declaration-only types when emitting the
context of referenced member types.
We don't keep track of declaration-only classes that are emitted.
This is because we allow a given class declaration (that carries no
definition) to appear several times in a given ABI corpus. So a
referenced type that is a class declaration always appears as if it
has not been emitted. As a result, when we specifically emit the
not-emitted referenced types, declaration-only classes can be emitted
many times over. This is unnecessary duplication, aka bloat.
This patch thus introduces a new hash map that tracks emitted
declaration-only classes, so that we can allow duplication of class
declarations when they follow what's done in the IR read from DWARF,
and disallow that duplication when it's totally artificial and
useless.
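
The tracking can be pictured with the sketch below. The two member
function names come from the ChangeLog of this patch, but their
signatures, the toy_writer_context class, keying the set by class name
and the emitted XML text are all assumptions made for illustration.

```cpp
#include <cassert>
#include <string>
#include <unordered_set>

// Hypothetical, simplified writer state.
class toy_writer_context
{
  std::unordered_set<std::string> emitted_decl_only_;

public:
  void
  record_decl_only_type_as_emitted(const std::string& name)
  {
    assert(!name.empty());
    emitted_decl_only_.insert(name);
  }

  bool
  decl_only_type_is_emitted(const std::string& name) const
  {return emitted_decl_only_.count(name) != 0;}
};

// Emit a referenced declaration-only class at most once.  Returns
// true if this call wrote the declaration to OUT.
inline bool
emit_referenced_decl_only(toy_writer_context& ctxt,
                          const std::string& name,
                          std::string& out)
{
  if (ctxt.decl_only_type_is_emitted(name))
    return false; // artificial duplicate: skip it
  out += "<class-decl name='" + name + "' is-declaration-only='yes'/>\n";
  ctxt.record_decl_only_type_as_emitted(name);
  return true;
}
```

Only the path that emits referenced types consults the set in this
sketch; declarations written while walking the IR would bypass the
check, so the duplication that mirrors DWARF remains allowed.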
* Better tracking of referenced types
We were blatantly forgetting to mark some referenced types as such.
So those were missing in some abixml output.
This patch fixes the spots where we were forgetting that important
information.
* Better representation of the scopes of the referenced types that
were specifically emitted.
The previous code was failing at properly representing the class scope
of some referenced types that were specifically emitted; sometimes,
for member types, the representation of the scope was so broken that
the (referenced) member type itself wouldn't be emitted at all.
This is because I thought that to emit a given member type, just
emitting its parent scope would be enough. I thought that would
automatically trigger emitting the member type itself. First, that
would emit too much information at times; the other members of the
scope are not necessarily needed. Second, the "duplication detection
code" would sometimes refuse to emit the scope class, because it had
already been emitted earlier! But the incarnation that got emitted
then didn't have this member type as a member. Yes, in DWARF, the
same class A can be declared several times with different member
types in it. The complete representation of A would be a union of all
those declarations of A that are seen.
This patch addresses this issue by carefully emitting just the
information that is needed from the scope of the referenced type.
Basically the scope is declared just to declare/define the type we are
interested in; period. The abixml reader is now properly geared to
reconstruct the scope by merging its different parts, which are now
scattered around the ABI corpus. That support is part of this patch
set.
For instance, a member typedef used to be emitted with the information
of its parent class badly formatted.
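
The way a reader can rebuild a class from scattered partial scopes can
be sketched as a plain union of member types per class name. This is
an illustration only, not the abixml reader code; scope_fragment and
merge_scope_fragments are hypothetical names, and member types are
reduced to their names.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <utility>
#include <vector>

// One partial declaration of a class: the class name, and the names
// of the member types that this particular fragment declares.
typedef std::pair<std::string, std::vector<std::string> > scope_fragment;

// Merge all the fragments: for each class, the result is the union of
// the member types seen in every fragment of that class.
inline std::map<std::string, std::set<std::string> >
merge_scope_fragments(const std::vector<scope_fragment>& fragments)
{
  std::map<std::string, std::set<std::string> > merged;
  for (const scope_fragment& f : fragments)
    {
      assert(!f.first.empty()); // a fragment must name its class
      merged[f.first].insert(f.second.begin(), f.second.end());
    }
  return merged;
}
```

With two fragments of class A, one carrying only A::iterator and the
other only A::value_type, the merged A has both member types.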
* src/abg-writer.cc (struct type_ptr_comp_functor): Remove this.
(sort_type_ptr_map): Likewise.
(write_context::record_type_as_referenced): Do not add the
canonical type of the type to record as referenced directly.
(write_context::type_is_referenced): Adjust accordingly.
(struct write_context::type_ptr_cmp): New comparison functor.
(write_context::sort_types): New sorting function.
(write_context::{record_decl_only_type_as_emitted,
decl_only_type_is_emitted}): New member functions.
(write_member_type_opening_tag): Factorize out of ...
(write_member_type): ... here.
(write_class_decl_opening_tag): Factorize out of ...
(write_class_decl): ... here. Now, keep track also of
declaration-only classes that are emitted.
(write_decl_in_scope): Use the new write_member_type_opening_tag
and write_class_decl_opening_tag. Now write class scopes
ourselves; they only contain the type declarations that we are
emitting.
(write_translation_unit): Use the new sorting code to sort the
referenced types to emit. Do not emit referenced types that are
declaration-only classes that have already been emitted. Handle
the fact that emitting the referenced types might make those
emitted types *reference* other types too! So handle those new
referenced types as such, and emit them too.
(write_qualified_type_def, write_typedef_decl, write_var_decl): Do
not forget to mark referenced types as such.
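
As noted for write_translation_unit, emitting a referenced type can
itself reference new types, so emission has to go on until nothing new
shows up. A minimal worklist sketch of that loop follows; it is not
the code of the patch, and emit_until_stable and its callback are
hypothetical, with types reduced to string IDs.

```cpp
#include <cassert>
#include <deque>
#include <functional>
#include <set>
#include <string>
#include <vector>

// EMIT "writes" the type with the given ID and returns the IDs of the
// types that the emitted type references.  The function returns the
// IDs in the order in which they were emitted.
inline std::vector<std::string>
emit_until_stable(const std::vector<std::string>& initially_referenced,
                  const std::function<std::vector<std::string>
                                      (const std::string&)>& emit)
{
  assert(static_cast<bool>(emit)); // a callable must be provided
  std::set<std::string> emitted;
  std::vector<std::string> order;
  std::deque<std::string> pending(initially_referenced.begin(),
                                  initially_referenced.end());
  while (!pending.empty())
    {
      std::string id = pending.front();
      pending.pop_front();
      if (!emitted.insert(id).second)
        continue; // already emitted: do not duplicate it
      order.push_back(id);
      // Emitting ID may reference further types; queue the new ones.
      for (const std::string& ref : emit(id))
        if (!emitted.count(ref))
          pending.push_back(ref);
    }
  return order;
}
```

The loop terminates because each ID is emitted at most once, even when
types reference each other in a cycle.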