Bug 30034

Summary: [libabigail] Handle library splitting
Product: libabigail Reporter: David Marchand <david.marchand>
Component: defaultAssignee: Dodji Seketeli <dodji>
Status: RESOLVED FIXED    
Severity: enhancement CC: fweimer, libabigail, sam, woodard
Priority: P2    
Version: unspecified   
Target Milestone: ---   
Host: Target:
Build: Last reconfirmed:
Attachments: bz30034.tar.xz
bz30034.tar.xz

Description David Marchand 2023-01-23 13:45:04 UTC
We have been tracking ABI changes in the DPDK project (which consists of multiples libraries).
One of the DPDK libray, the "EAL", which is kind of central and which is linked by _all_ applications using DPDK, has become bloated.. and we are looking into splitting it.


Our problem is that since we compare the EAL shared library to a reference build, moving code to a new library (which EAL depends on) results in abidiff complaining about missing symbols, since they are not exposed anymore as part of the EAL scope.

It would be nice if libabigail came with a way to deal with this.
Some ideas:
- handle multiple abi-corpus in the existing tools (in our case, the splitted libraries dumps would be concatenated to a single xml file and passed to abidiff)?
- add some option to automatically follow DT_NEEDED entries?
- add another tool that would make it possible to compare a set of libraries exported interface in one go?


Note: for the DT_NEEDED idea, it might be interesting to be able to filter which DT_NEEDED entries are followed (like for example with a "--follow-dt-needed librte_" option) so that we can only monitor changes in a set of libraries (iow, in this option example, the libraries matching the regexp "librte_").
Comment 1 Dodji Seketeli 2023-06-02 17:53:24 UTC
I have started to write down the specification for this feature at https://sourceware.org/git/?p=libabigail.git;a=blob;f=README-ABIDIFF-BINARIES-SET-SUPPORT.md;hb=refs/heads/users/dodji/PR30034.

Do you think we could use that as a starting point for an initial implementation?  If not what you want to see changed?

Thanks.
Comment 2 David Marchand 2023-06-05 08:04:18 UTC
I'll need to think more and maybe test a prototype to be sure, but this spec seems to involve more scripting for libabigail users.
In our DPDK case,
- with this proposal, DPDK will have to track every splitted library and maintain sets of libraries to pass to abidiff,
- on the other hand, the "DT_NEEDED following with a name pattern" idea would be a project level option we would set in our check-abi.sh,


I only see mentions of binaries, would this "set" feature be applicable to comparing abidw dumps?
Comment 3 Dodji Seketeli 2023-06-05 11:01:28 UTC
Hello David,

"david.marchand at redhat dot com" <sourceware-bugzilla@sourceware.org>
writes:

[...]

> - on the other hand, the "DT_NEEDED following with a name pattern" idea would
> be a project level option we would set in our check-abi.sh,

Could you please elaborate on the DT_NEEDED-based scheme?

On concrete terms, could please you give me an example of how abidiff
would be invoked and how it should behave with respect to the DT_NEEDED
property?

That would be greatly appreciated.

> I only see mentions of binaries, would this "set" feature be applicable to
> comparing abidw dumps?

Yes, I should have said binaries or ABIXML files.  I'll update that.

Thanks.
Comment 4 David Marchand 2023-06-05 15:08:53 UTC
Hello Dodji,

(In reply to dodji from comment #3)
> "david.marchand at redhat dot com" <sourceware-bugzilla@sourceware.org>
> writes:
> 
> [...]
> 
> > - on the other hand, the "DT_NEEDED following with a name pattern" idea would
> > be a project level option we would set in our check-abi.sh,
> 
> Could you please elaborate on the DT_NEEDED-based scheme?
> 
> On concrete terms, could please you give me an example of how abidiff
> would be invoked and how it should behave with respect to the DT_NEEDED
> property?
> 
> That would be greatly appreciated.

What I had in mind is something that would only work with (elf) binaries and would be a bit more transparent for libabigail users.

That is a "--follow-dt-needed" option that would make the libabigail parser analyse the elf dependencies (listed in the dynamic section), then analyse the elf itself and build a representation of this set of libraries.

$ readelf -d .../builds/main/build-gcc-shared/install/usr/local/lib64/librte_eal.so | grep -i needed
 0x0000000000000001 (NEEDED)             Shared library: [libm.so.6]
 0x0000000000000001 (NEEDED)             Shared library: [libnuma.so.1]
 0x0000000000000001 (NEEDED)             Shared library: [libfdt.so.1]
 0x0000000000000001 (NEEDED)             Shared library: [libarchive.so.13]
 0x0000000000000001 (NEEDED)             Shared library: [librte_kvargs.so.23]
 0x0000000000000001 (NEEDED)             Shared library: [librte_telemetry.so.23]
 0x0000000000000001 (NEEDED)             Shared library: [libbsd.so.0]
 0x0000000000000001 (NEEDED)             Shared library: [libc.so.6]
 0x0000000000000001 (NEEDED)             Shared library: [ld-linux-x86-64.so.2]

Asking for a abidiff on this librte_eal.so library file would make libabigail parse libm.so.6, libnuma.so.1 etc....

There may be a concern with recursivity, because parsing the dependencies of dependencies could become troublesome and consume a lot of cpu/memory.
So the "--follow-dt-needed" option could take an optional regex to filter which dependencies are to be considered.
I am unclear whether it may be better to *require* a filter (wrt to cpu/memory consumption).


Here is an example with how we check DPDK libraries. With this new option, I would focus on librte_.*\.so.* files so invoking as:
$ abidiff --follow-dt-needed librte_ --suppr .../devtools/libabigail.abignore --no-added-syms --headers-dir1 .../abi/v23.03/build-gcc-shared/usr/local/include --headers-dir2 .../builds/main/build-gcc-shared/install/usr/local/include .../abi/v23.03/build-gcc-shared/usr/local/lib64/librte_eal.so .../builds/main/build-gcc-shared/install/usr/local/lib64/librte_eal.so
Comment 5 Ben Woodard 2023-06-05 15:24:32 UTC
I'm strongly in favor of the DT_NEEDED approach. It seems highly similar to my bug#27514 and seems to have the same basic infrastructure requirements. Also if done correctly, I believe that it should also address my bug#27208
Comment 6 Dodji Seketeli 2023-06-05 15:36:51 UTC
"david.marchand at redhat dot com" <sourceware-bugzilla@sourceware.org>
writes:

[...]

> Asking for a abidiff on this librte_eal.so library file would make libabigail
> parse libm.so.6, libnuma.so.1 etc....

Where would libabigail find those (dependant) libraries?

> There may be a concern with recursivity, because parsing the dependencies of
> dependencies could become troublesome and consume a lot of cpu/memory.

We could limit (by default) the dependant libraries to those that are
dependencies of the libraries provided on the command line.  E.g, the
dependencies of libm.so.6 would be ignored.  In other words, we won't be
looking at the transitive closure of the binaries provided on the
command line of abidiff, but rather at their direct dependencies.

> So the "--follow-dt-needed" option could take an optional regex to filter which
> dependencies are to be considered.

Yes, the suppression specification file syntax could be augmented to
take a property that filters out the dependencies to ignore while
handling the --follow-dt-needed option.  [Note that I'd rather call the
option --follow-dependencies to avoid being specific to ELF as ABIXML
does have a similar concept too].

Would that work for you?

> I am unclear whether it may be better to *require* a filter (wrt to cpu/memory
> consumption).

I don't know first hand.  That the kind of details that can be flushed
out later after a some real world testing, I guess.

> Here is an example with how we check DPDK libraries. With this new option, I
> would focus on librte_.*\.so.* files so invoking as:
> $ abidiff --follow-dt-needed librte_ --suppr .../devtools/libabigail.abignore
> --no-added-syms --headers-dir1
> .../abi/v23.03/build-gcc-shared/usr/local/include --headers-dir2
> .../builds/main/build-gcc-shared/install/usr/local/include
> .../abi/v23.03/build-gcc-shared/usr/local/lib64/librte_eal.so
> .../builds/main/build-gcc-shared/install/usr/local/lib64/librte_eal.so

So, are you sure that librte_eal.so has ALL the split libraries
(resulting from splitting the former big library into smaller ones) as
dependencies?  I mean, this really looks like a particular case of the
general problem of being able to compare sets of ABI corpora.  Sure, I
understand how this particular use case is worth supporting (and I agree
we should support it), but in the grand scheme of things, I am wondering
if supporting /just/ this particular case makes sense.
Comment 7 Ben Woodard 2023-06-05 16:53:16 UTC
(In reply to dodji from comment #6)
> "david.marchand at redhat dot com" <sourceware-bugzilla@sourceware.org>
> writes:
> 
> [...]
> 
> > Asking for a abidiff on this librte_eal.so library file would make libabigail
> > parse libm.so.6, libnuma.so.1 etc....
> 
> Where would libabigail find those (dependant) libraries?

Each ELF file optionally has a list of DT_NEEDED. So what we are asking for is for libabigail to read the DT_NEEDED entries and use logic like ld.so to find those libraries and then load them into the corpus group along with their library dependencies. This is of course a recursive operation.

In a prototype that I was doing for 27514, rather than trying to imperfectly implement ld.so's dependency resolution behavior, I reused the environment variables used by the /usr/bin/ldd script to make ld.so generate the list of dependencies for me. Then I just parsed the output. (I do believe that ld.so should have a C tools interface which would do the same thing but this currently doesn't exist in the current glibc)

If you would like, I can dig the old prototype code for this out of my git tree.

> 
> > There may be a concern with recursivity, because parsing the dependencies of
> > dependencies could become troublesome and consume a lot of cpu/memory.
> 
> We could limit (by default) the dependant libraries to those that are
> dependencies of the libraries provided on the command line.  E.g, the
> dependencies of libm.so.6 would be ignored. 

No!!!!!
Let's do the complete closure of needed libraries. 
This creates so many problems for us that we basically haven't been able to use libabigail for its intended purpose. I believe the abicompat example provides the clearest example of how counter intuitive the current behavior is.

app requires lib1 and lib2

We want to know if a new version of lib1 is compatible with app. Let's call this lib1'

to do this the expected behavior is:

abicompat app lib1 lib1'

The problem is app only makes use of a few functions in lib1. However, lib2 makes extensive use of lib1's interfaces. 

If the changes in lib1 do not affect the functions called by app then "abicompat app lib1 lib1' " will give a false sense of compatibility.
Arguably the fix for this problem is to do the recursive expansion of the needed library space outside of libabigail. In this case it would be: "abicompat lib2 lib1 lib1' " 

The problem is that some of our apps have as many as 1200 libraries. Consider: https://computing.llnl.gov/sites/default/files/2021-03/LBANN%20dependencies%20-%20small%20for%20Word.png which is one of our apps. This ends up being an almost factorial expansion. With the repeated parsing of each ELF file's DWARF or even abixml, turning this into a script takes vastly more time than if the complete closure of the used interfaces were computed inside of libabigail.
Comment 8 David Marchand 2023-06-06 07:28:29 UTC
(In reply to dodji from comment #6)
> "david.marchand at redhat dot com" <sourceware-bugzilla@sourceware.org>
> writes:
> 
> [...]
> 
> > Asking for a abidiff on this librte_eal.so library file would make libabigail
> > parse libm.so.6, libnuma.so.1 etc....
> 
> Where would libabigail find those (dependant) libraries?

I was thinking of the DT_NEEDED entries in the dynamic section of an elf binary.


> 
> > There may be a concern with recursivity, because parsing the dependencies of
> > dependencies could become troublesome and consume a lot of cpu/memory.
> 
> We could limit (by default) the dependant libraries to those that are
> dependencies of the libraries provided on the command line.  E.g, the
> dependencies of libm.so.6 would be ignored.  In other words, we won't be
> looking at the transitive closure of the binaries provided on the
> command line of abidiff, but rather at their direct dependencies.
> 
> > So the "--follow-dt-needed" option could take an optional regex to filter which
> > dependencies are to be considered.
> 
> Yes, the suppression specification file syntax could be augmented to
> take a property that filters out the dependencies to ignore while
> handling the --follow-dt-needed option.  [Note that I'd rather call the
> option --follow-dependencies to avoid being specific to ELF as ABIXML
> does have a similar concept too].
> 
> Would that work for you?

I am bad at naming and I only know elfs and dwarves creatures :-).

If you can make this available for other binary formats, then yes, a more generic name like what you propose looks better.


> 
> > I am unclear whether it may be better to *require* a filter (wrt to cpu/memory
> > consumption).
> 
> I don't know first hand.  That the kind of details that can be flushed
> out later after a some real world testing, I guess.

Yes.


> 
> > Here is an example with how we check DPDK libraries. With this new option, I
> > would focus on librte_.*\.so.* files so invoking as:
> > $ abidiff --follow-dt-needed librte_ --suppr .../devtools/libabigail.abignore
> > --no-added-syms --headers-dir1
> > .../abi/v23.03/build-gcc-shared/usr/local/include --headers-dir2
> > .../builds/main/build-gcc-shared/install/usr/local/include
> > .../abi/v23.03/build-gcc-shared/usr/local/lib64/librte_eal.so
> > .../builds/main/build-gcc-shared/install/usr/local/lib64/librte_eal.so
> 
> So, are you sure that librte_eal.so has ALL the split libraries
> (resulting from splitting the former big library into smaller ones) as
> dependencies?  I mean, this really looks like a particular case of the
> general problem of being able to compare sets of ABI corpora.  Sure, I
> understand how this particular use case is worth supporting (and I agree
> we should support it), but in the grand scheme of things, I am wondering
> if supporting /just/ this particular case makes sense.

If EAL does not have all dependencies listed, that would break ABI for applications that are linked against and only knows of this library.
Comment 9 Ben Woodard 2023-06-06 21:07:01 UTC
my original for abicompat was to add --recursive I'm no longer convinced that I like that idea. I like the --follow-dependencies or even making that the default and have --exclude-dependencies. 

Right now we support:

  FILE_TYPE_XML_TU, (formerly called FILE_TYPE_NATIVE_BI) 
  FILE_TYPE_ELF,
  FILE_TYPE_AR,
  FILE_TYPE_XML_CORPUS,
  FILE_TYPE_XML_CORPUS_GROUP,
  FILE_TYPE_RPM,
  FILE_TYPE_SRPM,
  FILE_TYPE_DEB,
  FILE_TYPE_DIR,
  FILE_TYPE_TAR

What I'm thinking is:

abidw currently either emits FILE_TYPE_XML_CORPUS or FILE_TYPE_XML_CORPUS_GROUP in the case of a kernel. 

I think it should work like this:

 * normal.o -> FILE_TYPE_XML_TU or by explicit request promotion to FILE_TYPE_XML_CORPUS or even FILE_TYPE_XML_CORPUS_GROUP
 * elffile -> FILE_TYPE_XML_CORPUS_GROUP which includes dependencies or by explicit request FILE_TYPE_XML_CORPUS without dependencies. Note that this is a change of behavior. Right now   libabigail emits a corpus and only emits FILE_TYPE_XML_CORPUS_GROUP when it given a directory and a kernel vmlinux file.
 * archive, tar, rpm, deb, dir, or multiple files -> FILE_TYPE_XML_CORPUS_GROUP

I think for completeness sake abidw should also allow some up and down conversion

 * FILE_TYPE_XML_TU can be promoted to FILE_TYPE_XML_CORPUS (with one TU) or FILE_TYPE_XML_CORPUS_GROUP (with one TU and a group size of 1)
 * FILE_TYPE_XML_CORPUS 
     * can be promoted to FILE_TYPE_XML_CORPUS_GROUP with a group size of 1
     * can be down converted to FILE_TYPE_XML_TU by specifying the TU to extract.
 * FILE_TYPE_XML_CORPUS_GROUP
     * can be down converted to FILE_TYPE_XML_CORPUS if the size of the group is 1 or by specifying the specific corpus to extract. 
     * can be down converted to FILE_TYPE_XML_TU if the size of the group is 1 and the TU to extract is specified or by specifying the corpus and the tu within it to extract.

Lastly allowing abidw to accept its own xml formats allows additional suppressions to be applied to the abi.
Comment 10 Dodji Seketeli 2023-06-09 12:29:38 UTC
> > So, are you sure that librte_eal.so has ALL the split libraries
> > (resulting from splitting the former big library into smaller ones) as
> > dependencies?  I mean, this really looks like a particular case of the
> > general problem of being able to compare sets of ABI corpora.  Sure, I
> > understand how this particular use case is worth supporting (and I agree
> > we should support it), but in the grand scheme of things, I am wondering
> > if supporting /just/ this particular case makes sense.
> 
> If EAL does not have all dependencies listed, that would break ABI for
> applications that are linked against and only knows of this library.

OK, so that is the way you have designed the splitting.  Making a front library (i.e, EAL) have all the others (resulting from the the split) as dependencies, I just wanted to be sure.  Thank you for confirming.  So I am adding this use case to the initial, generic & straightforward one.  Thank you.

So, after reading your comments and talking privately with you, David, I have updated the specification in the branch users/dodji/PR30034 at https://sourceware.org/git/?p=libabigail.git;a=shortlog;h=refs/heads/users/dodji/PR30034.

You can navigate the changes by looking at the 3 individual commits that come on top of the level of the master branch, or you can just look at the resulting specification file which is still:


https://sourceware.org/git/?p=libabigail.git;a=blob;f=README-ABIDIFF-BINARIES-SET-SUPPORT.md;hb=refs/heads/users/dodji/PR30034

Please do let me know if I am still off by a lot here :-)

Thanks.



I
Comment 11 David Marchand 2023-06-14 12:22:27 UTC
This specification looks good to me.
The devil is in the details, so I will have to test this feature once implemented.
Comment 12 David Marchand 2023-06-23 08:30:57 UTC
To give an example on the split we are currently planning to go with:
https://patchwork.dpdk.org/project/dpdk/list/?series=28061&state=%2A&archive=both

I will attach the (reference and new) binaries to this bz.
The command we use to compare atm:

$ abidiff --suppr libabigail.abignore --no-added-syms --headers-dir1 reference/include --headers-dir2 split/include reference/lib64/librte_eal.so.23.1 split/lib64/librte_eal.so.23.2 
Functions changes summary: 17 Removed, 0 Changed, 0 Added (6 filtered out) functions
Variables changes summary: 0 Removed, 0 Changed, 0 Added variable

17 Removed functions:

  [D] 'function int rte_log(uint32_t, uint32_t, const char*, ...)'    {rte_log@@DPDK_23}
  [D] 'function bool rte_log_can_log(uint32_t, uint32_t)'    {rte_log_can_log@@DPDK_23}
  [D] 'function int rte_log_cur_msg_loglevel()'    {rte_log_cur_msg_loglevel@@DPDK_23}
  [D] 'function int rte_log_cur_msg_logtype()'    {rte_log_cur_msg_logtype@@DPDK_23}
  [D] 'function void rte_log_dump(FILE*)'    {rte_log_dump@@DPDK_23}
  [D] 'function uint32_t rte_log_get_global_level()'    {rte_log_get_global_level@@DPDK_23}
  [D] 'function int rte_log_get_level(uint32_t)'    {rte_log_get_level@@DPDK_23}
  [D] 'function FILE* rte_log_get_stream()'    {rte_log_get_stream@@DPDK_23}
  [D] 'function void rte_log_list_types(FILE*, const char*)'    {rte_log_list_types@@DPDK_23}
  [D] 'function int rte_log_register(const char*)'    {rte_log_register@@DPDK_23}
  [D] 'function int rte_log_register_type_and_pick_level(const char*, uint32_t)'    {rte_log_register_type_and_pick_level@@DPDK_23}
  [D] 'function void rte_log_set_global_level(uint32_t)'    {rte_log_set_global_level@@DPDK_23}
  [D] 'function int rte_log_set_level(uint32_t, uint32_t)'    {rte_log_set_level@@DPDK_23}
  [D] 'function int rte_log_set_level_pattern(const char*, uint32_t)'    {rte_log_set_level_pattern@@DPDK_23}
  [D] 'function int rte_log_set_level_regexp(const char*, uint32_t)'    {rte_log_set_level_regexp@@DPDK_23}
  [D] 'function int rte_openlog_stream(FILE*)'    {rte_openlog_stream@@DPDK_23}
  [D] 'function int rte_vlog(uint32_t, uint32_t, const char*, __va_list_tag*)'    {rte_vlog@@DPDK_23}
Comment 13 David Marchand 2023-06-23 08:32:18 UTC
Created attachment 14945 [details]
bz30034.tar.xz
Comment 14 David Marchand 2023-06-23 10:17:46 UTC
Created attachment 14946 [details]
bz30034.tar.xz

Forgot to include split and dependencies libraries.
Comment 15 Dodji Seketeli 2023-06-24 14:32:27 UTC
So, in the branch "PR30034" of the git repository at https://sourceware.org/git/?p=libabigail.git;a=shortlog;h=refs/heads/users/dodji/PR30034, there is an initial implementation of the requested feature materialized by the patch-set which tip is at https://sourceware.org/git/?p=libabigail.git;a=commit;h=0a7614bd93e5695157b331e7faa24fadc7ef861a.


Using the tarball that you kindly provided in attachment, (thank you for that!), here is the I am getting:

$ ls
cmd.log  current-patch.txt  libabigail.abignore  reference  split

$ time /home/dodji/git/libabigail/PR30034/build/tools/abidiff --follow-dependencies --suppr libabigail.abignore --no-added-syms --headers-dir1 reference/include --headers-dir2 split/include --deps-dir1 reference/lib64 --deps-dir2 split/lib64 reference/lib64/librte_eal.so split/lib64/librte_eal.so
Functions changes summary: 0 Removed, 0 Changed (1 filtered out), 0 Added (12 filtered out) functions
Variables changes summary: 0 Removed, 0 Changed, 0 Added variable
 
 
real    0m0,827s
user    0m0,765s
sys     0m0,060s 
$

So, basically, the abidiff gained a new --follow-dependencies option that makes it compare two ABI Corpus Groups, each one made of the union of the input binary and its dependencies found in directories specified by --deps-dir{1,2}.

You can have a serialized form of the ABI corpus  group considered by the tool by invoking:

$ time /home/dodji/git/libabigail/PR30034/build/tools/abidw --follow-dependencies --deps-dir split/lib64/ split/lib64/librte_eal.so > librte_eal.so.group.abi
 
real    0m0,581s
user    0m0,561s
sys     0m0,022s

$ ls
cmd.log  current-patch.txt  libabigail.abignore  librte_eal.so.group.abi  reference  split

$ 

I will let you test and play with the tree while I spit & polish it in the background.  Please do not hesitate to provide me with the feedback you see fit.

Thank you!
Comment 16 David Marchand 2023-06-25 08:17:16 UTC
(In reply to dodji from comment #15)
> I will let you test and play with the tree while I spit & polish it in the
> background.  Please do not hesitate to provide me with the feedback you see
> fit.
> 

Well, for now, I hit a compilation issue on fedora37 :

...
=====================================================================
	Libabigail: 2.4.0
=====================================================================

		Here is the configuration of the package:

    Prefix                                         : /usr/local
    Source code location                           : ..
    C Compiler                                     : gcc
    C++ Compiler		                   : g++
    GCC visibility attribute supported             : yes
    CXXFLAGS	   	     			   : -g -O2 -std=c++11
    Python					   : python3

 OPTIONAL FEATURES:
    C++ standard level                             : c++11
    libdw has the dwarf_getalt function            : yes
    Enable rpm support in abipkgdiff               : yes
    Enable rpm/zstd in abipkgdiff testing          : yes
    Enable abilint --show-type-use <type-id>       : no
    Enable self comparison debugging               : no
    Enable type canonicalization debugging         : no
    Enable propagated canonical type debugging	   : no
    Enable deb support in abipkgdiff               : no
    Enable GNU tar archive support in abipkgdiff   : yes
    Enable bash completion	                   : no
    Enable fedabipkgdiff                           : no
    Enable python 3				   : yes
    Enable CTF front-end                           : yes
    Enable BTF front-end                           : yes
    Enable running tests under Valgrind            : yes
    Enable build with -fsanitize=address    	   : no
    Enable build with -fsanitize=memory    	   : no
    Enable build with -fsanitize=thread    	   : no
    Enable build with -fsanitize=undefined  	   : no
    Generate html apidoc	                   : yes
    Generate html manual	                   : yes
...


../../src/abg-ctf-reader.cc: In member function 'virtual void abigail::ctf::reader::initialize(const std::string&, const std::vector<char**>&, bool, bool)':
../../src/abg-ctf-reader.cc:291:5: error: 'reset' was not declared in this scope
  291 |     reset(elf_path, debug_info_root_paths);
      |     ^~~~~
Comment 17 David Marchand 2023-06-25 08:37:09 UTC
Once I disable CTF support, it can compile fine.

So I ran DPDK full ABI check:

Before (w/o follow dependencies):
real	0m2,951s
user	0m22,876s
sys	0m3,724s

After (with follow dependencies):
real	0m21,659s
user	3m4,138s
sys	1m52,617s

I did not look too much into it yet, but it seems significantly heavier.
Comment 18 Dodji Seketeli 2023-06-26 10:07:33 UTC
(In reply to David Marchand from comment #16)
> (In reply to dodji from comment #15)
> > I will let you test and play with the tree while I spit & polish it in the
> > background.  Please do not hesitate to provide me with the feedback you see
> > fit.
> > 
> 
> Well, for now, I hit a compilation issue on fedora37 :
> 
> ...
> =====================================================================
> 	Libabigail: 2.4.0
> =====================================================================
> 
> 		Here is the configuration of the package:
> 
>     Prefix                                         : /usr/local
>     Source code location                           : ..
>     C Compiler                                     : gcc
>     C++ Compiler		                   : g++
>     GCC visibility attribute supported             : yes
>     CXXFLAGS	   	     			   : -g -O2 -std=c++11
>     Python					   : python3
> 
>  OPTIONAL FEATURES:
>     C++ standard level                             : c++11
>     libdw has the dwarf_getalt function            : yes
>     Enable rpm support in abipkgdiff               : yes
>     Enable rpm/zstd in abipkgdiff testing          : yes
>     Enable abilint --show-type-use <type-id>       : no
>     Enable self comparison debugging               : no
>     Enable type canonicalization debugging         : no
>     Enable propagated canonical type debugging	   : no
>     Enable deb support in abipkgdiff               : no
>     Enable GNU tar archive support in abipkgdiff   : yes
>     Enable bash completion	                   : no
>     Enable fedabipkgdiff                           : no
>     Enable python 3				   : yes
>     Enable CTF front-end                           : yes
>     Enable BTF front-end                           : yes
>     Enable running tests under Valgrind            : yes
>     Enable build with -fsanitize=address    	   : no
>     Enable build with -fsanitize=memory    	   : no
>     Enable build with -fsanitize=thread    	   : no
>     Enable build with -fsanitize=undefined  	   : no
>     Generate html apidoc	                   : yes
>     Generate html manual	                   : yes
> ...
> 
> 
> ../../src/abg-ctf-reader.cc: In member function 'virtual void
> abigail::ctf::reader::initialize(const std::string&, const
> std::vector<char**>&, bool, bool)':
> ../../src/abg-ctf-reader.cc:291:5: error: 'reset' was not declared in this
> scope
>   291 |     reset(elf_path, debug_info_root_paths);
>       |     ^~~~~

Oh, sorry about that.  I have fixed the error and updated the branch.  You shouldn't get this anymore.
Comment 19 Dodji Seketeli 2023-06-26 10:11:15 UTC
(In reply to David Marchand from comment #17)
> Once I disable CTF support, it can compile fine.
> 
> So I ran DPDK full ABI check:
> 
> Before (w/o follow dependencies):
> real	0m2,951s
> user	0m22,876s
> sys	0m3,724s
> 
> After (with follow dependencies):
> real	0m21,659s
> user	3m4,138s
> sys	1m52,617s
> 
> I did not look too much into it yet, but it seems significantly heavier.

Hmmh, I am wondering if you have *all* the dependencies of the library, including all system libraries and everything (like libc etc) in the dependencies directory specified by --deps-dir{1,2}.  If so, then it's normal that things are much heavier.

To know what dependencies are taken into account account by abidiff, I have just added a --list-dependencies option to abidiff in the branch.  Could you please provide me with the list of deps so that we have an idea?

Here is how the --list-dependencies option can be used:



        $ time /home/dodji/git/libabigail/PR30034/build/tools/abidiff --follow-dependencies --list-dependencies --suppr libabigail.abignore --no-added-syms --headers-dir1 reference/include --headers-dir2 split/include --deps-dir1 reference/lib64 --deps-dir2 split/lib64 reference/lib64/librte_eal.so split/lib64/librte_eal.so
        Functions changes summary: 0 Removed, 0 Changed (1 filtered out), 0 Added (12 filtered out) functions
        Variables changes summary: 0 Removed, 0 Changed, 0 Added variable
         
        /home/dodji/git/libabigail/PR30034/build/tools/.libs/lt-abidiff: dependencies of 'reference/lib64/librte_eal.so':
                reference/lib64/librte_kvargs.so.23, reference/lib64/librte_telemetry.so.23
        /home/dodji/git/libabigail/PR30034/build/tools/.libs/lt-abidiff: dependencies of 'split/lib64/librte_eal.so':
                split/lib64/librte_kvargs.so.23, split/lib64/librte_log.so.23, split/lib64/librte_telemetry.so.23
         
        real    0m0,662s
        user    0m0,638s
        sys     0m0,026s
        $

Thanks.
Comment 20 David Marchand 2023-06-26 12:13:27 UTC
(In reply to dodji from comment #18)
> > ../../src/abg-ctf-reader.cc: In member function 'virtual void
> > abigail::ctf::reader::initialize(const std::string&, const
> > std::vector<char**>&, bool, bool)':
> > ../../src/abg-ctf-reader.cc:291:5: error: 'reset' was not declared in this
> > scope
> >   291 |     reset(elf_path, debug_info_root_paths);
> >       |     ^~~~~
> 
> Oh, sorry about that.  I have fixed the error and updated the branch.  You
> shouldn't get this anymore.

Thanks, fixed.
Comment 21 David Marchand 2023-06-26 12:41:09 UTC
(In reply to dodji from comment #19)
> (In reply to David Marchand from comment #17)
> > Once I disable CTF support, it can compile fine.
> > 
> > So I ran DPDK full ABI check:
> > 
> > Before (w/o follow dependencies):
> > real	0m2,951s
> > user	0m22,876s
> > sys	0m3,724s
> > 
> > After (with follow dependencies):
> > real	0m21,659s
> > user	3m4,138s
> > sys	1m52,617s
> > 
> > I did not look too much into it yet, but it seems significantly heavier.
> 
> Hmmh, I am wondering if you have *all* the dependencies of the library,
> including all system libraries and everything (like libc etc) in the
> dependencies directory specified by --deps-dir{1,2}.  If so, then it's
> normal that things are much heavier.

I only have a DPDK installed libraries in this staging directory.


> 
> To know what dependencies are taken into account account by abidiff, I have
> just added a --list-dependencies option to abidiff in the branch.  Could you
> please provide me with the list of deps so that we have an idea?
> 
> Here is how the --list-dependencies option can be used:
> 
> 
> 
>         $ time /home/dodji/git/libabigail/PR30034/build/tools/abidiff
> --follow-dependencies --list-dependencies --suppr libabigail.abignore
> --no-added-syms --headers-dir1 reference/include --headers-dir2
> split/include --deps-dir1 reference/lib64 --deps-dir2 split/lib64
> reference/lib64/librte_eal.so split/lib64/librte_eal.so
>         Functions changes summary: 0 Removed, 0 Changed (1 filtered out), 0
> Added (12 filtered out) functions
>         Variables changes summary: 0 Removed, 0 Changed, 0 Added variable
>          
>         /home/dodji/git/libabigail/PR30034/build/tools/.libs/lt-abidiff:
> dependencies of 'reference/lib64/librte_eal.so':
>                 reference/lib64/librte_kvargs.so.23,
> reference/lib64/librte_telemetry.so.23
>         /home/dodji/git/libabigail/PR30034/build/tools/.libs/lt-abidiff:
> dependencies of 'split/lib64/librte_eal.so':
>                 split/lib64/librte_kvargs.so.23,
> split/lib64/librte_log.so.23, split/lib64/librte_telemetry.so.23
>          
>         real    0m0,662s
>         user    0m0,638s
>         sys     0m0,026s
>         $
> 
> Thanks.

And I can confirm with the --list-dependencies option that only librte_ libraries seem to be loaded (same output than above).



One thing is that I am doing the *whole* ABI check in DPDK which compares ~300 libraries.
Some of them may have multiple internal dependencies.


Looking at the net/ixgbe driver for example:

$ time .../libabigail/master/build/tools/abidiff --suppr .../devtools/libabigail.abignore --no-added-syms --headers-dir1 .../reference/usr/local/include --headers-dir2 .../split/usr/local/include --deps-dir1 .../reference/usr/local/lib64 --deps-dir2 .../split/usr/local/lib64 .../reference/usr/local/lib64/librte_net_ixgbe.so.23.1 .../split/usr/local/lib64/librte_net_ixgbe.so.23.2 

real	0m0,133s
user	0m0,121s
sys	0m0,013s

$ time .../libabigail/master/build/tools/abidiff --suppr .../devtools/libabigail.abignore --no-added-syms --headers-dir1 .../reference/usr/local/include --headers-dir2 .../split/usr/local/include --follow-dependencies --deps-dir1 .../reference/usr/local/lib64 --deps-dir2 .../split/usr/local/lib64 .../reference/usr/local/lib64/librte_net_ixgbe.so.23.1 .../split/usr/local/lib64/librte_net_ixgbe.so.23.2 
Functions changes summary: 0 Removed, 0 Changed (65 filtered out), 0 Added (24 filtered out) functions
Variables changes summary: 0 Removed, 0 Changed (2 filtered out), 0 Added variables

real	0m1,015s
user	0m0,641s
sys	0m0,372s

This library depends on many DPDK helpers / libs.

$ ldd .../split/usr/local/lib64/librte_net_ixgbe.so.23.2 | grep librte_ | wc
     17      68     568
Comment 22 Dodji Seketeli 2023-06-26 13:57:37 UTC
> One thing is that I am doing the *whole* ABI check in DPDK which compares
> ~300 libraries.
> Some of them may have multiple internal dependencies.

OK, so comparing a (almost) transitive closure for those 300 libraries might result in comparing MANY more libraries, and that might explain the additional time spent.

> Looking at the net/ixgbe driver for example:
> 
> $ time .../libabigail/master/build/tools/abidiff --suppr
> .../devtools/libabigail.abignore --no-added-syms --headers-dir1
> .../reference/usr/local/include --headers-dir2 .../split/usr/local/include
> --deps-dir1 .../reference/usr/local/lib64 --deps-dir2
> .../split/usr/local/lib64
> .../reference/usr/local/lib64/librte_net_ixgbe.so.23.1
> .../split/usr/local/lib64/librte_net_ixgbe.so.23.2 
> 
> real	0m0,133s
> user	0m0,121s
> sys	0m0,013s
> 
> $ time .../libabigail/master/build/tools/abidiff --suppr
> .../devtools/libabigail.abignore --no-added-syms --headers-dir1
> .../reference/usr/local/include --headers-dir2 .../split/usr/local/include
> --follow-dependencies --deps-dir1 .../reference/usr/local/lib64 --deps-dir2
> .../split/usr/local/lib64
> .../reference/usr/local/lib64/librte_net_ixgbe.so.23.1
> .../split/usr/local/lib64/librte_net_ixgbe.so.23.2 
> Functions changes summary: 0 Removed, 0 Changed (65 filtered out), 0 Added
> (24 filtered out) functions
> Variables changes summary: 0 Removed, 0 Changed (2 filtered out), 0 Added
> variables
> 
> real	0m1,015s
> user	0m0,641s
> sys	0m0,372s
> 
> This library depends on many DPDK helpers / libs.
> 
> $ ldd .../split/usr/local/lib64/librte_net_ixgbe.so.23.2 | grep librte_ | wc
>      17      68     568

So that's a ~7x increase of time spent for a 17x increase in the number of libraries compared.  That seams reasonable to me, at first sight.  Am I missing something?
Comment 23 David Marchand 2023-07-06 13:38:47 UTC
For the record, before I go off for some weeks.

I tried:
$ git describe
libabigail-2.3-18-g91682eaf
$ git ll | head -3
91682eaf - (HEAD -> PR30034, origin/users/dodji/try-PR30034, origin/users/dodji/PR30034) abidiff: Add --{follow,list}-dependencies & add-binaries{1,2} support (47 minutes ago) <Dodji Seketeli>
daba3581 - abidw: Add --{follow,list}-dependencies & --add-binaries= support (47 minutes ago) <Dodji Seketeli>
a72b5046 - corpus,tools-utils: Support loading a corpus, its deps & other binaries (47 minutes ago) <Dodji Seketeli>


- You may want to fix typos (invalid -to suffix in the options) in the README file:
README-ABIDIFF-BINARIES-SET-SUPPORT.md:    $ abidiff --add-binaries-to1=file2-v1           \
README-ABIDIFF-BINARIES-SET-SUPPORT.md:              --add-binaries-to2=file2-v2,file2-v1  \

- The mandatory aspect of the --add-binaries-toX options is disturbing, but it seems to work if I pass an empty list --add-binaries-to1=

- Requiring a '=' for those --add-binariesX options is surprising as other options won't accept a =.

- Now performance / runtime wise, before the DPDK split:
* before the DPDK split, with 2.2 libabigail package from fc37:
real	0m2,233s
user	0m20,185s
sys	0m2,354s
* before the DPDK split, with libabigail binaries generated against this branch and no change in abidiff runtime:
real	0m2,414s
user	0m21,802s
sys	0m3,502s

And with the DPDK split:
* using --follow-dependencies + --add-binaries-dirX:
real	0m20,935s
user	3m4,397s
sys	1m45,828s
* using --add-binaries1= --add-binaries2=librte_log.so + --add-binaries-dirX:
real	0m2,703s
user	0m24,006s
sys	0m4,458s

Which seems acceptable to me :-).
Comment 24 Dodji Seketeli 2023-07-07 08:39:05 UTC
"david.marchand at redhat dot com via Libabigail"
<libabigail@sourceware.org> a écrit:

[...]


> --- Comment #23 from David Marchand <david.marchand at redhat dot com> ---
> For the record, before I go off for some weeks.

[...]


> - You may want to fix typos (invalid -to suffix in the options) in the README
> file:

Fixed.

> - The mandatory aspect of the --add-binaries-toX options is disturbing, but it
> seems to work if I pass an empty list --add-binaries-to1=

Fixed.  The --add-binaries1= is no longer necessary if you have
specified a --add-binaries2=, for instance.

> - Requiring a '=' for those --add-binariesX options is surprising as other
> options won't accept a =.

Fixed.  The '=' is no longer necessary.

> - Now performance / runtime wise, before the DPDK split:
> * before the DPDK split, with 2.2 libabigail package from fc37:
> real    0m2,233s
> user    0m20,185s
> sys     0m2,354s
> * before the DPDK split, with libabigail binaries generated against this branch
> and no change in abidiff runtime:
> real    0m2,414s
> user    0m21,802s
> sys     0m3,502s
>
> And with the DPDK split:
> * using --follow-dependencies + --add-binaries-dirX:
> real    0m20,935s
> user    3m4,397s
> sys     1m45,828s
> * using --add-binaries1= --add-binaries2=librte_log.so + --add-binaries-dirX:
> real    0m2,703s
> user    0m24,006s
> sys     0m4,458s
>
> Which seems acceptable to me :-).

Good to hear.

I have updated the branch with the fixes.

Thank you for testing and providing this timely and valuable feedback.
Comment 25 Dodji Seketeli 2023-07-07 11:48:31 UTC
The branch https://sourceware.org/git/?p=libabigail.git;a=shortlog;h=refs/heads/users/dodji/PR30034 along with the patches it contains have been merged into the master branch of the git repository.

This feature should thus be available in the 2.4 version of libabigail.

Thank you very much for your valuable time and energy in collaborating towards this feature.
Comment 26 Florian Weimer 2023-07-07 13:55:35 UTC
How do you handle versioned symbols (in the GNU symbol versioning sense)? Prior to glibc 2.30, glibc rejected moved symbols based on the soname reference in the version information. See bug 24741.
Comment 27 Dodji Seketeli 2023-07-07 18:18:33 UTC
"fweimer at redhat dot com" <sourceware-bugzilla@sourceware.org> writes:

> https://sourceware.org/bugzilla/show_bug.cgi?id=30034
>
> Florian Weimer <fweimer at redhat dot com> changed:
>
>            What    |Removed                     |Added
> ----------------------------------------------------------------------------
>                  CC|                            |fweimer at redhat dot com
>
> --- Comment #26 from Florian Weimer <fweimer at redhat dot com> ---
> How do you handle versioned symbols (in the GNU symbol versioning sense)? Prior
> to glibc 2.30, glibc rejected moved symbols based on the soname reference in
> the version information. See bug 24741.

When comparing two corpus groups, Libabigail doesn't care if one symbol
moved from an ABI corpus to another, at least at the moment.  This is
regardless if the symbol is versioned or not.

So we shouldn't hit issues similar to #24741, unless there is a bug, of
course.

Thank you for the heads-up.