Bug 19081 - Make abipkgdiff be multi-threaded
Summary: Make abipkgdiff be multi-threaded
Status: RESOLVED FIXED
Alias: None
Product: libabigail
Classification: Unclassified
Component: default (show other bugs)
Version: unspecified
: P2 enhancement
Target Milestone: ---
Assignee: Ondrej Oprala
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2015-10-07 09:34 UTC by Dodji Seketeli
Modified: 2015-12-03 11:07 UTC (History)
3 users (show)

See Also:
Host:
Target:
Build:
Last reconfirmed:


Attachments
0001-Bug-19081-abipkgdiff-parallelization.patch (8.74 KB, text/x-patch)
2015-10-27 09:14 UTC, Ondrej Oprala
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Dodji Seketeli 2015-10-07 09:34:38 UTC
abipkgdiff can use pthreads to run comparison on pairs of libraries (each pair contains two versions of the same library) carried by two packages, just like what test-read-dwarf.cc does already.

That could potentially greatly speed up abipkgdiff, depending on the number of physical cores the system has.
Comment 1 Dodji Seketeli 2015-10-07 10:00:24 UTC
Note that for each pair of binaries to compare in a pair of packages, the reporting phase should not be done in parallel with the reporting phases of the other pairs of binaries.

All the reporting of all pair of binaries should happen sequentially after the reading, comparison and diff categorization (the phase where the system learns about what diff report to filter out) happened in parallel mode.

So we might need to add entry points in the library to perform the diff categorization separately from the reporting itself.
Comment 2 Dodji Seketeli 2015-10-07 10:27:04 UTC
Note that the .abignore file to take in account is the one present in the latest package of the two being compared.

If the first package has an .abignore file and not the second one, then the .abignore file of the first package is ignored.

Note also that additional suppression specification files can be passed to the command line of the tool.  They are going to be taken in account in *addition* to the *.abignore file found in the second package.
Comment 3 Dodji Seketeli 2015-10-07 10:29:27 UTC
(In reply to dodji from comment #2)
> Note that the .abignore file to take in account is the one present in the
> latest package of the two being compared.
> 
> If the first package has an .abignore file and not the second one, then the
> .abignore file of the first package is ignored.
> 
> Note also that additional suppression specification files can be passed to
> the command line of the tool.  They are going to be taken in account in
> *addition* to the *.abignore file found in the second package.

Forget this comment, please.  It was meant to be for bug https://sourceware.org/bugzilla/show_bug.cgi?id=19082, not this one, sorry.
Comment 4 Ben Woodard 2015-10-07 20:05:58 UTC
Something that sounds related and might be the same thing which would be helpful in abidiff would be to make the reading of the libraries run in parallel. This could be a great optimization especially if you allowed multiple comparisons.

abidiff dir1/libfoo.so dir2/libfoo.so dir3/libfoo.so dir4/libfoo.so

The reading of the ABI could be done in parallel and then whether the comparison is done in parallel or sequentially the output would have to be serialized. However, doing it this way would need to have two variations one which would generate in effect:

abidiff dir1/libfoo.so dir2/libfoo.so
abidiff dir1/libfoo.so dir3/libfoo.so
abidiff dir1/libfoo.so dir4/libfoo.so

without having to read dir1/libfoo.so three times and the other would be more like:

abidiff --combinations dir1/libfoo.so dir2/libfoo.so dir3/libfoo.so dir4/libfoo.so

abidiff dir1/libfoo.so dir2/libfoo.so
abidiff dir1/libfoo.so dir3/libfoo.so
abidiff dir1/libfoo.so dir4/libfoo.so
abidiff dir2/libfoo.so dir3/libfoo.so
abidiff dir2/libfoo.so dir4/libfoo.so
abidiff dir3/libfoo.so dir4/libfoo.so

and would only read and process dir[1-3]/libfoo.so one time each rather than three times each.
Comment 5 Ondrej Oprala 2015-10-27 09:14:29 UTC
Created attachment 8751 [details]
0001-Bug-19081-abipkgdiff-parallelization.patch

So here's my take on this issue.
I've been using the rpms of vtk to test this throughout.
Prior to any threading attempts, running this:
abipkgdiff vtk-6.1.0-26.fc23.x86_64.rpm vtk-6.2.0-8.fc23.x86_64.rpm --d1 
vtk-debuginfo-6.1.0-26.fc23.x86_64.rpm --d2 
vtk-debuginfo-6.2.0-8.fc23.x86_64.rpm

took a bit less than 4 mins:
230,08s user 6,60s system 101% cpu 3:53,03 total

threads make the numbers much nicer \o/
381,17s user 14,15s system 492% cpu 1:20,23 total

I tried to be as verbose as possible with the commit message, but if you 
think I missed
anything anywhere, please tell me, I want this to be as good as it can get.

Helgrind only complains about elf_version (and elf_begin), which we 
already discussed on the IRC.
Didn't try testing it with anything really huge yet like Ben's libgcj etc...
Any and all comments would be very appreciated.

On 07.10.2015 11:34, dodji at redhat dot com wrote:
> https://sourceware.org/bugzilla/show_bug.cgi?id=19081
>
>              Bug ID: 19081
>             Summary: Make abipkgdiff be multi-threaded
>             Product: libabigail
>             Version: unspecified
>              Status: NEW
>            Severity: enhancement
>            Priority: P2
>           Component: default
>            Assignee: dodji at redhat dot com
>            Reporter: dodji at redhat dot com
>                  CC: libabigail at sourceware dot org
>    Target Milestone: ---
>
> abipkgdiff can use pthreads to run comparison on pairs of libraries (each pair
> contains two versions of the same library) carried by two packages, just like
> what test-read-dwarf.cc does already.
>
> That could potentially greatly speed up abipkgdiff, depending on the number of
> physical cores the system has.
>
Comment 6 Dodji Seketeli 2015-11-04 11:54:48 UTC
Thanks!

I have started to review the patch on the mailing of the project.

You can read the review at https://sourceware.org/ml/libabigail/2015-q4/msg00143.html.

Thanks again for spending time on this.
Comment 7 Dodji Seketeli 2015-11-04 11:55:59 UTC
(In reply to dodji from comment #6)

> I have started to review the patch on the mailing of the project.
> You can read the review at
> https://sourceware.org/ml/libabigail/2015-q4/msg00143.html.

So let's continue the discussion on the mailing list.

Cheers.
Comment 8 Ondrej Oprala 2015-12-03 11:07:54 UTC
Resolved by upstream commit 17b04f2e