[RFC] Allow parallel multifile with -p -e
Jakub Jelinek
jakub@redhat.com
Fri Mar 26 16:47:38 GMT 2021
On Fri, Mar 26, 2021 at 05:40:51PM +0100, Tom de Vries wrote:
> This gives us reproducible compression:
> ...
> $ ls -la j1/*
> -rwxr-xr-x 1 vries users 11432 Mar 26 17:16 j1/1
> -rwxr-xr-x 1 vries users 11432 Mar 26 17:16 j1/2
> -rwxr-xr-x 1 vries users 807376 Mar 26 17:16 j1/3
> -rwxr-xr-x 1 vries users 807376 Mar 26 17:16 j1/4
> -rw-r--r-- 1 vries users 64543 Mar 26 17:16 j1/5
> $ ls -la j4/*
> -rwxr-xr-x 1 vries users 11432 Mar 26 17:16 j4/1
> -rwxr-xr-x 1 vries users 11432 Mar 26 17:16 j4/2
> -rwxr-xr-x 1 vries users 807376 Mar 26 17:16 j4/3
> -rwxr-xr-x 1 vries users 807376 Mar 26 17:16 j4/4
> -rw-r--r-- 1 vries users 64543 Mar 26 17:16 j4/5
> ...
>
> But it doesn't give reproducible results:
> ...
> $ md5sum j1/*
> e6e655f7b5d1078672c8b0da99ab8c41 j1/1
> e6e655f7b5d1078672c8b0da99ab8c41 j1/2
> d833aa3ad6ad35597e1b7d0635b401cf j1/3
> d833aa3ad6ad35597e1b7d0635b401cf j1/4
> d5282aa9d065f1d00fd7a46c54ebde8d j1/5
> $ md5sum j4/*
> de1645ce60bba6f345b2334825deb01f j4/1
> de1645ce60bba6f345b2334825deb01f j4/2
> ac2f16c50cf3d31be1f42f35ced4a091 j4/3
> ac2f16c50cf3d31be1f42f35ced4a091 j4/4
> 7fc3cd2c2514c8bf1f23348a27025b8d j4/5
> ...
>
> The temporary multifile section contributions happen in random
> order, so consequently the multifile layout will be different, and the
> files referring to the multifile will be different.
What I meant is that each fork should use different temporary filenames
for the multifiles. Once all children are done, merge them. How to do
that depends on how exactly the work is distributed among the forks: if
e.g. with 4 forks the first fork gets the first quarter of the files, the
second the second quarter etc., then just merge them in that order;
otherwise more work would be needed to make the merging reproducible.
Then generate the multifile in a single process, and then again
in multiple forks work on the individual files against the multifile.
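
A minimal sketch of the ordering idea, in Python rather than dwz's C, and
not dwz code: the input files are split into contiguous chunks, each
worker writes only to its own temporary file, and the parent concatenates
those files in worker-index order. All names here (split_contiguous,
worker, merge_in_order, run, the "multifile.tmp.N" filenames) are
hypothetical, and the forks are simulated by calling worker() in an
arbitrary "completion order" instead of really forking.

```python
import os


def split_contiguous(files, nworkers):
    """Split files into nworkers contiguous chunks, preserving input order."""
    base, extra = divmod(len(files), nworkers)
    chunks, start = [], 0
    for i in range(nworkers):
        end = start + base + (1 if i < extra else 0)
        chunks.append(files[start:end])
        start = end
    return chunks


def worker(chunk, tmp_path):
    """Stand-in for one fork: write this chunk's contribution to its own file."""
    with open(tmp_path, "w") as f:
        for name in chunk:
            f.write(f"contribution from {name}\n")


def merge_in_order(tmp_paths, out_path):
    """Concatenate the per-worker files in worker-index order."""
    with open(out_path, "wb") as out:
        for path in tmp_paths:
            with open(path, "rb") as f:
                out.write(f.read())


def run(files, nworkers, completion_order, workdir):
    """Run all workers in the given (arbitrary) order, then merge."""
    chunks = split_contiguous(files, nworkers)
    tmp_paths = [os.path.join(workdir, f"multifile.tmp.{i}")
                 for i in range(nworkers)]
    for i in completion_order:  # simulates children finishing in any order
        worker(chunks[i], tmp_paths[i])
    merged = os.path.join(workdir, "multifile.merged")
    merge_in_order(tmp_paths, merged)
    with open(merged, "rb") as f:
        return f.read()
```

Because the merge order depends only on the worker index and not on when
each worker finished, run(files, 4, [0, 1, 2, 3], d) and
run(files, 4, [3, 1, 0, 2], d) return the same bytes. With any other
distribution of files among the workers, the merge step would need extra
bookkeeping to restore a fixed order.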
Jakub