[RFC] Allow parallel multifile with -p -e
Tom de Vries
tdevries@suse.de
Fri Mar 26 16:55:16 GMT 2021
On 3/26/21 5:47 PM, Jakub Jelinek wrote:
> On Fri, Mar 26, 2021 at 05:40:51PM +0100, Tom de Vries wrote:
>> This gives us reproducible compression:
>> ...
>> $ ls -la j1/*
>> -rwxr-xr-x 1 vries users 11432 Mar 26 17:16 j1/1
>> -rwxr-xr-x 1 vries users 11432 Mar 26 17:16 j1/2
>> -rwxr-xr-x 1 vries users 807376 Mar 26 17:16 j1/3
>> -rwxr-xr-x 1 vries users 807376 Mar 26 17:16 j1/4
>> -rw-r--r-- 1 vries users 64543 Mar 26 17:16 j1/5
>> $ ls -la j4/*
>> -rwxr-xr-x 1 vries users 11432 Mar 26 17:16 j4/1
>> -rwxr-xr-x 1 vries users 11432 Mar 26 17:16 j4/2
>> -rwxr-xr-x 1 vries users 807376 Mar 26 17:16 j4/3
>> -rwxr-xr-x 1 vries users 807376 Mar 26 17:16 j4/4
>> -rw-r--r-- 1 vries users 64543 Mar 26 17:16 j4/5
>> ...
>>
>> But it doesn't give reproducible results:
>> ...
>> $ md5sum j1/*
>> e6e655f7b5d1078672c8b0da99ab8c41 j1/1
>> e6e655f7b5d1078672c8b0da99ab8c41 j1/2
>> d833aa3ad6ad35597e1b7d0635b401cf j1/3
>> d833aa3ad6ad35597e1b7d0635b401cf j1/4
>> d5282aa9d065f1d00fd7a46c54ebde8d j1/5
>> $ md5sum j4/*
>> de1645ce60bba6f345b2334825deb01f j4/1
>> de1645ce60bba6f345b2334825deb01f j4/2
>> ac2f16c50cf3d31be1f42f35ced4a091 j4/3
>> ac2f16c50cf3d31be1f42f35ced4a091 j4/4
>> 7fc3cd2c2514c8bf1f23348a27025b8d j4/5
>> ...
>>
>> The contributions to the temporary multifile sections are written in a
>> nondeterministic order, so the multifile layout differs between runs,
>> and so do the files referring to the multifile.
>
> What I meant is that each fork should use different temporary filenames
> for the multifiles; once all children are done, merge them (how depends on
> exactly how the work is distributed among the forks: if e.g. for 4 forks
> the first fork gets the first quarter of the files, the second the second
> quarter, etc., then just merge them in that order; otherwise more work
> would be needed to make the merging reproducible).
Hi,
yes, I understood your comments in bugzilla. I just wanted to see how
far I got _without_ solving the reproducibility problem.
> Then generate the multifile in a single process, and then again work on
> the individual files against the multifile in multiple forks.
Yeah, that bit I haven't gotten to yet, but that doesn't look very
difficult.
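
To illustrate the ordering issue discussed above, here is a minimal sketch
in Python (not dwz's C code). It assumes the contiguous split suggested
above, where fork i gets the i-th slice of the input files. All function
names are hypothetical, and a "contribution" is modelled simply as the
NUL-terminated input file names rather than real DWARF sections.

```python
import hashlib


def split_into_chunks(files, nforks):
    """Give fork i the i-th contiguous slice of the input list."""
    size = -(-len(files) // nforks)  # ceiling division
    return [files[i * size:(i + 1) * size] for i in range(nforks)]


def child_contribution(chunk):
    """Stand-in for the data one fork adds to the temporary multifile."""
    return b"".join(name.encode() + b"\0" for name in chunk)


def merge_shared(chunks, completion_order):
    """All forks append to one shared file, in whatever order they finish."""
    return b"".join(child_contribution(chunks[i]) for i in completion_order)


def merge_per_child(chunks, completion_order):
    """Each fork writes its own temporary file; the parent merges them
    by fork index once all children are done."""
    parts = {}
    for i in completion_order:  # completion order still varies ...
        parts[i] = child_contribution(chunks[i])
    # ... but the merge order is fixed, so the result does not depend on it.
    return b"".join(parts[i] for i in range(len(chunks)))


if __name__ == "__main__":
    chunks = split_into_chunks(["1", "2", "3", "4"], 4)
    for order in ([0, 1, 2, 3], [3, 1, 0, 2]):
        shared = hashlib.md5(merge_shared(chunks, order)).hexdigest()
        fixed = hashlib.md5(merge_per_child(chunks, order)).hexdigest()
        print(order, "shared:", shared, "per-child:", fixed)
```

With four inputs and four forks, merge_per_child yields the same bytes for
both completion orders, while merge_shared does not. If the work is not
split into contiguous slices, the parent would need some other stable key
to order the merge.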
Thanks,
- Tom