[PATCH] use gzip compression with --no-name to create identical step backups
Bruno Tarquini
btarquini@gmail.com
Sat Mar 20 10:11:00 GMT 2010
On 19/03/2010 19:14, Yann E. MORIN wrote:
> Bruno, All,
>
Yann,
Thanks for this long reply, and sorry for my bad language.
> On Thursday 18 March 2010 17:47:27 Bruno Tarquini wrote:
>
>> Use gzip compression with --no-name to create identical step backups,
>> so that backups created from an identical directory are really identical.
>> Calling gzip by passing '-z' to tar is not good enough: by default, gzip
>> saves the archive's mtime as metadata (--name), which prevents us from
>> generating the exact same state when no files have been modified between
>> the two steps.
>> Later, by removing the duplicate files, it will be possible to reduce
>> the backup directory to around 20% of its current size.
>>
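The effect described in the patch can be checked with a small sketch. It
assumes a POSIX shell with tar, gzip, cmp, touch and mktemp available. All
paths (dummy, state.tar, step1.tar.gz, ...) are made up for illustration and
are not the names crosstool-NG uses.

```shell
# Hypothetical demo, not crosstool-NG code: every path below is illustrative.
work=$(mktemp -d)
mkdir "$work/dummy"
echo "some content" > "$work/dummy/file.txt"

# 1) By default, gzip stores the input file's name and mtime in its header,
#    so the same bytes compressed after a 'touch' give a different .gz file.
( cd "$work" && tar cf state.tar dummy )
touch -t 201003180000 "$work/state.tar"
gzip -c "$work/state.tar" > "$work/default1.gz"
touch -t 201003190000 "$work/state.tar"
gzip -c "$work/state.tar" > "$work/default2.gz"
if cmp -s "$work/default1.gz" "$work/default2.gz"; then
    default_same=yes
else
    default_same=no
fi

# 2) With -n (--no-name), neither the name nor the timestamp is stored,
#    so two backups of an unchanged directory are byte-identical.
( cd "$work" && tar cf - dummy | gzip -n > step1.tar.gz )
( cd "$work" && tar cf - dummy | gzip -n > step2.tar.gz )
if cmp -s "$work/step1.tar.gz" "$work/step2.tar.gz"; then
    noname_same=yes
else
    noname_same=no
fi

echo "default gzip output identical: $default_same"
echo "gzip -n output identical:      $noname_same"
```

The first half uses a regular file on purpose: what gzip records when it reads
from a pipe (as with tar -z) depends on the gzip version, so the pipe case is
only shown with -n, where the result does not depend on it.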
> Oh! You mean you want to (hard|sym)link the tarballs when their contents
> are the same, to save space, right?
>
That was the idea. I used fdupes plus a small script to convert all duplicates
to hard links:
./libelf/cc_core_shared_prefix_dir.tar.gz
|== size: 154, files: 42, lost: 6314
./libelf/prefix_dir.tar.gz
|== size: 1533863, files: 5, lost: 6135452
./cc_core_pass_2/cc_core_shared_prefix_dir.tar.gz
|== size: 376, files: 7, lost: 2256
./cc_core_pass_2/cc_core_static_prefix_dir.tar.gz
|== size: 6010906, files: 13, lost: 72130872
./cc_core_pass_2/prefix_dir.tar.gz
|== size: 7476474, files: 2, lost: 7476474
./libelf_target/cc_core_shared_prefix_dir.tar.gz
|== size: 6774790, files: 10, lost: 60973110
./libelf_target/prefix_dir.tar.gz
|== size: 37407896, files: 8, lost: 261855272
./elf2flt/prefix_dir.tar.gz
|== size: 6950899, files: 3, lost: 13901798
./kernel_headers/prefix_dir.tar.gz
|== size: 3607, files: 2, lost: 3607
=== Total lost size ===
422485155
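For illustration only, here is a minimal sketch of the hard-linking step. It
is not the fdupes-based script that produced the listing above: it swaps in
sha256sum (GNU coreutils) to find candidates, and the function name
link_duplicates is hypothetical. It assumes file names without newlines or
backslashes, and a shell whose test builtin supports -ef (bash and dash do).

```shell
# Hypothetical helper, not the script used above: replace byte-identical
# *.tar.gz files under a directory with hard links to a single copy.
link_duplicates() {
    dir=$1
    prev_sum=
    prev_path=
    # Sorting by checksum puts identical files on consecutive lines.
    find "$dir" -type f -name '*.tar.gz' -exec sha256sum {} + | sort |
    while read -r sum path; do
        if [ "$sum" = "$prev_sum" ] && cmp -s "$prev_path" "$path"; then
            # Same checksum and same bytes: link, unless already linked.
            [ "$prev_path" -ef "$path" ] || ln -f "$prev_path" "$path"
        else
            prev_sum=$sum
            prev_path=$path
        fi
    done
}
```

It would be called on the state directory, for example
link_duplicates armeb-unknown-linux-gnueabi/state, and only saves space if
the tarballs of identical steps are themselves byte-identical.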
> Eg. the prefix backup tarball has the same content for cc_core_pass_1 and
> for elf2flt.
>
cc_core_shared_prefix_dir.tar.gz and cc_core_static_prefix_dir.tar.gz
are good examples: around ten duplicates of 6-7M each.
> What you have to know is that the backup tarballs for each step are only
> for debug purposes, when it is needed to restart a build when trying to
> fix an issue in a specific step, to avoid re-building all from scratch
> every time. And when the toolchain finally gets built, you can safely
> get rid of those backup tarballs! Also, rebuilding the toolchain without
> saving the tarballs will make for a clean build.log.
>
That's what I realized afterwards: removing duplicates after the build is
useless, since all backups could simply be removed. And removing duplicates
during the build process is far too complicated for a simple debug feature,
as you said. Ideally ct-NG would know about empty steps and simply clone the
previous step, but that means more changes and I see no good reason to do
it, except to show "skipping empty step" to users :-)
> In fact, even I seldom use that feature nowadays.
>
> Just for information, here is the space it takes to store the backup
> tarballs for the armeb-unknown-linux-gnueabi sample:
>
> # du -hs src/ tarballs/ tools/ \
> armeb-unknown-linux-gnueabi/state/ \
> armeb-unknown-linux-gnueabi/build/ \
> .
> 1.5G src/
> 184M tarballs/
> 28K tools/
> 1.4G armeb-unknown-linux-gnueabi/state/
> 1.6G armeb-unknown-linux-gnueabi/build/
> 4.5G .
>
> Which means that the backup tarballs are just less than one third of the
> total space used to build the toolchain (not counting the 450MiB used by
> the toolchain itself, in which case the backup tarballs account for
> roughly 1/4th of the total space).
>
When you do a minimalist build (no target tools...), you have lots of
empty steps, and so a higher ratio of duplicated backups.
> Are you *that* short of space that you need this feature when *debugging*
> crosstool-NG ?
>
I never have enough space :-)
> Which basically boils down to: what do you want the backup tarballs for?
>
In fact, I used them to create a kind of binary package (by diffing
successive steps): linux-headers, libc... I even managed to import some of
them into the target with a package manager.
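As a rough idea of what "diffing" two step backups can look like, here is a
minimal sketch. It is not my actual script: the function name
new_files_between is hypothetical, and it only compares path names, so a
file that changed between the two steps without being renamed is not
reported.

```shell
# Hypothetical helper: list the paths present in the newer step backup
# but absent from the older one. Only names are compared, not contents.
new_files_between() {
    old=$1
    new=$2
    old_list=$(mktemp)
    new_list=$(mktemp)
    tar tzf "$old" | sort > "$old_list"
    tar tzf "$new" | sort > "$new_list"
    # comm -13 keeps the lines that appear only in the second (newer) list.
    comm -13 "$old_list" "$new_list"
    rm -f "$old_list" "$new_list"
}
```

The resulting list is what a step added to the prefix, which is roughly the
content of one "package".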
>
>> At the same time, we pass -3 to gzip as it is said in Kbuild help.
>>
> Not needed. -3 is the default for gzip compression.
>
Well, judging from man gzip and from:
tar czf default.tar.gz dummy && zcat default.tar.gz | gzip -3 > gzip3.tar.gz \
  && zcat default.tar.gz | gzip -6 > gzip6.tar.gz && ls -l *.gz
-rw-r--r-- 1 bruno users 39824 mars 20 10:30 default.tar.gz
-rw-r--r-- 1 bruno users 43084 mars 20 10:30 gzip3.tar.gz
-rw-r--r-- 1 bruno users 39824 mars 20 10:30 gzip6.tar.gz
it seems to be -6 now.
> Regards,
> Yann E. MORIN.
>
>
Regards,
Bruno.
--
For unsubscribe information see http://sourceware.org/lists.html#faq