bzip2 next steps - goals for 1.0.9
Hi all,
After the fire drills that were necessary to get the bzip2 1.0.7 "Help,
a security issue!" and bzip2 1.0.8 "Oh, wait, maybe that was a little
too secure!" releases out, I hope we will get a lot more time to do a
bzip2 1.0.9 release.
But having been forced to do two releases was also good. We got the old
website, including all old releases, onto sourceware.org. There is now a
git repository, also including the code for older releases. The whole
release process is now automated. Updates to the code, website and
manual are now all synchronized. We have the start of a more
comprehensive testsuite, with buildbots on various architectures
running it on every commit. bzip2 is now part of oss-fuzz, which pulls
from the new git repository. And we did manage to integrate several
changes from distros and other forks/downstream back into the upstream
bzip2 sources.
Since we have more time (I think there is no rush to push out 1.0.9
quickly), I think we want to do the following things:
- Extend the testsuite with more bz2 files that show interesting corner
cases (also as new seeds for fuzzers).
- Add more tests than just plain compress and decompress targets.
In particular I believe -f has some subtle behavior.
But it would also be good to have at least tests for all the libbz2
interfaces (some of which are not official/documented, see below).
- Add a Windows (cross) build and test (under wine) to the buildbot
as the major non-unix build that is supported.
- Provide more fuzzer targets and make them part of the upstream code
with a small wrapper so they can double as regression tests.
At least create targets for the low level BZ2_bzCompress,
BZ2_bzDecompress, high-level BZ2_bzRead (BZ2_bzReadGetUnused),
BZ2_bzWrite (BZ2_bzWriteFinish) and utility BZ2_bzBuffToBuffCompress,
BZ2_bzBuffToBuffDecompress functions in various configurations
and in compress/decompress mode to double check we can decompress
anything we compress ourselves.
- Update the manual to at least include documentation for the zlib
compatibility functions (see below). And double check it for any
other changes we made since the project moved to sourceware.
- Figure out how to produce the pdf version of the manual on more
setups. Currently it works perfectly on my RHEL7 setup, but some of
the buildbots cannot do a make dist because they don't produce a
correct pdf version (the html variant seems fine though).
- Some distros have some fixes for the man pages, mainly symlinks for
some binaries. Check whether those should be upstreamed.
- Related, it would be good to generate the man page from the manual
again. Or find some other arrangement so that one or the other
is the main copy from which the other is generated/imported.
- We didn't pick up the Debian patch to add O_EXCL/O_CLOEXEC because
it seemed not portable (if we had a cross windows builder it would
probably have shown that the BZ_UNIX guards were incorrect). This
patch should probably be split in two.
- The O_EXCL part for bzip2recover should be easy to make work exactly
as it does in bzip2. It would be good not to overwrite the output
file if it already exists.
- The O_CLOEXEC part, done through the fopen "e" mode, is trickier
though. It is only for the zlib compatible functions BZ2_bzopen and
BZ2_bzdopen. But those are not officially supported. Or as the
manual says: "These functions are not (yet) officially part of the
library, and are minimally documented here. If they break, you get
to keep all the pieces." And, worse, they do not seem to be really
API compatible with the zlib versions. In particular the zlib
variants actually pass through the "e" mode (!) so you don't need
to do that inside those functions themselves...
- We really should come up with a good path forward for the SONAME
mess. Currently Debian based or freedesktop SDK based distros/builds
use the current upstream libbz2.so.1.0 name. While Fedora and
openSUSE based distros will use the "sane" libbz2.so.1 name.
This means programs built against libbz2.so on one of those variants
don't run on the other. And so updating the upstream default
will break backward compatibility for one or the other.
I hope we can come up with an upgrade path that makes it possible
to run both kinds of binaries against a future libbz2.
My idea is simply to switch to the "sane" libbz2.so.1 name,
but also provide a wrapper library with the old libbz2.so.1.0 name.
Which might be as simple as:
gcc -shared -Wl,-soname -Wl,libbz2.so.1.0 -o libbz2.so.1.0 -lbz2
(where that -lbz2 linked against is the "sane" libbz2.so.1)
But there might be subtle issues with that I haven't thought about.
- Related, we probably also should tweak the symbol visibility so
libbz2 only exports functions/symbols that are actually supported.
- Nobody seems to really like our current build system and the Makefile
(fragments) you might have to hand edit. It is really basic and
simple and should work almost everywhere. But some of the above might
be helped with a slightly better build system. The best candidate
seems to be the autotools system that is already in use by various
distros and by the Nix Packages collection (including for building
cross MinGW packages), since it was explicitly created to be
integrated upstream:
http://ftp.suse.com/pub/people/sbrabec/bzip2/README.autotools
(We will have to update the manual though, which currently says that
autotools aren't necessary, but I think just making cross compiling
to Windows easy proves the original author was wrong, sorry Julian!)
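
To make the "decompress anything we compress ourselves" idea from the
fuzzer-target item above a bit more concrete, here is a minimal sketch
of such a round-trip check. It is an assumption-laden illustration, not
a proposed upstream target: it is written in Python against the
standard bz2 module (which wraps libbz2) rather than against the C API,
the one-shot calls only stand in for the BZ2_bzBuffToBuff* functions,
the incremental objects only stand in for BZ2_bzCompress and
BZ2_bzDecompress, and all function names are made up for this example.

```python
import bz2
import random


def roundtrip_one_shot(data: bytes, level: int) -> bytes:
    """Compress and decompress with one call each (buffer-to-buffer style)."""
    return bz2.decompress(bz2.compress(data, compresslevel=level))


def roundtrip_streaming(data: bytes, level: int, chunk: int) -> bytes:
    """Push the data through the incremental compressor and decompressor
    in pieces of `chunk` bytes, the way a streaming caller would."""
    comp = bz2.BZ2Compressor(level)
    packed = b"".join(
        comp.compress(data[i:i + chunk]) for i in range(0, len(data), chunk)
    )
    packed += comp.flush()

    decomp = bz2.BZ2Decompressor()
    out = b"".join(
        decomp.decompress(packed[i:i + chunk])
        for i in range(0, len(packed), chunk)
    )
    # The decompressor must have seen a complete stream.
    assert decomp.eof
    return out


def check_roundtrip(data: bytes) -> bool:
    """Return True if every configuration reproduces `data` exactly."""
    for level in (1, 5, 9):  # block sizes 100k, 500k, 900k
        if roundtrip_one_shot(data, level) != data:
            return False
        for chunk in (1, 7, 4096):  # tiny, odd and large feed sizes
            if roundtrip_streaming(data, level, chunk) != data:
                return False
    return True


def sample_inputs():
    """A few hypothetical seed inputs: empty, tiny, long run, noise."""
    rng = random.Random(0)  # fixed seed so the check is reproducible
    noise = bytes(rng.getrandbits(8) for _ in range(2000))
    return [b"", b"a", b"a" * 10000, noise]
```

A fuzzer would call check_roundtrip() on the bytes it generates, and a
regression test could simply loop over sample_inputs() (or over the
seed corpus) and fail on the first False.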
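
For the O_EXCL item above, the behavior we want from bzip2recover is
"create the output file, but fail if it already exists". The sketch
below only illustrates that semantics; it is Python rather than the C
that bzip2recover would actually need, the helper name is hypothetical,
and the 0600 mode is an assumption made for this example.

```python
import os


def create_output_exclusive(path: str):
    """Open `path` for binary writing, refusing to touch an existing file.

    O_CREAT together with O_EXCL makes the create-or-fail decision
    atomic, so there is no window between "does it exist?" and "open it".
    """
    flags = os.O_WRONLY | os.O_CREAT | os.O_EXCL
    # O_BINARY only exists on Windows; elsewhere this adds nothing.
    flags |= getattr(os, "O_BINARY", 0)
    # Raises FileExistsError if `path` already exists.
    fd = os.open(path, flags, 0o600)
    return os.fdopen(fd, "wb")
```

Calling create_output_exclusive() twice with the same path raises
FileExistsError the second time and leaves the first file untouched,
which is the "do not overwrite" behavior described above.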
Please let me know if I missed any other obvious goals for bzip2 1.0.9.
Cheers,
Mark