bzip2-1.0.2

author Julian Seward <jseward@acm.org>

Sun, 30 Dec 2001 21:13:13 +0000 (22:13 +0100)

committer Julian Seward <jseward@acm.org>

Sun, 30 Dec 2001 21:13:13 +0000 (22:13 +0100)
diff --git a/CHANGES b/CHANGES

index ecaf4170ef889c5cf59b41c59f9264d691b9061d..d984395436691945db75883f37b17ef98d28e93d 100644
--- a/CHANGES
+++ b/CHANGES
@@ -134,7 +134,7 @@ Several minor bugfixes and enhancements:
  
  * Advance the version number to 1.0, so as to counteract the
    (false-in-this-case) impression some people have that programs 
-  with version numbers less than 1.0 are in someway, experimental,
+  with version numbers less than 1.0 are in some way, experimental,
    pre-release versions.
  
  * Create an initial Makefile-libbz2_so to build a shared library.
@@ -165,3 +165,89 @@ There are no functionality changes or bug fixes relative to version
  1.0.0.  This is just a documentation update + a fix for minor Win32
  build problems.  For almost everyone, upgrading from 1.0.0 to 1.0.1 is
  utterly pointless.  Don't bother.
+
+
+1.0.2
+~~~~~
+A bug fix release, addressing various minor issues which have appeared
+in the 18 or so months since 1.0.1 was released.  Most of the fixes
+are to do with file-handling or documentation bugs.  To the best of my
+knowledge, there have been no data-loss-causing bugs reported in the
+compression/decompression engine of 1.0.0 or 1.0.1.
+
+Note that this release does not improve the rather crude build system
+for Unix platforms.  The general plan here is to autoconfiscate/
+libtoolise 1.0.2 soon after release, and release the result as 1.1.0
+or perhaps 1.2.0.  That, however, is still just a plan at this point.
+
+Here are the changes in 1.0.2.  Bug-reporters and/or patch-senders in
+parentheses.
+
+* Fix an infinite segfault loop in 1.0.1 when a directory is
+  encountered in -f (force) mode.
+     (Trond Eivind Glomsrod, Nicholas Nethercote, Volker Schmidt)
+
+* Avoid double fclose() of output file on certain I/O error paths.
+     (Solar Designer)
+
+* Don't fail with internal error 1007 when fed a long stream (> 48MB)
+  of byte 251.  Also print useful message suggesting that 1007s may be
+  caused by bad memory.
+     (noticed by Juan Pedro Vallejo, fixed by me)
+
+* Fix uninitialised variable silly bug in demo prog dlltest.c.
+     (Jorj Bauer)
+
+* Remove 512-MB limitation on recovered file size for bzip2recover
+  on selected platforms which support 64-bit ints.  At the moment
+  all GCC supported platforms, and Win32.
+     (me, Alson van der Meulen)
+
+* Hard-code header byte values, to give correct operation on platforms
+  using EBCDIC as their native character set (IBM's OS/390).
+     (Leland Lucius)
+
+* Copy file access times correctly.
+     (Marty Leisner)
+
+* Add distclean and check targets to Makefile.
+     (Michael Carmack)
+
+* Parameterise use of ar and ranlib in Makefile.  Also add $(LDFLAGS).
+     (Rich Ireland, Bo Thorsen)
+
+* Pass -p (create parent dirs as needed) to mkdir during make install.
+     (Jeremy Fusco)
+
+* Dereference symlinks when copying file permissions in -f mode.
+     (Volker Schmidt)
+
+* Majorly simplify implementation of uInt64_qrm10.
+     (Bo Lindbergh)
+
+* Check the input file still exists before deleting the output one,
+  when aborting in cleanUpAndFail().
+     (Joerg Prante, Robert Linden, Matthias Krings)
+
+Also a bunch of patches courtesy of Philippe Troin, the Debian maintainer
+of bzip2:
+
+* Wrapper scripts (with manpages): bzdiff, bzgrep, bzmore.
+
+* Spelling changes and minor enhancements in bzip2.1.
+
+* Avoid race condition between creating the output file and setting its
+  interim permissions safely, by using fopen_output_safely().
+  No changes to bzip2recover since there is no issue with file
+  permissions there.
+
+* do not print senseless report with -v when compressing an empty
+  file.
+
+* bzcat -f works on non-bzip2 files.
+
+* do not try to escape shell meta-characters on unix (the shell takes
+  care of these).
+
+* added --fast and --best aliases for -1 -9 for gzip compatibility.
+
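Note on the 1007 entry above: the changelog and the blocksort.c comment both describe the trigger as a long run (roughly 48.5 million bytes) of byte 251. A minimal sketch for producing such an input follows. The name make_run251 is hypothetical; this is not the mk251.c that the Makefile adds to the tarball, and it assumes a head that accepts -c and a readable /dev/zero.

```shell
#!/bin/sh
# Hypothetical helper, not part of the bzip2 distribution:
# emit COUNT copies of byte 251 (octal 373) on stdout.
make_run251() {
    # Take COUNT zero bytes, then map each NUL byte to byte 251.
    # LC_ALL=C keeps tr working on single bytes rather than characters.
    head -c "$1" /dev/zero | LC_ALL=C tr '\000' '\373'
}
```

Something like `make_run251 48500000 | ./bzip2 > /dev/null` would then feed the run to a freshly built bzip2. The diff does not say which block size is needed to reach the assertion, so treat the invocation as a starting point only.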
diff --git a/LICENSE b/LICENSE

index 88fa6d88a4865fee7e88a8749c93812962882483..9d4fa4379089f1275c8ae3d7fc235a6463095441 100644
--- a/LICENSE
+++ b/LICENSE
@@ -1,6 +1,6 @@
  
  This program, "bzip2" and associated library "libbzip2", are
-copyright (C) 1996-2000 Julian R Seward.  All rights reserved.
+copyright (C) 1996-2002 Julian R Seward.  All rights reserved.
  
  Redistribution and use in source and binary forms, with or without
  modification, are permitted provided that the following conditions
@@ -35,5 +35,5 @@ SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
  
  Julian Seward, Cambridge, UK.
  jseward@acm.org
-bzip2/libbzip2 version 1.0 of 21 March 2000
+bzip2/libbzip2 version 1.0.2 of 30 December 2001
  
diff --git a/Makefile b/Makefile

index ab17f497957525afe10b259b01c16d155daa78da..8305235fe24c30ce638efc984094ff3263f53a6d 100644
--- a/Makefile
+++ b/Makefile
@@ -1,9 +1,20 @@
  
  SHELL=/bin/sh
+
+# To assist in cross-compiling
  CC=gcc
+AR=ar
+RANLIB=ranlib
+LDFLAGS=
+
+# Suitably paranoid flags to avoid bugs in gcc-2.7
  BIGFILES=-D_FILE_OFFSET_BITS=64
  CFLAGS=-Wall -Winline -O2 -fomit-frame-pointer -fno-strength-reduce $(BIGFILES)
  
+# Where you want it installed when you do 'make install'
+PREFIX=/usr
+
+
  OBJS= blocksort.o  \
        huffman.o    \
        crctable.o   \
@@ -15,20 +26,21 @@ OBJS= blocksort.o  \
  all: libbz2.a bzip2 bzip2recover test
  
  bzip2: libbz2.a bzip2.o
-       $(CC) $(CFLAGS) -o bzip2 bzip2.o -L. -lbz2
+       $(CC) $(CFLAGS) $(LDFLAGS) -o bzip2 bzip2.o -L. -lbz2
  
  bzip2recover: bzip2recover.o
-       $(CC) $(CFLAGS) -o bzip2recover bzip2recover.o
+       $(CC) $(CFLAGS) $(LDFLAGS) -o bzip2recover bzip2recover.o
  
  libbz2.a: $(OBJS)
         rm -f libbz2.a
-       ar cq libbz2.a $(OBJS)
-       @if ( test -f /usr/bin/ranlib -o -f /bin/ranlib -o \
-               -f /usr/ccs/bin/ranlib ) ; then \
-               echo ranlib libbz2.a ; \
-               ranlib libbz2.a ; \
+       $(AR) cq libbz2.a $(OBJS)
+       @if ( test -f $(RANLIB) -o -f /usr/bin/ranlib -o \
+               -f /bin/ranlib -o -f /usr/ccs/bin/ranlib ) ; then \
+               echo $(RANLIB) libbz2.a ; \
+               $(RANLIB) libbz2.a ; \
         fi
  
+check: test
  test: bzip2
         @cat words1
         ./bzip2 -1  < sample1.ref > sample1.rb2
@@ -45,14 +57,12 @@ test: bzip2
         cmp sample3.tst sample3.ref
         @cat words3
  
-PREFIX=/usr
-
  install: bzip2 bzip2recover
-       if ( test ! -d $(PREFIX)/bin ) ; then mkdir $(PREFIX)/bin ; fi
-       if ( test ! -d $(PREFIX)/lib ) ; then mkdir $(PREFIX)/lib ; fi
-       if ( test ! -d $(PREFIX)/man ) ; then mkdir $(PREFIX)/man ; fi
-       if ( test ! -d $(PREFIX)/man/man1 ) ; then mkdir $(PREFIX)/man/man1 ; fi
-       if ( test ! -d $(PREFIX)/include ) ; then mkdir $(PREFIX)/include ; fi
+       if ( test ! -d $(PREFIX)/bin ) ; then mkdir -p $(PREFIX)/bin ; fi
+       if ( test ! -d $(PREFIX)/lib ) ; then mkdir -p $(PREFIX)/lib ; fi
+       if ( test ! -d $(PREFIX)/man ) ; then mkdir -p $(PREFIX)/man ; fi
+       if ( test ! -d $(PREFIX)/man/man1 ) ; then mkdir -p $(PREFIX)/man/man1 ; fi
+       if ( test ! -d $(PREFIX)/include ) ; then mkdir -p $(PREFIX)/include ; fi
         cp -f bzip2 $(PREFIX)/bin/bzip2
         cp -f bzip2 $(PREFIX)/bin/bunzip2
         cp -f bzip2 $(PREFIX)/bin/bzcat
@@ -67,7 +77,26 @@ install: bzip2 bzip2recover
         chmod a+r $(PREFIX)/include/bzlib.h
         cp -f libbz2.a $(PREFIX)/lib
         chmod a+r $(PREFIX)/lib/libbz2.a
+       cp -f bzgrep $(PREFIX)/bin/bzgrep
+       ln $(PREFIX)/bin/bzgrep $(PREFIX)/bin/bzegrep
+       ln $(PREFIX)/bin/bzgrep $(PREFIX)/bin/bzfgrep
+       chmod a+x $(PREFIX)/bin/bzgrep
+       cp -f bzmore $(PREFIX)/bin/bzmore
+       ln $(PREFIX)/bin/bzmore $(PREFIX)/bin/bzless
+       chmod a+x $(PREFIX)/bin/bzmore
+       cp -f bzdiff $(PREFIX)/bin/bzdiff
+       ln $(PREFIX)/bin/bzdiff $(PREFIX)/bin/bzcmp
+       chmod a+x $(PREFIX)/bin/bzdiff
+       cp -f bzgrep.1 bzmore.1 bzdiff.1 $(PREFIX)/man/man1
+       chmod a+r $(PREFIX)/man/man1/bzgrep.1
+       chmod a+r $(PREFIX)/man/man1/bzmore.1
+       chmod a+r $(PREFIX)/man/man1/bzdiff.1
+       echo ".so man1/bzgrep.1" > $(PREFIX)/man/man1/bzegrep.1
+       echo ".so man1/bzgrep.1" > $(PREFIX)/man/man1/bzfgrep.1
+       echo ".so man1/bzmore.1" > $(PREFIX)/man/man1/bzless.1
+       echo ".so man1/bzdiff.1" > $(PREFIX)/man/man1/bzcmp.1
  
+distclean: clean
  clean: 
         rm -f *.o libbz2.a bzip2 bzip2recover \
         sample1.rb2 sample2.rb2 sample3.rb2 \
@@ -93,7 +122,7 @@ bzip2.o: bzip2.c
  bzip2recover.o: bzip2recover.c
         $(CC) $(CFLAGS) -c bzip2recover.c
  
-DISTNAME=bzip2-1.0.1
+DISTNAME=bzip2-1.0.2
  tarfile:
         rm -f $(DISTNAME)
         ln -sf . $(DISTNAME)
@@ -112,6 +141,7 @@ tarfile:
            $(DISTNAME)/Makefile \
            $(DISTNAME)/manual.texi \
            $(DISTNAME)/manual.ps \
+          $(DISTNAME)/manual.pdf \
            $(DISTNAME)/LICENSE \
            $(DISTNAME)/bzip2.1 \
            $(DISTNAME)/bzip2.1.preformatted \
@@ -138,4 +168,25 @@ tarfile:
            $(DISTNAME)/Y2K_INFO \
            $(DISTNAME)/unzcrash.c \
            $(DISTNAME)/spewG.c \
+          $(DISTNAME)/mk251.c \
+          $(DISTNAME)/bzdiff \
+          $(DISTNAME)/bzdiff.1 \
+          $(DISTNAME)/bzmore \
+          $(DISTNAME)/bzmore.1 \
+          $(DISTNAME)/bzgrep \
+          $(DISTNAME)/bzgrep.1 \
            $(DISTNAME)/Makefile-libbz2_so
+       gzip -v $(DISTNAME).tar
+
+# For rebuilding the manual from sources on my RedHat 7.2 box
+manual: manual.ps manual.pdf manual.html
+
+manual.ps: manual.texi
+       tex manual.texi
+       dvips -o manual.ps manual.dvi
+
+manual.pdf: manual.ps
+       ps2pdf manual.ps
+
+manual.html: manual.texi
+       texi2html -split_chapter manual.texi
diff --git a/Makefile-libbz2_so b/Makefile-libbz2_so

index a347c50e9b28cfa83e4a33a26bd379b4b734d54c..4986fe2ad8a7a282c88f9aafecd2a369f73f9a96 100644
--- a/Makefile-libbz2_so
+++ b/Makefile-libbz2_so
@@ -1,8 +1,9 @@
  
  # This Makefile builds a shared version of the library, 
-# libbz2.so.1.0.1, with soname libbz2.so.1.0,
-# at least on x86-Linux (RedHat 5.2), 
-# with gcc-2.7.2.3.  Please see the README file for some 
+# libbz2.so.1.0.2, with soname libbz2.so.1.0,
+# at least on x86-Linux (RedHat 7.2), 
+# with gcc-2.96 20000731 (Red Hat Linux 7.1 2.96-98).  
+# Please see the README file for some 
  # important info about building the library like this.
  
  SHELL=/bin/sh
@@ -19,13 +20,13 @@ OBJS= blocksort.o  \
        bzlib.o
  
  all: $(OBJS)
-       $(CC) -shared -Wl,-soname -Wl,libbz2.so.1.0 -o libbz2.so.1.0.1 $(OBJS)
-       $(CC) $(CFLAGS) -o bzip2-shared bzip2.c libbz2.so.1.0.1
+       $(CC) -shared -Wl,-soname -Wl,libbz2.so.1.0 -o libbz2.so.1.0.2 $(OBJS)
+       $(CC) $(CFLAGS) -o bzip2-shared bzip2.c libbz2.so.1.0.2
         rm -f libbz2.so.1.0
-       ln -s libbz2.so.1.0.1 libbz2.so.1.0
+       ln -s libbz2.so.1.0.2 libbz2.so.1.0
  
  clean: 
-       rm -f $(OBJS) bzip2.o libbz2.so.1.0.1 libbz2.so.1.0 bzip2-shared
+       rm -f $(OBJS) bzip2.o libbz2.so.1.0.2 libbz2.so.1.0 bzip2-shared
  
  blocksort.o: blocksort.c
         $(CC) $(CFLAGS) -c blocksort.c
diff --git a/README b/README

index 22945a256c78009ccac152f2fda9a3e6562d84d5..07505d8f3d6883bbce25519c3b6d545a30882fc9 100644
--- a/README
+++ b/README
@@ -1,15 +1,15 @@
  
  This is the README for bzip2, a block-sorting file compressor, version
-1.0.  This version is fully compatible with the previous public
-releases, bzip2-0.1pl2, bzip2-0.9.0 and bzip2-0.9.5.
+1.0.2.  This version is fully compatible with the previous public
+releases, versions 0.1pl2, 0.9.0, 0.9.5, 1.0.0 and 1.0.1.
  
-bzip2-1.0 is distributed under a BSD-style license.  For details,
+bzip2-1.0.2 is distributed under a BSD-style license.  For details,
  see the file LICENSE.
  
-Complete documentation is available in Postscript form (manual.ps) or
-html (manual_toc.html).  A plain-text version of the manual page is
-available as bzip2.txt.  A statement about Y2K issues is now included
-in the file Y2K_INFO.
+Complete documentation is available in Postscript form (manual.ps),
+PDF (manual.pdf, amazingly enough) or html (manual_toc.html).  A
+plain-text version of the manual page is available as bzip2.txt.  
+A statement about Y2K issues is now included in the file Y2K_INFO.
  
  
  HOW TO BUILD -- UNIX
@@ -33,34 +33,41 @@ not actually execute them.
  HOW TO BUILD -- UNIX, shared library libbz2.so.
  
  Do 'make -f Makefile-libbz2_so'.  This Makefile seems to work for
-Linux-ELF (RedHat 5.2 on an x86 box), with gcc.  I make no claims
+Linux-ELF (RedHat 7.2 on an x86 box), with gcc.  I make no claims
  that it works for any other platform, though I suspect it probably
  will work for most platforms employing both ELF and gcc.
  
-bzip2-shared, a client of the shared library, is also build, but
-not self-tested.  So I suggest you also build using the normal
-Makefile, since that conducts a self-test.
+bzip2-shared, a client of the shared library, is also built, but not
+self-tested.  So I suggest you also build using the normal Makefile,
+since that conducts a self-test.  A second reason to prefer the
+version statically linked to the library is that, on x86 platforms,
+building shared objects makes a valuable register (%ebx) unavailable
+to gcc, resulting in a slowdown of 10%-20%, at least for bzip2.
  
-Important note for people upgrading .so's from 0.9.0/0.9.5 to
-version 1.0.  All the functions in the library have been renamed,
-from (eg) bzCompress to BZ2_bzCompress, to avoid namespace pollution.
+Important note for people upgrading .so's from 0.9.0/0.9.5 to version
+1.0.X.  All the functions in the library have been renamed, from (eg)
+bzCompress to BZ2_bzCompress, to avoid namespace pollution.
  Unfortunately this means that the libbz2.so created by
-Makefile-libbz2_so will not work with any program which used an
-older version of the library.  Sorry.  I do encourage library
-clients to make the effort to upgrade to use version 1.0, since
-it is both faster and more robust than previous versions.
+Makefile-libbz2_so will not work with any program which used an older
+version of the library.  Sorry.  I do encourage library clients to
+make the effort to upgrade to use version 1.0, since it is both faster
+and more robust than previous versions.
  
  
  HOW TO BUILD -- Windows 95, NT, DOS, Mac, etc.
  
  It's difficult for me to support compilation on all these platforms.
  My approach is to collect binaries for these platforms, and put them
-on the master web page (http://sourceware.cygnus.com/bzip2).  Look
-there.  However (FWIW), bzip2-1.0 is very standard ANSI C and should
-compile unmodified with MS Visual C.  For Win32, there is one
-important caveat: in bzip2.c, you must set BZ_UNIX to 0 and
-BZ_LCCWIN32 to 1 before building.  If you have difficulties building,
-you might want to read README.COMPILATION.PROBLEMS.
+on the master web page (http://sources.redhat.com/bzip2).  Look there.
+However (FWIW), bzip2-1.0.X is very standard ANSI C and should compile
+unmodified with MS Visual C.  If you have difficulties building, you
+might want to read README.COMPILATION.PROBLEMS.
+
+At least using MS Visual C++ 6, you can build from the unmodified
+sources by issuing, in a command shell: 
+   nmake -f makefile.msc
+(you may need to first run the MSVC-provided script VCVARS32.BAT
+ so as to set up paths to the MSVC tools correctly).
  
  
  VALIDATION
@@ -138,29 +145,37 @@ WHAT'S NEW IN 0.9.5 ?
     * Many small improvements in file and flag handling.
     * A Y2K statement.
  
-WHAT'S NEW IN 1.0
+WHAT'S NEW IN 1.0.0 ?
  
     See the CHANGES file.
  
+WHAT'S NEW IN 1.0.2 ?
+
+   See the CHANGES file.
+
+
  I hope you find bzip2 useful.  Feel free to contact me at
     jseward@acm.org
  if you have any suggestions or queries.  Many people mailed me with
  comments, suggestions and patches after the releases of bzip-0.15,
-bzip-0.21, bzip2-0.1pl2 and bzip2-0.9.0, and the changes in bzip2 are
-largely a result of this feedback.  I thank you for your comments.
+bzip-0.21, and bzip2 versions 0.1pl2, 0.9.0, 0.9.5, 1.0.0 and 1.0.1,
+and the changes in bzip2 are largely a result of this feedback.
+I thank you for your comments.
  
  At least for the time being, bzip2's "home" is (or can be reached via)
-http://www.muraroa.demon.co.uk.
+http://sources.redhat.com/bzip2.
  
  Julian Seward
  jseward@acm.org
  
-Cambridge, UK
-18   July 1996 (version 0.15)
-25 August 1996 (version 0.21)
- 7 August 1997 (bzip2, version 0.1)
-29 August 1997 (bzip2, version 0.1pl2)
-23 August 1998 (bzip2, version 0.9.0)
- 8   June 1999 (bzip2, version 0.9.5)
- 4   Sept 1999 (bzip2, version 0.9.5d)
- 5    May 2000 (bzip2, version 1.0pre8)
+Cambridge, UK (and what a great town this is!)
+
+18     July 1996 (version 0.15)
+25   August 1996 (version 0.21)
+ 7   August 1997 (bzip2, version 0.1)
+29   August 1997 (bzip2, version 0.1pl2)
+23   August 1998 (bzip2, version 0.9.0)
+ 8     June 1999 (bzip2, version 0.9.5)
+ 4     Sept 1999 (bzip2, version 0.9.5d)
+ 5      May 2000 (bzip2, version 1.0pre8)
+30 December 2001 (bzip2, version 1.0.2pre1)
+\ No newline at end of file
diff --git a/README.COMPILATION.PROBLEMS b/README.COMPILATION.PROBLEMS

index d621ad59756c87ad2bcc4f78f01ce142d7fbc512..bd1822dffbd0933231b58433d1b2b62c7bffec8c 100644
--- a/README.COMPILATION.PROBLEMS
+++ b/README.COMPILATION.PROBLEMS
@@ -117,11 +117,11 @@ Known problems as of 1.0pre8:
    All that said: you might be able to get somewhere
    by finding the line in Makefile-libbz2_so which says
  
-  $(CC) -shared -Wl,-soname -Wl,libbz2.so.1.0 -o libbz2.so.1.0.1 $(OBJS)
+  $(CC) -shared -Wl,-soname -Wl,libbz2.so.1.0 -o libbz2.so.1.0.2 $(OBJS)
  
    and replacing with 
  
-  ($CC) -G -shared -o libbz2.so.1.0.1 -h libbz2.so.1.0 $(OBJS)
+  $(CC) -G -shared -o libbz2.so.1.0.2 -h libbz2.so.1.0 $(OBJS)
    
    If gcc objects to the combination -fpic -fPIC, get rid of
    the second one, leaving just "-fpic".
diff --git a/blocksort.c b/blocksort.c

index ec426725b1e2cab47d2f43b928f70b3bd9340fdc..aba3efcd3121779391abe933579115f2e78ddab5 100644
--- a/blocksort.c
+++ b/blocksort.c
@@ -8,7 +8,7 @@
    This file is a part of bzip2 and/or libbzip2, a program and
    library for lossless, block-sorting data compression.
  
-  Copyright (C) 1996-2000 Julian R Seward.  All rights reserved.
+  Copyright (C) 1996-2002 Julian R Seward.  All rights reserved.
  
    Redistribution and use in source and binary forms, with or without
    modification, are permitted provided that the following conditions
@@ -981,7 +981,14 @@ void mainSort ( UInt32* ptr,
           }
        }
  
-      AssertH ( copyStart[ss]-1 == copyEnd[ss], 1007 );
+      AssertH ( (copyStart[ss]-1 == copyEnd[ss])
+                || 
+                /* Extremely rare case missing in bzip2-1.0.0 and 1.0.1.
+                   Necessity for this case is demonstrated by compressing 
+                   a sequence of approximately 48.5 million of character 
+                   251; 1.0.0/1.0.1 will then die here. */
+                (copyStart[ss] == 0 && copyEnd[ss] == nblock-1),
+                1007 )
  
        for (j = 0; j <= 255; j++) ftab[(j << 8) + ss] |= SETMASK;
  
diff --git a/bzdiff b/bzdiff

new file mode 100644
index 0000000..3c2eb85
--- /dev/null
+++ b/bzdiff
@@ -0,0 +1,76 @@
+#!/bin/sh
+# sh is buggy on RS/6000 AIX 3.2. Replace above line with #!/bin/ksh
+
+# Bzcmp/diff wrapped for bzip2, 
+# adapted from zdiff by Philippe Troin <phil@fifi.org> for Debian GNU/Linux.
+
+# Bzcmp and bzdiff are used to invoke the cmp or the  diff  pro-
+# gram  on compressed files.  All options specified are passed
+# directly to cmp or diff.  If only 1 file is specified,  then
+# the  files  compared  are file1 and an uncompressed file1.gz.
+# If two files are specified, then they are  uncompressed  (if
+# necessary) and fed to cmp or diff.  The exit status from cmp
+# or diff is preserved.
+
+PATH="/usr/bin:$PATH"; export PATH
+prog=`echo $0 | sed 's|.*/||'`
+case "$prog" in
+  *cmp) comp=${CMP-cmp}   ;;
+  *)    comp=${DIFF-diff} ;;
+esac
+
+OPTIONS=
+FILES=
+for ARG
+do
+    case "$ARG" in
+    -*)        OPTIONS="$OPTIONS $ARG";;
+     *)        if test -f "$ARG"; then
+            FILES="$FILES $ARG"
+        else
+            echo "${prog}: $ARG not found or not a regular file"
+           exit 1
+        fi ;;
+    esac
+done
+if test -z "$FILES"; then
+       echo "Usage: $prog [${comp}_options] file [file]"
+       exit 1
+fi
+tmp=`tempfile -d /tmp -p bz` || {
+      echo 'cannot create a temporary file' >&2
+      exit 1
+}
+set $FILES
+if test $# -eq 1; then
+       FILE=`echo "$1" | sed 's/.bz2$//'`
+       bzip2 -cd "$FILE.bz2" | $comp $OPTIONS - "$FILE"
+       STAT="$?"
+
+elif test $# -eq 2; then
+       case "$1" in
+        *.bz2)
+                case "$2" in
+               *.bz2)
+                       F=`echo "$2" | sed 's|.*/||;s|.bz2$||'`
+                        bzip2 -cdfq "$2" > $tmp
+                        bzip2 -cdfq "$1" | $comp $OPTIONS - $tmp
+                        STAT="$?"
+                       /bin/rm -f $tmp;;
+
+                *)      bzip2 -cdfq "$1" | $comp $OPTIONS - "$2"
+                        STAT="$?";;
+                esac;;
+        *)      case "$2" in
+               *.bz2)
+                        bzip2 -cdfq "$2" | $comp $OPTIONS "$1" -
+                        STAT="$?";;
+                *)      $comp $OPTIONS "$1" "$2"
+                        STAT="$?";;
+                esac;;
+       esac
+        exit "$STAT"
+else
+       echo "Usage: $prog [${comp}_options] file [file]"
+       exit 1
+fi
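Note on the bzdiff script above: one file serves as both bzcmp and bzdiff by looking at the name it was invoked under, with the CMP and DIFF environment variables as overrides. The selection step can be sketched in isolation as follows. The function name pick_tool is hypothetical; the script itself inlines this logic and stores the result in comp.

```shell
#!/bin/sh
# Hypothetical wrapper around the script's name-dispatch logic.
# $1 stands in for $0, the path the script was invoked as.
pick_tool() {
    # Strip any leading directory components, as the script does.
    prog=`echo "$1" | sed 's|.*/||'`
    case "$prog" in
      *cmp) echo "${CMP-cmp}"   ;;  # names ending in "cmp" select cmp
      *)    echo "${DIFF-diff}" ;;  # anything else selects diff
    esac
}
```

This is why the install rule in the Makefile only needs a hard link from bzdiff to bzcmp rather than a second script.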
diff --git a/bzdiff.1 b/bzdiff.1

new file mode 100644
index 0000000..adb7a8e
--- /dev/null
+++ b/bzdiff.1
@@ -0,0 +1,47 @@
+\"Shamelessly copied from zmore.1 by Philippe Troin <phil@fifi.org>
+\"for Debian GNU/Linux
+.TH BZDIFF 1
+.SH NAME
+bzcmp, bzdiff \- compare bzip2 compressed files
+.SH SYNOPSIS
+.B bzcmp
+[ cmp_options ] file1
+[ file2 ]
+.br
+.B bzdiff
+[ diff_options ] file1
+[ file2 ]
+.SH DESCRIPTION
+.I  Bzcmp
+and 
+.I bzdiff
+are used to invoke the
+.I cmp
+or the
+.I diff
+program on bzip2 compressed files.  All options specified are passed
+directly to
+.I cmp
+or
+.IR diff "."
+If only 1 file is specified, then the files compared are
+.I file1
+and an uncompressed
+.IR file1 ".bz2."
+If two files are specified, then they are uncompressed if necessary and fed to
+.I cmp
+or
+.IR diff "."
+The exit status from 
+.I cmp
+or
+.I diff
+is preserved.
+.SH "SEE ALSO"
+cmp(1), diff(1), bzmore(1), bzless(1), bzgrep(1), bzip2(1)
+.SH BUGS
+Messages from the
+.I cmp
+or
+.I diff
+programs refer to temporary filenames instead of those specified.
diff --git a/bzgrep b/bzgrep

new file mode 100644
index 0000000..dbfc00e
--- /dev/null
+++ b/bzgrep
@@ -0,0 +1,71 @@
+#!/bin/sh
+
+# Bzgrep wrapped for bzip2, 
+# adapted from zgrep by Philippe Troin <phil@fifi.org> for Debian GNU/Linux.
+## zgrep notice:
+## zgrep -- a wrapper around a grep program that decompresses files as needed
+## Adapted from a version sent by Charles Levert <charles@comm.polymtl.ca>
+
+PATH="/usr/bin:$PATH"; export PATH
+
+prog=`echo $0 | sed 's|.*/||'`
+case "$prog" in
+       *egrep) grep=${EGREP-egrep}     ;;
+       *fgrep) grep=${FGREP-fgrep}     ;;
+       *)      grep=${GREP-grep}       ;;
+esac
+pat=""
+while test $# -ne 0; do
+  case "$1" in
+  -e | -f) opt="$opt $1"; shift; pat="$1"
+           if test "$grep" = grep; then  # grep is buggy with -e on SVR4
+             grep=egrep
+           fi;;
+  -A | -B) opt="$opt $1 $2"; shift;;
+  -*)     opt="$opt $1";;
+   *)      if test -z "$pat"; then
+            pat="$1"
+          else
+            break;
+           fi;;
+  esac
+  shift
+done
+
+if test -z "$pat"; then
+  echo "grep through bzip2 files"
+  echo "usage: $prog [grep_options] pattern [files]"
+  exit 1
+fi
+
+list=0
+silent=0
+op=`echo "$opt" | sed -e 's/ //g' -e 's/-//g'`
+case "$op" in
+  *l*) list=1
+esac
+case "$op" in
+  *h*) silent=1
+esac
+
+if test $# -eq 0; then
+  bzip2 -cdfq | $grep $opt "$pat"
+  exit $?
+fi
+
+res=0
+for i do
+  if test -f "$i"; then :; else if test -f "$i.bz2"; then i="$i.bz2"; fi; fi
+  if test $list -eq 1; then
+    bzip2 -cdfq "$i" | $grep $opt "$pat" 2>&1 > /dev/null && echo $i
+    r=$?
+  elif test $# -eq 1 -o $silent -eq 1; then
+    bzip2 -cdfq "$i" | $grep $opt "$pat"
+    r=$?
+  else
+    bzip2 -cdfq "$i" | $grep $opt "$pat" | sed "s|^|${i}:|"
+    r=$?
+  fi
+  test "$r" -ne 0 && res="$r"
+done
+exit $res
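Note on the bzgrep script above: it decides whether to list file names (-l) or suppress the file-name prefix (-h) by flattening the collected options into a single string and matching on individual letters. A minimal sketch of that step follows. The function name flag_state is hypothetical, and its argument stands in for the script's $opt variable.

```shell
#!/bin/sh
# Hypothetical wrapper around the script's option-scanning logic.
# $1 is the accumulated option string, e.g. " -i -l".
flag_state() {
    list=0
    silent=0
    # Drop spaces and dashes so only the option letters remain.
    op=`echo "$1" | sed -e 's/ //g' -e 's/-//g'`
    case "$op" in *l*) list=1 ;; esac
    case "$op" in *h*) silent=1 ;; esac
    echo "list=$list silent=$silent"
}
```

Because the match is on letters anywhere in the flattened string, any option text that happens to contain "l" or "h" would set the corresponding flag in this sketch as well.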
diff --git a/bzgrep.1 b/bzgrep.1

new file mode 100644
index 0000000..930af8c
--- /dev/null
+++ b/bzgrep.1
@@ -0,0 +1,56 @@
+\"Shamelessly copied from zmore.1 by Philippe Troin <phil@fifi.org>
+\"for Debian GNU/Linux
+.TH BZGREP 1
+.SH NAME
+bzgrep, bzfgrep, bzegrep \- search possibly bzip2 compressed files for a regular expression
+.SH SYNOPSIS
+.B bzgrep
+[ grep_options ]
+.BI  [\ -e\ ] " pattern"
+.IR filename ".\|.\|."
+.br
+.B bzegrep
+[ egrep_options ]
+.BI  [\ -e\ ] " pattern"
+.IR filename ".\|.\|."
+.br
+.B bzfgrep
+[ fgrep_options ]
+.BI  [\ -e\ ] " pattern"
+.IR filename ".\|.\|."
+.SH DESCRIPTION
+.IR  Bzgrep
+is used to invoke the
+.I grep
+on bzip2-compressed files. All options specified are passed directly to
+.I grep.
+If no file is specified, then the standard input is decompressed
+if necessary and fed to grep.
+Otherwise the given files are uncompressed if necessary and fed to
+.I grep.
+.PP
+If
+.I bzgrep
+is invoked as
+.I bzegrep
+or
+.I bzfgrep
+then
+.I egrep
+or
+.I fgrep
+is used instead of
+.I grep.
+If the GREP environment variable is set,
+.I bzgrep
+uses it as the
+.I grep
+program to be invoked. For example:
+
+    for sh:  GREP=fgrep  bzgrep string files
+    for csh: (setenv GREP fgrep; bzgrep string files)
+.SH AUTHOR
+Charles Levert (charles@comm.polymtl.ca). Adapted to bzip2 by Philippe
+Troin <phil@fifi.org> for Debian GNU/Linux.
+.SH "SEE ALSO"
+grep(1), egrep(1), fgrep(1), bzdiff(1), bzmore(1), bzless(1), bzip2(1)
diff --git a/bzip2.1 b/bzip2.1

index 7de54a0118c3a4c3db1aeb34bb926565a74c2828..623435c24277c6d6b5c51653ea64b7936a34d903 100644
--- a/bzip2.1
+++ b/bzip2.1
@@ -1,7 +1,7 @@
  .PU
  .TH bzip2 1
  .SH NAME
-bzip2, bunzip2 \- a block-sorting file compressor, v1.0
+bzip2, bunzip2 \- a block-sorting file compressor, v1.0.2
  .br
  bzcat \- decompresses files to stdout
  .br
@@ -197,7 +197,7 @@ to decompress.
  .TP
  .B \-z --compress
  The complement to \-d: forces compression, regardless of the
-invokation name.
+invocation name.
  .TP
  .B \-t --test
  Check integrity of the specified file(s), but don't decompress them.
@@ -211,6 +211,10 @@ existing output files.  Also forces
  .I bzip2 
  to break hard links
  to files, which it otherwise wouldn't do.
+
+bzip2 normally declines to decompress files which don't have the
+correct magic header bytes.  If forced (-f), however, it will pass
+such files through unmodified.  This is how GNU gzip behaves.
  .TP
  .B \-k --keep
  Keep (don't delete) input files during compression
@@ -239,9 +243,13 @@ information which is primarily of interest for diagnostic purposes.
  .B \-L --license -V --version
  Display the software version, license terms and conditions.
  .TP
-.B \-1 to \-9
+.B \-1 (or \-\-fast) to \-9 (or \-\-best)
  Set the block size to 100 k, 200 k ..  900 k when compressing.  Has no
  effect when decompressing.  See MEMORY MANAGEMENT below.
+The \-\-fast and \-\-best aliases are primarily for GNU gzip 
+compatibility.  In particular, \-\-fast doesn't make things
+significantly faster.  
+And \-\-best merely selects the default behaviour.
  .TP
  .B \--
  Treats all subsequent arguments as file names, even if they start
@@ -352,11 +360,11 @@ undamaged.
  
  .I bzip2recover
  takes a single argument, the name of the damaged file, 
-and writes a number of files "rec0001file.bz2",
-"rec0002file.bz2", etc, containing the  extracted  blocks.
+and writes a number of files "rec00001file.bz2",
+"rec00002file.bz2", etc, containing the  extracted  blocks.
  The  output  filenames  are  designed  so  that the use of
  wildcards in subsequent processing -- for example,  
-"bzip2 -dc  rec*file.bz2 > recovered_data" -- lists the files in
+"bzip2 -dc  rec*file.bz2 > recovered_data" -- processes the files in
  the correct order.
  
  .I bzip2recover
@@ -397,27 +405,31 @@ I/O error messages are not as helpful as they could be.
  tries hard to detect I/O errors and exit cleanly, but the details of
  what the problem is sometimes seem rather misleading.
  
-This manual page pertains to version 1.0 of
+This manual page pertains to version 1.0.2 of
  .I bzip2.  
-Compressed
-data created by this version is entirely forwards and backwards
-compatible with the previous public releases, versions 0.1pl2, 0.9.0
-and 0.9.5,
-but with the following exception: 0.9.0 and above can correctly
-decompress multiple concatenated compressed files.  0.1pl2 cannot do
-this; it will stop after decompressing just the first file in the
-stream.
+Compressed data created by this version is entirely forwards and
+backwards compatible with the previous public releases, versions
+0.1pl2, 0.9.0, 0.9.5, 1.0.0 and 1.0.1, but with the following
+exception: 0.9.0 and above can correctly decompress multiple
+concatenated compressed files.  0.1pl2 cannot do this; it will stop
+after decompressing just the first file in the stream.
  
  .I bzip2recover
-uses 32-bit integers to represent bit positions in
-compressed files, so it cannot handle compressed files more than 512
-megabytes long.  This could easily be fixed.
+versions prior to this one, 1.0.2, used 32-bit integers to represent
+bit positions in compressed files, so it could not handle compressed
+files more than 512 megabytes long.  Version 1.0.2 and above uses
+64-bit ints on some platforms which support them (GNU supported
+targets, and Windows).  To establish whether or not bzip2recover was
+built with such a limitation, run it without arguments.  In any event
+you can build yourself an unlimited version if you can recompile it
+with MaybeUInt64 set to be an unsigned 64-bit integer.
+
+
  
  .SH AUTHOR
  Julian Seward, jseward@acm.org.
  
-http://sourceware.cygnus.com/bzip2
-http://www.muraroa.demon.co.uk
+http://sources.redhat.com/bzip2
  
  The ideas embodied in
  .I bzip2
@@ -434,6 +446,8 @@ indebted for their help, support and advice.  See the manual in the
  source distribution for pointers to sources of documentation.  Christian
  von Roques encouraged me to look for faster sorting algorithms, so as to
  speed up compression.  Bela Lubkin encouraged me to improve the
-worst-case compression performance.  Many people sent patches, helped
+worst-case compression performance.  
+The bz* scripts are derived from those of GNU gzip.
+Many people sent patches, helped
  with portability problems, lent machines, gave advice and were generally
  helpful.
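Note on the bzip2recover limit described in the man page diff above: the 512 megabyte figure follows from using 32-bit integers for bit positions. There are 2^32 addressable bits, which is 2^29 bytes, or 512 MiB. A small sketch of the arithmetic follows. The function name max_mib is hypothetical, and "megabyte" is assumed here to mean 2^20 bytes.

```shell
#!/bin/sh
# Hypothetical helper: largest file size, in MiB, whose bit positions
# fit in an unsigned integer that is BITS wide.
max_mib() {
    # 2^BITS bits / 8 bits per byte / 2^20 bytes per MiB = 2^(BITS - 23)
    echo $(( 1 << ($1 - 23) ))
}
```

With that convention, `max_mib 32` gives the 512 quoted in the text.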
diff --git a/bzip2.1.preformatted b/bzip2.1.preformatted

index 9f18339e926873b275fd1b16622d1cc0887b9873..0f20cb5a2b25f6f115eb264a260774e6e0a7ace9 100644
--- a/bzip2.1.preformatted
+++ b/bzip2.1.preformatted
@@ -1,11 +1,9 @@
-
-
-
  bzip2(1)                                                 bzip2(1)
  
  
+
  N\bNA\bAM\bME\bE
-       bzip2, bunzip2 - a block-sorting file compressor, v1.0
+       bzip2, bunzip2 - a block-sorting file compressor, v1.0.2
         bzcat - decompresses files to stdout
         bzip2recover - recovers data from damaged bzip2 files
  
@@ -22,20 +20,20 @@ D\bDE\bES\bSC\bCR\bRI\bIP\bPT\bTI\bIO\bON\bN
         sorting text compression algorithm,  and  Huffman  coding.
         Compression  is  generally  considerably  better than that
         achieved by more conventional LZ77/LZ78-based compressors,
-       and  approaches  the performance of the PPM family of sta-
+       and  approaches  the performance of the PPM family of sta
         tistical compressors.
  
         The command-line options are deliberately very similar  to
         those of _\bG_\bN_\bU _\bg_\bz_\bi_\bp_\b, but they are not identical.
  
-       _\bb_\bz_\bi_\bp_\b2  expects  a list of file names to accompany the com-
+       _\bb_\bz_\bi_\bp_\b2  expects  a list of file names to accompany the com
         mand-line flags.  Each file is replaced  by  a  compressed
         version  of  itself,  with  the  name "original_name.bz2".
-       Each compressed file has the same modification date,  per-
-       missions, and, when possible, ownership as the correspond-
+       Each compressed file has the same modification date,  per
+       missions, and, when possible, ownership as the correspond
         ing original, so that these properties  can  be  correctly
         restored  at  decompression  time.   File name handling is
-       naive in the sense that there is no mechanism for preserv-
+       naive in the sense that there is no mechanism for preserv
         ing  original file names, permissions, ownerships or dates
         in filesystems which lack these concepts, or have  serious
         file name length restrictions, such as MS-DOS.
@@ -58,18 +56,6 @@ D\bDE\bES\bSC\bCR\bRI\bIP\bPT\bTI\bIO\bON\bN
                filename.bz2    becomes   filename
                filename.bz     becomes   filename
                filename.tbz2   becomes   filename.tar
-
-
-
-                                                                1
-
-
-
-
-
-bzip2(1)                                                 bzip2(1)
-
-
                filename.tbz    becomes   filename.tar
                anyothername    becomes   anyothername.out
  
@@ -78,23 +64,23 @@ bzip2(1)                                                 bzip2(1)
         guess the name of the original file, and uses the original
         name with _\b._\bo_\bu_\bt appended.
  
-       As  with compression, supplying no filenames causes decom-
+       As  with compression, supplying no filenames causes decom
         pression from standard input to standard output.
  
-       _\bb_\bu_\bn_\bz_\bi_\bp_\b2 will correctly decompress a file which is the con-
+       _\bb_\bu_\bn_\bz_\bi_\bp_\b2 will correctly decompress a file which is the con
         catenation of two or more compressed files.  The result is
         the concatenation of the corresponding uncompressed files.
         Integrity testing (-t) of concatenated compressed files is
         also supported.
  
         You can also compress or decompress files to the  standard
-       output  by giving the -c flag.  Multiple files may be com-
+       output  by giving the -c flag.  Multiple files may be com
         pressed and decompressed like this.  The resulting outputs
         are  fed  sequentially to stdout.  Compression of multiple
-       files in this manner generates a stream containing  multi-
+       files in this manner generates a stream containing  multi
         ple compressed file representations.  Such a stream can be
         decompressed correctly only  by  _\bb_\bz_\bi_\bp_\b2  version  0.9.0  or
-       later.   Earlier  versions of _\bb_\bz_\bi_\bp_\b2 will stop after decom-
+       later.   Earlier  versions of _\bb_\bz_\bi_\bp_\b2 will stop after decom
         pressing the first file in the stream.
  
         _\bb_\bz_\bc_\ba_\bt (or _\bb_\bz_\bi_\bp_\b2 _\b-_\bd_\bc_\b) decompresses all specified  files  to
@@ -115,7 +101,7 @@ bzip2(1)                                                 bzip2(1)
  
         As a self-check for your  protection,  _\bb_\bz_\bi_\bp_\b2  uses  32-bit
         CRCs  to make sure that the decompressed version of a file
-       is identical to the original.  This guards against corrup-
+       is identical to the original.  This guards against corrup
         tion  of  the compressed data, and against undetected bugs
         in _\bb_\bz_\bi_\bp_\b2 (hopefully very unlikely).  The chances  of  data
         corruption  going  undetected  is  microscopic,  about one
@@ -125,17 +111,6 @@ bzip2(1)                                                 bzip2(1)
         you  recover  the original uncompressed data.  You can use
         _\bb_\bz_\bi_\bp_\b2_\br_\be_\bc_\bo_\bv_\be_\br to try to recover data from damaged files.
  
-
-
-                                                                2
-
-
-
-
-
-bzip2(1)                                                 bzip2(1)
-
-
         Return values: 0 for a normal exit,  1  for  environmental
         problems  (file not found, invalid flags, I/O errors, &c),
         2 to indicate a corrupt compressed file, 3 for an internal
@@ -154,8 +129,8 @@ O\bOP\bPT\bTI\bIO\bON\bNS\bS
                and forces _\bb_\bz_\bi_\bp_\b2 to decompress.
  
         -\b-z\bz -\b--\b-c\bco\bom\bmp\bpr\bre\bes\bss\bs
-              The  complement  to -d: forces compression, regard-
-              less of the invokation name.
+              The   complement   to   -d:   forces   compression,
+              regardless of the invocation name.
  
         -\b-t\bt -\b--\b-t\bte\bes\bst\bt
                Check integrity of the specified file(s), but don't
@@ -168,6 +143,11 @@ O\bOP\bPT\bTI\bIO\bON\bNS\bS
                forces _\bb_\bz_\bi_\bp_\b2 to break hard links to files, which it
                otherwise wouldn't do.
  
+              bzip2  normally  declines to decompress files which
+              don't have the  correct  magic  header  bytes.   If
+              forced  (-f),  however,  it  will  pass  such files
+              through unmodified.  This is how GNU gzip  behaves.
+
         -\b-k\bk -\b--\b-k\bke\bee\bep\bp
                Keep  (don't delete) input files during compression
                or decompression.
@@ -190,23 +170,11 @@ O\bOP\bPT\bTI\bIO\bON\bNS\bS
         -\b-q\bq -\b--\b-q\bqu\bui\bie\bet\bt
                Suppress non-essential warning messages.   Messages
                pertaining  to I/O errors and other critical events
-
-
-
-                                                                3
-
-
-
-
-
-bzip2(1)                                                 bzip2(1)
-
-
                will not be suppressed.
  
         -\b-v\bv -\b--\b-v\bve\ber\brb\bbo\bos\bse\be
                Verbose mode -- show the compression ratio for each
-              file  processed.   Further  -v's  increase the ver-
+              file  processed.   Further  -v's  increase the ver
                bosity level, spewing out lots of information which
                is primarily of interest for diagnostic purposes.
  
@@ -214,20 +182,24 @@ bzip2(1)                                                 bzip2(1)
                Display  the  software  version,  license terms and
                conditions.
  
-       -\b-1\b1 t\bto\bo -\b-9\b9
+       -\b-1\b1 (\b(o\bor\br -\b--\b-f\bfa\bas\bst\bt)\b) t\bto\bo -\b-9\b9 (\b(o\bor\br -\b--\b-b\bbe\bes\bst\bt)\b)
                Set the block size to 100 k, 200 k ..  900  k  when
                compressing.   Has  no  effect  when decompressing.
-              See MEMORY MANAGEMENT below.
+              See MEMORY MANAGEMENT below.  The --fast and --best
+              aliases  are  primarily for GNU gzip compatibility.
+              In particular, --fast doesn't make things  signifi
+              cantly  faster.   And  --best  merely  selects  the
+              default behaviour.
  
         -\b--\b-     Treats all subsequent arguments as file names, even
-              if they start with a dash.  This is so you can han-
+              if they start with a dash.  This is so you can han
                dle files with names beginning  with  a  dash,  for
                example: bzip2 -- -myfilename.
  
         -\b--\b-r\bre\bep\bpe\bet\bti\bit\bti\biv\bve\be-\b-f\bfa\bas\bst\bt -\b--\b-r\bre\bep\bpe\bet\bti\bit\bti\biv\bve\be-\b-b\bbe\bes\bst\bt
                These  flags  are  redundant  in versions 0.9.5 and
                above.  They provided some coarse control over  the
-              behaviour  of the sorting algorithm in earlier ver-
+              behaviour  of the sorting algorithm in earlier ver
                sions, which was sometimes useful.  0.9.5 and above
                have  an  improved  algorithm  which  renders these
                flags irrelevant.
@@ -238,7 +210,7 @@ M\bME\bEM\bMO\bOR\bRY\bY M\bMA\bAN\bNA\bAG\bGE\bEM\bME\bEN\bNT\bT
         affects  both  the  compression  ratio  achieved,  and the
         amount of memory needed for compression and decompression.
         The  flags  -1  through  -9  specify  the block size to be
-       100,000 bytes through 900,000 bytes (the default)  respec-
+       100,000 bytes through 900,000 bytes (the default)  respec
         tively.   At  decompression  time, the block size used for
         compression is read from  the  header  of  the  compressed
         file, and _\bb_\bu_\bn_\bz_\bi_\bp_\b2 then allocates itself just enough memory
@@ -256,18 +228,6 @@ M\bME\bEM\bMO\bOR\bRY\bY M\bMA\bAN\bNA\bAG\bGE\bEM\bME\bEN\bNT\bT
  
         Larger  block  sizes  give  rapidly  diminishing  marginal
         returns.  Most of the compression comes from the first two
-
-
-
-                                                                4
-
-
-
-
-
-bzip2(1)                                                 bzip2(1)
-
-
         or  three hundred k of block size, a fact worth bearing in
         mind when using _\bb_\bz_\bi_\bp_\b2  on  small  machines.   It  is  also
         important  to  appreciate  that  the  decompression memory
@@ -278,13 +238,13 @@ bzip2(1)                                                 bzip2(1)
         _\bb_\bu_\bn_\bz_\bi_\bp_\b2 will require about 3700 kbytes to decompress.   To
         support decompression of any file on a 4 megabyte machine,
         _\bb_\bu_\bn_\bz_\bi_\bp_\b2 has an option to  decompress  using  approximately
-       half this amount of memory, about 2300 kbytes.  Decompres-
+       half this amount of memory, about 2300 kbytes.  Decompres
         sion speed is also halved, so you should use  this  option
         only where necessary.  The relevant flag is -s.
  
-       In general, try and use the largest block size memory con-
+       In general, try and use the largest block size memory con
         straints  allow,  since  that  maximises  the  compression
-       achieved.   Compression and decompression speed are virtu-
+       achieved.   Compression and decompression speed are virtu
         ally unaffected by block size.
  
         Another significant point applies to files which fit in  a
@@ -300,11 +260,11 @@ bzip2(1)                                                 bzip2(1)
  
         Here  is a table which summarises the maximum memory usage
         for different block sizes.  Also  recorded  is  the  total
-       compressed  size for 14 files of the Calgary Text Compres-
+       compressed  size for 14 files of the Calgary Text Compres
         sion Corpus totalling 3,141,622 bytes.  This column  gives
         some  feel  for  how  compression  varies with block size.
         These figures tend to understate the advantage  of  larger
-       block  sizes  for  larger files, since the Corpus is domi-
+       block  sizes  for  larger files, since the Corpus is domi
         nated by smaller files.
  
                    Compress   Decompress   Decompress   Corpus
@@ -321,22 +281,9 @@ bzip2(1)                                                 bzip2(1)
              -9      7600k      3700k        2350k      828642
  
  
-
-
-
-
-                                                                5
-
-
-
-
-
-bzip2(1)                                                 bzip2(1)
-
-
  R\bRE\bEC\bCO\bOV\bVE\bER\bRI\bIN\bNG\bG D\bDA\bAT\bTA\bA F\bFR\bRO\bOM\bM D\bDA\bAM\bMA\bAG\bGE\bED\bD F\bFI\bIL\bLE\bES\bS
         _\bb_\bz_\bi_\bp_\b2 compresses files in blocks, usually 900kbytes  long.
-       Each block is handled independently.  If a media or trans-
+       Each block is handled independently.  If a media or trans
         mission error causes a multi-block  .bz2  file  to  become
         damaged,  it  may  be  possible  to  recover data from the
         undamaged blocks in the file.
@@ -353,19 +300,19 @@ R\bRE\bEC\bCO\bOV\bVE\bER\bRI\bIN\bNG\bG D\bDA\bAT\bTA\bA F\bFR\bRO\bOM\bM D\bDA\bAM\bMA\bAG\bGE\bED\bD F
         the integrity of the resulting files, and decompress those
         which are undamaged.
  
-       _\bb_\bz_\bi_\bp_\b2_\br_\be_\bc_\bo_\bv_\be_\br takes a single argument, the name of the dam-
-       aged file, and writes a number of files "rec0001file.bz2",
-       "rec0002file.bz2", etc, containing the  extracted  blocks.
-       The  output  filenames  are  designed  so  that the use of
-       wildcards in subsequent processing -- for example,  "bzip2
-       -dc   rec*file.bz2 > recovered_data" -- lists the files in
-       the correct order.
+       _\bb_\bz_\bi_\bp_\b2_\br_\be_\bc_\bo_\bv_\be_\br takes a single argument, the name of the dam
+       aged    file,    and    writes    a    number   of   files
+       "rec00001file.bz2",  "rec00002file.bz2",  etc,  containing
+       the   extracted   blocks.   The   output   filenames   are
+       designed  so  that the use of wildcards in subsequent pro
+       cessing  -- for example, "bzip2 -dc  rec*file.bz2 > recov
+       ered_data" -- processes the files in the correct order.
  
         _\bb_\bz_\bi_\bp_\b2_\br_\be_\bc_\bo_\bv_\be_\br should be of most use dealing with large .bz2
         files,  as  these will contain many blocks.  It is clearly
         futile to use it on damaged single-block  files,  since  a
-       damaged  block  cannot  be recovered.  If you wish to min-
-       imise any potential data loss through media  or  transmis-
+       damaged  block  cannot  be recovered.  If you wish to min
+       imise any potential data loss through media  or  transmis
         sion errors, you might consider compressing with a smaller
         block size.
  
@@ -379,31 +326,19 @@ P\bPE\bER\bRF\bFO\bOR\bRM\bMA\bAN\bNC\bCE\bE N\bNO\bOT\bTE\bES\bS
         better  than previous versions in this respect.  The ratio
         between worst-case and average-case compression time is in
         the  region  of  10:1.  For previous versions, this figure
-       was more like 100:1.  You can use the -vvvv option to mon-
+       was more like 100:1.  You can use the -vvvv option to mon
         itor progress in great detail, if you want.
  
         Decompression speed is unaffected by these phenomena.
  
         _\bb_\bz_\bi_\bp_\b2  usually  allocates  several  megabytes of memory to
-       operate in, and then charges all over it in a fairly  ran-
-       dom  fashion.   This means that performance, both for com-
+       operate in, and then charges all over it in a fairly  ran
+       dom  fashion.   This means that performance, both for com
         pressing and decompressing, is largely determined  by  the
-
-
-
-                                                                6
-
-
-
-
-
-bzip2(1)                                                 bzip2(1)
-
-
         speed  at  which  your  machine  can service cache misses.
         Because of this, small changes to the code to  reduce  the
         miss  rate  have  been observed to give disproportionately
-       large performance improvements.  I imagine _\bb_\bz_\bi_\bp_\b2 will per-
+       large performance improvements.  I imagine _\bb_\bz_\bi_\bp_\b2 will per
         form best on machines with very large caches.
  
  
@@ -413,50 +348,51 @@ C\bCA\bAV\bVE\bEA\bAT\bTS\bS
         but  the  details  of  what  the problem is sometimes seem
         rather misleading.
  
-       This manual page pertains to version 1.0 of  _\bb_\bz_\bi_\bp_\b2_\b.   Com-
+       This manual page pertains to version 1.0.2 of _\bb_\bz_\bi_\bp_\b2_\b.  Com
         pressed  data created by this version is entirely forwards
         and  backwards  compatible  with   the   previous   public
-       releases,  versions  0.1pl2, 0.9.0 and 0.9.5, but with the
-       following exception: 0.9.0 and above can correctly  decom-
-       press multiple concatenated compressed files.  0.1pl2 can-
-       not do this; it will stop  after  decompressing  just  the
-       first file in the stream.
+       releases,  versions 0.1pl2, 0.9.0, 0.9.5, 1.0.0 and 1.0.1,
+       but with the following exception: 0.9.0 and above can cor
+       rectly  decompress multiple concatenated compressed files.
+       0.1pl2 cannot do this; it will  stop  after  decompressing
+       just the first file in the stream.
+
+       _\bb_\bz_\bi_\bp_\b2_\br_\be_\bc_\bo_\bv_\be_\br  versions  prior  to  this  one,  1.0.2, used
+       32-bit integers to represent bit positions  in  compressed
+       files,  so  it could not handle compressed files more than
+       512 megabytes long.  Version 1.0.2 and above  uses  64-bit
+       ints  on  some platforms which support them (GNU supported
+       targets,  and  Windows).   To  establish  whether  or  not
+       bzip2recover  was  built  with  such  a limitation, run it
+       without arguments.  In any event you can build yourself an
+       unlimited version if you can recompile it with MaybeUInt64
+       set to be an unsigned 64-bit integer.
+
  
-       _\bb_\bz_\bi_\bp_\b2_\br_\be_\bc_\bo_\bv_\be_\br  uses  32-bit integers to represent bit posi-
-       tions in compressed files, so it cannot handle  compressed
-       files  more than 512 megabytes long.  This could easily be
-       fixed.
  
  
  A\bAU\bUT\bTH\bHO\bOR\bR
         Julian Seward, jseward@acm.org.
  
-       http://sourceware.cygnus.com/bzip2
-       http://www.muraroa.demon.co.uk
+       http://sources.redhat.com/bzip2
  
-       The ideas embodied in _\bb_\bz_\bi_\bp_\b2 are due to (at least) the fol-
-       lowing people: Michael Burrows and David Wheeler (for  the
-       block  sorting  transformation), David Wheeler (again, for
-       the Huffman coder), Peter Fenwick (for the structured cod-
+       The ideas embodied in _\bb_\bz_\bi_\bp_\b2 are due to (at least) the fol
+       lowing  people: Michael Burrows and David Wheeler (for the
+       block sorting transformation), David Wheeler  (again,  for
+       the Huffman coder), Peter Fenwick (for the structured cod
         ing model in the original _\bb_\bz_\bi_\bp_\b, and many refinements), and
-       Alistair Moffat, Radford Neal  and  Ian  Witten  (for  the
+       Alistair  Moffat,  Radford  Neal  and  Ian Witten (for the
         arithmetic  coder  in  the  original  _\bb_\bz_\bi_\bp_\b)_\b.   I  am  much
-       indebted for their help, support and advice.  See the man-
-       ual  in the source distribution for pointers to sources of
+       indebted for their help, support and advice.  See the man
+       ual in the source distribution for pointers to sources  of
         documentation.  Christian von Roques encouraged me to look
-       for  faster sorting algorithms, so as to speed up compres-
+       for faster sorting algorithms, so as to speed up  compres
         sion.  Bela Lubkin encouraged me to improve the worst-case
-       compression performance.  Many people sent patches, helped
-       with portability problems, lent machines, gave advice  and
-       were generally helpful.
-
-
-
-
-
-
-
+       compression performance.  The bz* scripts are derived from
+       those  of GNU gzip.  Many people sent patches, helped with
+       portability problems, lent machines, gave advice and  were
+       generally helpful.
  
-                                                                7
  
  
+                                                         bzip2(1)
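Note on the bzip2.c changes that follow: the rewritten uInt64_qrm10 drops the multiply-by-0.8 trick in favour of schoolbook long division, working one byte at a time from the most significant byte down. The standalone sketch below shows the same technique outside the patch. The names u64_bytes, u64_bytes_from, u64_bytes_to and u64_bytes_div10 are hypothetical and chosen for this illustration; only the loop body mirrors the patch.

```c
#include <assert.h>
#include <stdint.h>

/* An 8-byte little-endian number: b[0] is the least significant byte.
   This matches the layout of the UInt64 struct in bzip2.c. */
typedef struct { uint8_t b[8]; } u64_bytes;

/* Test helper: split a native 64-bit value into bytes. */
static u64_bytes u64_bytes_from(uint64_t v)
{
    u64_bytes n;
    for (int i = 0; i < 8; i++) {
        n.b[i] = (uint8_t)(v & 0xFF);
        v >>= 8;
    }
    return n;
}

/* Test helper: reassemble the bytes into a native 64-bit value. */
static uint64_t u64_bytes_to(const u64_bytes *n)
{
    uint64_t v = 0;
    for (int i = 7; i >= 0; i--)
        v = (v << 8) | n->b[i];
    return v;
}

/* Divide *n by 10 in place and return the remainder (0..9).
   Each step carries the running remainder into the next lower byte,
   exactly as in base-256 long division. */
static unsigned u64_bytes_div10(u64_bytes *n)
{
    unsigned rem = 0;
    for (int i = 7; i >= 0; i--) {
        unsigned tmp = rem * 256u + n->b[i]; /* at most 9*256 + 255 = 2559 */
        n->b[i] = (uint8_t)(tmp / 10u);      /* quotient byte, never above 255 */
        rem = tmp % 10u;
    }
    return rem;
}
```

Repeated calls peel off decimal digits from least to most significant, which is how uInt64_toAscii uses the real function.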
diff --git a/bzip2.c b/bzip2.c
index 56adfdcbc46b4c2166867f1ee0077edd35d47253..807f420aed6b6bb1552f00d2eff0d4b08ce70751 100644
--- a/bzip2.c
+++ b/bzip2.c
@@ -7,7 +7,7 @@
    This file is a part of bzip2 and/or libbzip2, a program and
    library for lossless, block-sorting data compression.
  
-  Copyright (C) 1996-2000 Julian R Seward.  All rights reserved.
+  Copyright (C) 1996-2002 Julian R Seward.  All rights reserved.
  
    Redistribution and use in source and binary forms, with or without
    modification, are permitted provided that the following conditions
@@ -113,13 +113,16 @@
  /*--
    Generic 32-bit Unix.
    Also works on 64-bit Unix boxes.
+  This is the default.
  --*/
  #define BZ_UNIX      1
  
  /*--
    Win32, as seen by Jacob Navia's excellent
    port of (Chris Fraser & David Hanson)'s excellent
-  lcc compiler.
+  lcc compiler.  Or with MS Visual C.
+  This is selected automatically if compiled by a compiler which
+  defines _WIN32, not including the Cygwin GCC.
  --*/
  #define BZ_LCCWIN32  0
  
@@ -156,6 +159,7 @@
  --*/
  
  #if BZ_UNIX
+#   include <fcntl.h>
  #   include <sys/types.h>
  #   include <utime.h>
  #   include <unistd.h>
@@ -164,8 +168,9 @@
  
  #   define PATH_SEP    '/'
  #   define MY_LSTAT    lstat
-#   define MY_S_IFREG  S_ISREG
  #   define MY_STAT     stat
+#   define MY_S_ISREG  S_ISREG
+#   define MY_S_ISDIR  S_ISDIR
  
  #   define APPEND_FILESPEC(root, name) \
        root=snocString((root), (name))
@@ -180,19 +185,23 @@
  #   else
  #      define NORETURN /**/
  #   endif
+
  #   ifdef __DJGPP__
  #     include <io.h>
  #     include <fcntl.h>
  #     undef MY_LSTAT
+#     undef MY_STAT
  #     define MY_LSTAT stat
+#     define MY_STAT stat
  #     undef SET_BINARY_MODE
  #     define SET_BINARY_MODE(fd)                        \
          do {                                            \
             int retVal = setmode ( fileno ( fd ),        \
-                                 O_BINARY );            \
+                                  O_BINARY );           \
             ERROR_IF_MINUS_ONE ( retVal );               \
          } while ( 0 )
  #   endif
+
  #   ifdef __CYGWIN__
  #     include <io.h>
  #     include <fcntl.h>
@@ -200,11 +209,11 @@
  #     define SET_BINARY_MODE(fd)                        \
          do {                                            \
             int retVal = setmode ( fileno ( fd ),        \
-                                 O_BINARY );            \
+                                  O_BINARY );           \
             ERROR_IF_MINUS_ONE ( retVal );               \
          } while ( 0 )
  #   endif
-#endif
+#endif /* BZ_UNIX */
  
  
  
@@ -217,46 +226,23 @@
  #   define PATH_SEP       '\\'
  #   define MY_LSTAT       _stat
  #   define MY_STAT        _stat
-#   define MY_S_IFREG(x)  ((x) & _S_IFREG)
+#   define MY_S_ISREG(x)  ((x) & _S_IFREG)
+#   define MY_S_ISDIR(x)  ((x) & _S_IFDIR)
  
  #   define APPEND_FLAG(root, name) \
        root=snocString((root), (name))
  
-#   if 0
-   /*-- lcc-win32 seems to expand wildcards itself --*/
-#   define APPEND_FILESPEC(root, spec)                \
-      do {                                            \
-         if ((spec)[0] == '-') {                      \
-            root = snocString((root), (spec));        \
-         } else {                                     \
-            struct _finddata_t c_file;                \
-            long hFile;                               \
-            hFile = _findfirst((spec), &c_file);      \
-            if ( hFile == -1L ) {                     \
-               root = snocString ((root), (spec));    \
-            } else {                                  \
-               int anInt = 0;                         \
-               while ( anInt == 0 ) {                 \
-                  root = snocString((root),           \
-                            &c_file.name[0]);         \
-                  anInt = _findnext(hFile, &c_file);  \
-               }                                      \
-            }                                         \
-         }                                            \
-      } while ( 0 )
-#   else
  #   define APPEND_FILESPEC(root, name)                \
        root = snocString ((root), (name))
-#   endif
  
  #   define SET_BINARY_MODE(fd)                        \
        do {                                            \
           int retVal = setmode ( fileno ( fd ),        \
-                               O_BINARY );            \
+                                O_BINARY );           \
           ERROR_IF_MINUS_ONE ( retVal );               \
        } while ( 0 )
  
-#endif
+#endif /* BZ_LCCWIN32 */
  
  
  /*---------------------------------------------*/
@@ -338,6 +324,7 @@ typedef
     struct { UChar b[8]; } 
     UInt64;
  
+
  static
  void uInt64_from_UInt32s ( UInt64* n, UInt32 lo32, UInt32 hi32 )
  {
@@ -351,6 +338,7 @@ void uInt64_from_UInt32s ( UInt64* n, UInt32 lo32, UInt32 hi32 )
     n->b[0] = (UChar) (lo32        & 0xFF);
  }
  
+
  static
  double uInt64_to_double ( UInt64* n )
  {
@@ -364,77 +352,6 @@ double uInt64_to_double ( UInt64* n )
     return sum;
  }
  
-static
-void uInt64_add ( UInt64* src, UInt64* dst )
-{
-   Int32 i;
-   Int32 carry = 0;
-   for (i = 0; i < 8; i++) {
-      carry += ( ((Int32)src->b[i]) + ((Int32)dst->b[i]) );
-      dst->b[i] = (UChar)(carry & 0xFF);
-      carry >>= 8;
-   }
-}
-
-static
-void uInt64_sub ( UInt64* src, UInt64* dst )
-{
-   Int32 t, i;
-   Int32 borrow = 0;
-   for (i = 0; i < 8; i++) {
-      t = ((Int32)dst->b[i]) - ((Int32)src->b[i]) - borrow;
-      if (t < 0) {
-         dst->b[i] = (UChar)(t + 256);
-         borrow = 1;
-      } else {
-         dst->b[i] = (UChar)t;
-         borrow = 0;
-      }
-   }
-}
-
-static
-void uInt64_mul ( UInt64* a, UInt64* b, UInt64* r_hi, UInt64* r_lo )
-{
-   UChar sum[16];
-   Int32 ia, ib, carry;
-   for (ia = 0; ia < 16; ia++) sum[ia] = 0;
-   for (ia = 0; ia < 8; ia++) {
-      carry = 0;
-      for (ib = 0; ib < 8; ib++) {
-         carry += ( ((Int32)sum[ia+ib]) 
-                    + ((Int32)a->b[ia]) * ((Int32)b->b[ib]) );
-         sum[ia+ib] = (UChar)(carry & 0xFF);
-         carry >>= 8;
-      }
-      sum[ia+8] = (UChar)(carry & 0xFF);
-      if ((carry >>= 8) != 0) panic ( "uInt64_mul" );
-   }
-
-   for (ia = 0; ia < 8; ia++) r_hi->b[ia] = sum[ia+8];
-   for (ia = 0; ia < 8; ia++) r_lo->b[ia] = sum[ia];
-}
-
-
-static
-void uInt64_shr1 ( UInt64* n )
-{
-   Int32 i;
-   for (i = 0; i < 8; i++) {
-      n->b[i] >>= 1;
-      if (i < 7 && (n->b[i+1] & 1)) n->b[i] |= 0x80;
-   }
-}
-
-static
-void uInt64_shl1 ( UInt64* n )
-{
-   Int32 i;
-   for (i = 7; i >= 0; i--) {
-      n->b[i] <<= 1;
-      if (i > 0 && (n->b[i-1] & 0x80)) n->b[i]++;
-   }
-}
  
  static
  Bool uInt64_isZero ( UInt64* n )
@@ -445,49 +362,23 @@ Bool uInt64_isZero ( UInt64* n )
     return 1;
  }
  
-static
+
+/* Divide *n by 10, and return the remainder.  */
+static 
  Int32 uInt64_qrm10 ( UInt64* n )
  {
-   /* Divide *n by 10, and return the remainder.  Long division
-      is difficult, so we cheat and instead multiply by
-      0xCCCC CCCC CCCC CCCD, which is 0.8 (viz, 0.1 << 3).
-   */
+   UInt32 rem, tmp;
     Int32  i;
-   UInt64 tmp1, tmp2, n_orig, zero_point_eight;
-
-   zero_point_eight.b[1] = zero_point_eight.b[2] = 
-   zero_point_eight.b[3] = zero_point_eight.b[4] = 
-   zero_point_eight.b[5] = zero_point_eight.b[6] = 
-   zero_point_eight.b[7] = 0xCC;
-   zero_point_eight.b[0] = 0xCD;
-
-   n_orig = *n;
-
-   /* divide n by 10, 
-      by multiplying by 0.8 and then shifting right 3 times */
-   uInt64_mul ( n, &zero_point_eight, &tmp1, &tmp2 );
-   uInt64_shr1(&tmp1); uInt64_shr1(&tmp1); uInt64_shr1(&tmp1); 
-   *n = tmp1;
-   
-   /* tmp1 = 8*n, tmp2 = 2*n */
-   uInt64_shl1(&tmp1); uInt64_shl1(&tmp1); uInt64_shl1(&tmp1);
-   tmp2 = *n; uInt64_shl1(&tmp2);
-
-   /* tmp1 = 10*n */
-   uInt64_add ( &tmp2, &tmp1 );
-
-   /* n_orig = n_orig - 10*n */
-   uInt64_sub ( &tmp1, &n_orig );
-
-   /* n_orig should now hold quotient, in range 0 .. 9 */
-   for (i = 7; i >= 1; i--) 
-      if (n_orig.b[i] != 0) panic ( "uInt64_qrm10(1)" );
-   if (n_orig.b[0] > 9)
-      panic ( "uInt64_qrm10(2)" );
-
-   return (int)n_orig.b[0];
+   rem = 0;
+   for (i = 7; i >= 0; i--) {
+      tmp = rem * 256 + n->b[i];
+      n->b[i] = tmp / 10;
+      rem = tmp % 10;
+   }
+   return rem;
  }
  
+
  /* ... and the Whole Entire Point of all this UInt64 stuff is
     so that we can supply the following function.
  */
@@ -504,7 +395,8 @@ void uInt64_toAscii ( char* outbuf, UInt64* n )
        nBuf++;
     } while (!uInt64_isZero(&n_copy));
     outbuf[nBuf] = 0;
-   for (i = 0; i < nBuf; i++) outbuf[i] = buf[nBuf-i-1];
+   for (i = 0; i < nBuf; i++) 
+      outbuf[i] = buf[nBuf-i-1];
  }
  
  
@@ -566,35 +458,38 @@ void compressStream ( FILE *stream, FILE *zStream )
     if (ret == EOF) goto errhandler_io;
     if (zStream != stdout) {
        ret = fclose ( zStream );
+      outputHandleJustInCase = NULL;
        if (ret == EOF) goto errhandler_io;
     }
+   outputHandleJustInCase = NULL;
     if (ferror(stream)) goto errhandler_io;
     ret = fclose ( stream );
     if (ret == EOF) goto errhandler_io;
  
-   if (nbytes_in_lo32 == 0 && nbytes_in_hi32 == 0) 
-      nbytes_in_lo32 = 1;
-
     if (verbosity >= 1) {
-      Char   buf_nin[32], buf_nout[32];
-      UInt64 nbytes_in,   nbytes_out;
-      double nbytes_in_d, nbytes_out_d;
-      uInt64_from_UInt32s ( &nbytes_in, 
-                            nbytes_in_lo32, nbytes_in_hi32 );
-      uInt64_from_UInt32s ( &nbytes_out, 
-                            nbytes_out_lo32, nbytes_out_hi32 );
-      nbytes_in_d  = uInt64_to_double ( &nbytes_in );
-      nbytes_out_d = uInt64_to_double ( &nbytes_out );
-      uInt64_toAscii ( buf_nin, &nbytes_in );
-      uInt64_toAscii ( buf_nout, &nbytes_out );
-      fprintf ( stderr, "%6.3f:1, %6.3f bits/byte, "
-                        "%5.2f%% saved, %s in, %s out.\n",
-                nbytes_in_d / nbytes_out_d,
-                (8.0 * nbytes_out_d) / nbytes_in_d,
-                100.0 * (1.0 - nbytes_out_d / nbytes_in_d),
-                buf_nin,
-                buf_nout
-              );
+      if (nbytes_in_lo32 == 0 && nbytes_in_hi32 == 0) {
+        fprintf ( stderr, " no data compressed.\n");
+      } else {
+        Char   buf_nin[32], buf_nout[32];
+        UInt64 nbytes_in,   nbytes_out;
+        double nbytes_in_d, nbytes_out_d;
+        uInt64_from_UInt32s ( &nbytes_in, 
+                              nbytes_in_lo32, nbytes_in_hi32 );
+        uInt64_from_UInt32s ( &nbytes_out, 
+                              nbytes_out_lo32, nbytes_out_hi32 );
+        nbytes_in_d  = uInt64_to_double ( &nbytes_in );
+        nbytes_out_d = uInt64_to_double ( &nbytes_out );
+        uInt64_toAscii ( buf_nin, &nbytes_in );
+        uInt64_toAscii ( buf_nout, &nbytes_out );
+        fprintf ( stderr, "%6.3f:1, %6.3f bits/byte, "
+                  "%5.2f%% saved, %s in, %s out.\n",
+                  nbytes_in_d / nbytes_out_d,
+                  (8.0 * nbytes_out_d) / nbytes_in_d,
+                  100.0 * (1.0 - nbytes_out_d / nbytes_in_d),
+                  buf_nin,
+                  buf_nout
+                );
+      }
     }
  
     return;
@@ -652,7 +547,7 @@ Bool uncompressStream ( FILE *zStream, FILE *stream )
  
        while (bzerr == BZ_OK) {
           nread = BZ2_bzRead ( &bzerr, bzf, obuf, 5000 );
-         if (bzerr == BZ_DATA_ERROR_MAGIC) goto errhandler;
+         if (bzerr == BZ_DATA_ERROR_MAGIC) goto trycat;
           if ((bzerr == BZ_OK || bzerr == BZ_STREAM_END) && nread > 0)
              fwrite ( obuf, sizeof(UChar), nread, stream );
           if (ferror(stream)) goto errhandler_io;
@@ -668,9 +563,9 @@ Bool uncompressStream ( FILE *zStream, FILE *stream )
        if (bzerr != BZ_OK) panic ( "decompress:bzReadGetUnused" );
  
        if (nUnused == 0 && myfeof(zStream)) break;
-
     }
  
+   closeok:
     if (ferror(zStream)) goto errhandler_io;
     ret = fclose ( zStream );
     if (ret == EOF) goto errhandler_io;
@@ -680,11 +575,26 @@ Bool uncompressStream ( FILE *zStream, FILE *stream )
     if (ret != 0) goto errhandler_io;
     if (stream != stdout) {
        ret = fclose ( stream );
+      outputHandleJustInCase = NULL;
        if (ret == EOF) goto errhandler_io;
     }
+   outputHandleJustInCase = NULL;
     if (verbosity >= 2) fprintf ( stderr, "\n    " );
     return True;
  
+   trycat: 
+   if (forceOverwrite) {
+      rewind(zStream);
+      while (True) {
+        if (myfeof(zStream)) break;
+        nread = fread ( obuf, sizeof(UChar), 5000, zStream );
+        if (ferror(zStream)) goto errhandler_io;
+        if (nread > 0) fwrite ( obuf, sizeof(UChar), nread, stream );
+        if (ferror(stream)) goto errhandler_io;
+      }
+      goto closeok;
+   }
+  
     errhandler:
     BZ2_bzReadClose ( &bzerr_dummy, bzf );
     switch (bzerr) {
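
Note on the hunk above: with forceOverwrite set, a bad magic number now jumps to trycat, which rewinds the input and copies it to the output unchanged. A minimal sketch of that copy loop follows. The function name copy_through is hypothetical, and the sketch assumes the input stream is seekable so that rewind() has an effect.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: copy all of `in` to `out` without modification.
   Returns 0 on success, -1 on an I/O error on either stream. */
static int copy_through(FILE *in, FILE *out)
{
   unsigned char buf[5000];
   size_t n;
   assert(in != NULL && out != NULL);
   rewind(in);                       /* start again from the first byte */
   while ((n = fread(buf, 1, sizeof buf, in)) > 0) {
      if (fwrite(buf, 1, n, out) != n) return -1;
   }
   if (ferror(in)) return -1;        /* fread stopped on error, not EOF */
   return fflush(out) == 0 ? 0 : -1;
}
```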
@@ -832,7 +742,7 @@ void cadvise ( void )
        stderr,
        "\nIt is possible that the compressed file(s) have become corrupted.\n"
          "You can use the -tvv option to test integrity of such files.\n\n"
-        "You can use the `bzip2recover' program to *attempt* to recover\n"
+        "You can use the `bzip2recover' program to attempt to recover\n"
          "data from undamaged sections of corrupted files.\n\n"
      );
  }
@@ -855,28 +765,55 @@ void showFileNames ( void )
  static 
  void cleanUpAndFail ( Int32 ec )
  {
-   IntNative retVal;
+   IntNative      retVal;
+   struct MY_STAT statBuf;
  
     if ( srcMode == SM_F2F 
          && opMode != OM_TEST
          && deleteOutputOnInterrupt ) {
-      if (noisy)
-      fprintf ( stderr, "%s: Deleting output file %s, if it exists.\n",
-                progName, outName );
-      if (outputHandleJustInCase != NULL)
-         fclose ( outputHandleJustInCase );
-      retVal = remove ( outName );
-      if (retVal != 0)
+
+      /* Check whether input file still exists.  Delete output file
+         only if input exists to avoid loss of data.  Joerg Prante, 5
+         January 2002.  (JRS 06-Jan-2002: other changes in 1.0.2 mean
+         this is less likely to happen.  But to be ultra-paranoid, we
+         do the check anyway.)  */
+      retVal = MY_STAT ( inName, &statBuf );
+      if (retVal == 0) {
+         if (noisy)
+            fprintf ( stderr, 
+                      "%s: Deleting output file %s, if it exists.\n",
+                      progName, outName );
+         if (outputHandleJustInCase != NULL)
+            fclose ( outputHandleJustInCase );
+         retVal = remove ( outName );
+         if (retVal != 0)
+            fprintf ( stderr,
+                      "%s: WARNING: deletion of output file "
+                      "(apparently) failed.\n",
+                      progName );
+      } else {
           fprintf ( stderr,
-                   "%s: WARNING: deletion of output file (apparently) failed.\n",
+                   "%s: WARNING: deletion of output file suppressed\n",
+                    progName );
+         fprintf ( stderr,
+                   "%s:    since input file no longer exists.  Output file\n",
                     progName );
+         fprintf ( stderr,
+                   "%s:    `%s' may be incomplete.\n",
+                   progName, outName );
+         fprintf ( stderr, 
+                   "%s:    I suggest doing an integrity test (bzip2 -tv)"
+                   " of it.\n",
+                   progName );
+      }
     }
+
     if (noisy && numFileNames > 0 && numFilesProcessed < numFileNames) {
        fprintf ( stderr, 
                  "%s: WARNING: some files have not been processed:\n"
-                "\t%d specified on command line, %d not processed yet.\n\n",
-                progName, numFileNames, 
-                          numFileNames - numFilesProcessed );
+                "%s:    %d specified on command line, %d not processed yet.\n\n",
+                progName, progName,
+                numFileNames, numFileNames - numFilesProcessed );
     }
     setExit(ec);
     exit(exitValue);
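
Note on the hunk above: cleanUpAndFail now stats the input file and deletes the output only if the input still exists. A minimal sketch of that guard follows, assuming a POSIX stat(). The function name and its return codes are hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/stat.h>

/* Hypothetical helper.  Returns 1 if the output was removed,
   0 if removal was suppressed because the input is gone,
   -1 if remove() itself failed. */
static int remove_output_if_input_exists(const char *in_name,
                                         const char *out_name)
{
   struct stat sb;
   assert(in_name != NULL && out_name != NULL);
   if (stat(in_name, &sb) != 0) {
      /* Input cannot be found: keep the output, it may be the only copy. */
      fprintf(stderr, "deletion of `%s' suppressed: input no longer exists\n",
              out_name);
      return 0;
   }
   return remove(out_name) == 0 ? 1 : -1;
}
```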
@@ -915,14 +852,16 @@ void crcError ( void )
  static 
  void compressedStreamEOF ( void )
  {
-   fprintf ( stderr,
-             "\n%s: Compressed file ends unexpectedly;\n\t"
-             "perhaps it is corrupted?  *Possible* reason follows.\n",
-             progName );
-   perror ( progName );
-   showFileNames();
-   cadvise();
-   cleanUpAndFail( 2 );
+  if (noisy) {
+    fprintf ( stderr,
+             "\n%s: Compressed file ends unexpectedly;\n\t"
+             "perhaps it is corrupted?  *Possible* reason follows.\n",
+             progName );
+    perror ( progName );
+    showFileNames();
+    cadvise();
+  }
+  cleanUpAndFail( 2 );
  }
  
  
@@ -1038,6 +977,11 @@ void configError ( void )
  /*--- The main driver machinery                   ---*/
  /*---------------------------------------------------*/
  
+/* All rather crufty.  The main problem is that input files
+   are stat()d multiple times before use.  This should be
+   cleaned up. 
+*/
+
  /*---------------------------------------------*/
  static 
  void pad ( Char *s )
@@ -1081,6 +1025,32 @@ Bool fileExists ( Char* name )
  }
  
  
+/*---------------------------------------------*/
+/* Open an output file safely with O_EXCL and good permissions.
+   This avoids a race condition in versions < 1.0.2, in which
+   the file was first opened and then had its interim permissions
+   set safely.  We instead use open() to create the file with
+   the interim permissions required. (--- --- rw-).
+
+   For non-Unix platforms, if we are not worrying about
+   security issues, this simply behaves like fopen.
+*/
+FILE* fopen_output_safely ( Char* name, const char* mode )
+{
+#  if BZ_UNIX
+   FILE*     fp;
+   IntNative fh;
+   fh = open(name, O_WRONLY|O_CREAT|O_EXCL, S_IWUSR|S_IRUSR);
+   if (fh == -1) return NULL;
+   fp = fdopen(fh, mode);
+   if (fp == NULL) close(fh);
+   return fp;
+#  else
+   return fopen(name, mode);
+#  endif
+}
+
+
  /*---------------------------------------------*/
  /*--
    if in doubt, return True
@@ -1093,7 +1063,7 @@ Bool notAStandardFile ( Char* name )
  
     i = MY_LSTAT ( name, &statBuf );
     if (i != 0) return True;
-   if (MY_S_IFREG(statBuf.st_mode)) return False;
+   if (MY_S_ISREG(statBuf.st_mode)) return False;
     return True;
  }
  
@@ -1115,42 +1085,66 @@ Int32 countHardLinks ( Char* name )
  
  
  /*---------------------------------------------*/
+/* Copy modification date, access date, permissions and owner from the
+   source to destination file.  We have to copy this meta-info off
+   into fileMetaInfo before starting to compress / decompress it,
+   because doing it afterwards means we get the wrong access time.
+
+   To complicate matters, in compress() and decompress() below, the
+   sequence of tests preceding the call to saveInputFileMetaInfo()
+   involves calling fileExists(), which in turn establishes its result
+   by attempting to fopen() the file, and if successful, immediately
+   fclose()ing it again.  So we have to assume that the fopen() call
+   does not cause the access time field to be updated.
+
+   Reading of the man page for stat() (man 2 stat) on RedHat 7.2 seems
+   to imply that merely doing open() will not affect the access time.
+   Therefore we merely need to hope that the C library only does
+   open() as a result of fopen(), and not any kind of read()-ahead
+   cleverness.
+
+   It sounds pretty fragile to me.  Whether this carries across
+   robustly to arbitrary Unix-like platforms (or even works robustly
+   on this one, RedHat 7.2) is unknown to me.  Nevertheless ...  
+*/
+#if BZ_UNIX
+static 
+struct MY_STAT fileMetaInfo;
+#endif
+
  static 
-void copyDatePermissionsAndOwner ( Char *srcName, Char *dstName )
+void saveInputFileMetaInfo ( Char *srcName )
  {
-#if BZ_UNIX
+#  if BZ_UNIX
+   IntNative retVal;
+   /* Note use of stat here, not lstat. */
+   retVal = MY_STAT( srcName, &fileMetaInfo );
+   ERROR_IF_NOT_ZERO ( retVal );
+#  endif
+}
+
+
+static 
+void applySavedMetaInfoToOutputFile ( Char *dstName )
+{
+#  if BZ_UNIX
     IntNative      retVal;
-   struct MY_STAT statBuf;
     struct utimbuf uTimBuf;
  
-   retVal = MY_LSTAT ( srcName, &statBuf );
-   ERROR_IF_NOT_ZERO ( retVal );
-   uTimBuf.actime = statBuf.st_atime;
-   uTimBuf.modtime = statBuf.st_mtime;
+   uTimBuf.actime = fileMetaInfo.st_atime;
+   uTimBuf.modtime = fileMetaInfo.st_mtime;
  
-   retVal = chmod ( dstName, statBuf.st_mode );
+   retVal = chmod ( dstName, fileMetaInfo.st_mode );
     ERROR_IF_NOT_ZERO ( retVal );
  
     retVal = utime ( dstName, &uTimBuf );
     ERROR_IF_NOT_ZERO ( retVal );
  
-   retVal = chown ( dstName, statBuf.st_uid, statBuf.st_gid );
+   retVal = chown ( dstName, fileMetaInfo.st_uid, fileMetaInfo.st_gid );
     /* chown() will in many cases return with EPERM, which can
        be safely ignored.
     */
-#endif
-}
-
-
-/*---------------------------------------------*/
-static 
-void setInterimPermissions ( Char *dstName )
-{
-#if BZ_UNIX
-   IntNative      retVal;
-   retVal = chmod ( dstName, S_IRUSR | S_IWUSR );
-   ERROR_IF_NOT_ZERO ( retVal );
-#endif
+#  endif
  }
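
Note on the hunk above: the metadata is now captured with stat() before the input is opened and is applied to the output afterwards. A minimal sketch of that two-step pattern follows, assuming POSIX stat(), chmod(), utime() and chown(). The names save_meta and apply_meta are hypothetical, and masking st_mode with 07777 is an addition of this sketch.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>
#include <utime.h>

static struct stat saved_meta;   /* filled in by save_meta() */

/* Hypothetical helper: record the source's metadata before it is read. */
static int save_meta(const char *src)
{
   assert(src != NULL);
   return stat(src, &saved_meta);   /* stat, not lstat: follows symlinks */
}

/* Hypothetical helper: copy the saved times, mode and owner onto dst. */
static int apply_meta(const char *dst)
{
   struct utimbuf times;
   assert(dst != NULL);
   times.actime  = saved_meta.st_atime;
   times.modtime = saved_meta.st_mtime;
   if (chmod(dst, saved_meta.st_mode & 07777) != 0) {
      perror("chmod");
      return -1;
   }
   if (utime(dst, &times) != 0) {
      perror("utime");
      return -1;
   }
   if (chown(dst, saved_meta.st_uid, saved_meta.st_gid) != 0) {
      /* Often EPERM for unprivileged users; ignored, as in the hunk. */
   }
   return 0;
}
```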
  
  
@@ -1158,10 +1152,19 @@ void setInterimPermissions ( Char *dstName )
  static 
  Bool containsDubiousChars ( Char* name )
  {
-   Bool cdc = False;
+#  if BZ_UNIX
+   /* On unix, files can contain any characters and the file expansion
+    * is performed by the shell.
+    */
+   return False;
+#  else /* ! BZ_UNIX */
+   /* On non-unix (Win* platforms), wildcard characters are not allowed in 
+    * filenames.
+    */
     for (; *name != '\0'; name++)
-      if (*name == '?' || *name == '*') cdc = True;
-   return cdc;
+      if (*name == '?' || *name == '*') return True;
+   return False;
+#  endif /* BZ_UNIX */
  }
  
  
@@ -1201,6 +1204,7 @@ void compress ( Char *name )
     FILE  *inStr;
     FILE  *outStr;
     Int32 n, i;
+   struct MY_STAT statBuf;
  
     deleteOutputOnInterrupt = False;
  
@@ -1246,6 +1250,16 @@ void compress ( Char *name )
           return;
        }
     }
+   if ( srcMode == SM_F2F || srcMode == SM_F2O ) {
+      MY_STAT(inName, &statBuf);
+      if ( MY_S_ISDIR(statBuf.st_mode) ) {
+         fprintf( stderr,
+                  "%s: Input file %s is a directory.\n",
+                  progName,inName);
+         setExit(1);
+         return;
+      }
+   }
     if ( srcMode == SM_F2F && !forceOverwrite && notAStandardFile ( inName )) {
        if (noisy)
        fprintf ( stderr, "%s: Input file %s is not a normal file.\n",
@@ -1253,11 +1267,15 @@ void compress ( Char *name )
        setExit(1);
        return;
     }
-   if ( srcMode == SM_F2F && !forceOverwrite && fileExists ( outName ) ) {
-      fprintf ( stderr, "%s: Output file %s already exists.\n",
-                progName, outName );
-      setExit(1);
-      return;
+   if ( srcMode == SM_F2F && fileExists ( outName ) ) {
+      if (forceOverwrite) {
+        remove(outName);
+      } else {
+        fprintf ( stderr, "%s: Output file %s already exists.\n",
+                  progName, outName );
+        setExit(1);
+        return;
+      }
     }
     if ( srcMode == SM_F2F && !forceOverwrite &&
          (n=countHardLinks ( inName )) > 0) {
@@ -1267,6 +1285,12 @@ void compress ( Char *name )
        return;
     }
  
+   if ( srcMode == SM_F2F ) {
+      /* Save the file's meta-info before we open it.  Doing it later
+         means we mess up the access times. */
+      saveInputFileMetaInfo ( inName );
+   }
+
     switch ( srcMode ) {
  
        case SM_I2O:
@@ -1306,7 +1330,7 @@ void compress ( Char *name )
  
        case SM_F2F:
           inStr = fopen ( inName, "rb" );
-         outStr = fopen ( outName, "wb" );
+         outStr = fopen_output_safely ( outName, "wb" );
           if ( outStr == NULL) {
              fprintf ( stderr, "%s: Can't create output file %s: %s.\n",
                        progName, outName, strerror(errno) );
@@ -1321,7 +1345,6 @@ void compress ( Char *name )
              setExit(1);
              return;
           };
-         setInterimPermissions ( outName );
           break;
  
        default:
@@ -1343,7 +1366,7 @@ void compress ( Char *name )
  
     /*--- If there was an I/O error, we won't get here. ---*/
     if ( srcMode == SM_F2F ) {
-      copyDatePermissionsAndOwner ( inName, outName );
+      applySavedMetaInfoToOutputFile ( outName );
        deleteOutputOnInterrupt = False;
        if ( !keepInputFiles ) {
           IntNative retVal = remove ( inName );
@@ -1364,6 +1387,7 @@ void uncompress ( Char *name )
     Int32 n, i;
     Bool  magicNumberOK;
     Bool  cantGuess;
+   struct MY_STAT statBuf;
  
     deleteOutputOnInterrupt = False;
  
@@ -1405,6 +1429,16 @@ void uncompress ( Char *name )
        setExit(1);
        return;
     }
+   if ( srcMode == SM_F2F || srcMode == SM_F2O ) {
+      MY_STAT(inName, &statBuf);
+      if ( MY_S_ISDIR(statBuf.st_mode) ) {
+         fprintf( stderr,
+                  "%s: Input file %s is a directory.\n",
+                  progName,inName);
+         setExit(1);
+         return;
+      }
+   }
     if ( srcMode == SM_F2F && !forceOverwrite && notAStandardFile ( inName )) {
        if (noisy)
        fprintf ( stderr, "%s: Input file %s is not a normal file.\n",
@@ -1419,11 +1453,15 @@ void uncompress ( Char *name )
                  progName, inName, outName );
        /* just a warning, no return */
     }   
-   if ( srcMode == SM_F2F && !forceOverwrite && fileExists ( outName ) ) {
-      fprintf ( stderr, "%s: Output file %s already exists.\n",
-                progName, outName );
-      setExit(1);
-      return;
+   if ( srcMode == SM_F2F && fileExists ( outName ) ) {
+      if (forceOverwrite) {
+       remove(outName);
+      } else {
+        fprintf ( stderr, "%s: Output file %s already exists.\n",
+                  progName, outName );
+        setExit(1);
+        return;
+      }
     }
     if ( srcMode == SM_F2F && !forceOverwrite &&
          (n=countHardLinks ( inName ) ) > 0) {
@@ -1433,6 +1471,12 @@ void uncompress ( Char *name )
        return;
     }
  
+   if ( srcMode == SM_F2F ) {
+      /* Save the file's meta-info before we open it.  Doing it later
+         means we mess up the access times. */
+      saveInputFileMetaInfo ( inName );
+   }
+
     switch ( srcMode ) {
  
        case SM_I2O:
@@ -1463,7 +1507,7 @@ void uncompress ( Char *name )
  
        case SM_F2F:
           inStr = fopen ( inName, "rb" );
-         outStr = fopen ( outName, "wb" );
+         outStr = fopen_output_safely ( outName, "wb" );
           if ( outStr == NULL) {
              fprintf ( stderr, "%s: Can't create output file %s: %s.\n",
                        progName, outName, strerror(errno) );
@@ -1478,7 +1522,6 @@ void uncompress ( Char *name )
              setExit(1);
              return;
           };
-         setInterimPermissions ( outName );
           break;
  
        default:
@@ -1501,7 +1544,7 @@ void uncompress ( Char *name )
     /*--- If there was an I/O error, we won't get here. ---*/
     if ( magicNumberOK ) {
        if ( srcMode == SM_F2F ) {
-         copyDatePermissionsAndOwner ( inName, outName );
+         applySavedMetaInfoToOutputFile ( outName );
           deleteOutputOnInterrupt = False;
           if ( !keepInputFiles ) {
              IntNative retVal = remove ( inName );
@@ -1539,6 +1582,7 @@ void testf ( Char *name )
  {
     FILE *inStr;
     Bool allOK;
+   struct MY_STAT statBuf;
  
     deleteOutputOnInterrupt = False;
  
@@ -1565,6 +1609,16 @@ void testf ( Char *name )
        setExit(1);
        return;
     }
+   if ( srcMode != SM_I2O ) {
+      MY_STAT(inName, &statBuf);
+      if ( MY_S_ISDIR(statBuf.st_mode) ) {
+         fprintf( stderr,
+                  "%s: Input file %s is a directory.\n",
+                  progName,inName);
+         setExit(1);
+         return;
+      }
+   }
  
     switch ( srcMode ) {
  
@@ -1603,6 +1657,7 @@ void testf ( Char *name )
     }
  
     /*--- Now the input handle is sane.  Do the Biz. ---*/
+   outputHandleJustInCase = NULL;
     allOK = testStream ( inStr );
  
     if (allOK && verbosity >= 1) fprintf ( stderr, "ok\n" );
@@ -1619,7 +1674,7 @@ void license ( void )
      "bzip2, a block-sorting file compressor.  "
      "Version %s.\n"
      "   \n"
-    "   Copyright (C) 1996-2000 by Julian Seward.\n"
+    "   Copyright (C) 1996-2002 by Julian Seward.\n"
      "   \n"
      "   This program is free software; you can redistribute it and/or modify\n"
      "   it under the terms set out in the LICENSE file, which is included\n"
@@ -1658,6 +1713,8 @@ void usage ( Char *fullProgName )
        "   -V --version        display software version & license\n"
        "   -s --small          use less memory (at most 2500k)\n"
        "   -1 .. -9            set block size to 100k .. 900k\n"
+      "   --fast              alias for -1\n"
+      "   --best              alias for -9\n"
        "\n"
        "   If invoked as `bzip2', default action is to compress.\n"
        "              as `bunzip2',  default action is to decompress.\n"
@@ -1666,9 +1723,9 @@ void usage ( Char *fullProgName )
        "   If no file names are given, bzip2 compresses or decompresses\n"
        "   from standard input to standard output.  You can combine\n"
        "   short flags, so `-v -4' means the same as -v4 or -4v, &c.\n"
-#if BZ_UNIX
+#     if BZ_UNIX
        "\n"
-#endif
+#     endif
        ,
  
        BZ2_bzlibVersion(),
@@ -1818,11 +1875,11 @@ IntNative main ( IntNative argc, Char *argv[] )
  
     /*-- Set up signal handlers for mem access errors --*/
     signal (SIGSEGV, mySIGSEGVorSIGBUScatcher);
-#if BZ_UNIX
-#ifndef __DJGPP__
+#  if BZ_UNIX
+#  ifndef __DJGPP__
     signal (SIGBUS,  mySIGSEGVorSIGBUScatcher);
-#endif
-#endif
+#  endif
+#  endif
  
     copyFileName ( inName,  "(none)" );
     copyFileName ( outName, "(none)" );
@@ -1933,6 +1990,8 @@ IntNative main ( IntNative argc, Char *argv[] )
        if (ISFLAG("--exponential"))       workFactor = 1;             else 
        if (ISFLAG("--repetitive-best"))   redundant(aa->name);        else
        if (ISFLAG("--repetitive-fast"))   redundant(aa->name);        else
+      if (ISFLAG("--fast"))              blockSize100k = 1;          else
+      if (ISFLAG("--best"))              blockSize100k = 9;          else
        if (ISFLAG("--verbose"))           verbosity++;                else
        if (ISFLAG("--help"))              { usage ( progName ); exit ( 0 ); }
           else
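
Note on the hunk above: --fast and --best only set blockSize100k to 1 and 9 respectively, so they are plain aliases for -1 and -9. A minimal sketch of that mapping follows. The function name is hypothetical, and bzip2 itself does this inline with its ISFLAG macro.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: return the block size (in units of 100k) selected
   by a gzip-compatible alias, or 0 if the flag is not such an alias. */
static int block_size_for_flag(const char *flag)
{
   assert(flag != NULL);
   if (strcmp(flag, "--fast") == 0) return 1;   /* same as -1 */
   if (strcmp(flag, "--best") == 0) return 9;   /* same as -9 */
   return 0;
}
```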
diff --git a/bzip2.txt b/bzip2.txt

index 4f1ae8620cef98b08f859a0d7afbcc88cbd7d056..6afe3588675e8abd68bb762d783b5b313658695d 100644
--- a/bzip2.txt
+++ b/bzip2.txt
@@ -1,7 +1,6 @@
  
-
  NAME
-       bzip2, bunzip2 - a block-sorting file compressor, v1.0
+       bzip2, bunzip2 - a block-sorting file compressor, v1.0.2
         bzcat - decompresses files to stdout
         bzip2recover - recovers data from damaged bzip2 files
  
@@ -18,20 +17,20 @@ DESCRIPTION
         sorting text compression algorithm,  and  Huffman  coding.
         Compression  is  generally  considerably  better than that
         achieved by more conventional LZ77/LZ78-based compressors,
-       and  approaches  the performance of the PPM family of sta-
+       and  approaches  the performance of the PPM family of sta
         tistical compressors.
  
         The command-line options are deliberately very similar  to
         those of GNU gzip, but they are not identical.
  
-       bzip2  expects  a list of file names to accompany the com-
+       bzip2  expects  a list of file names to accompany the com
         mand-line flags.  Each file is replaced  by  a  compressed
         version  of  itself,  with  the  name "original_name.bz2".
-       Each compressed file has the same modification date,  per-
-       missions, and, when possible, ownership as the correspond-
+       Each compressed file has the same modification date,  per
+       missions, and, when possible, ownership as the correspond
         ing original, so that these properties  can  be  correctly
         restored  at  decompression  time.   File name handling is
-       naive in the sense that there is no mechanism for preserv-
+       naive in the sense that there is no mechanism for preserv
         ing  original file names, permissions, ownerships or dates
         in filesystems which lack these concepts, or have  serious
         file name length restrictions, such as MS-DOS.
@@ -62,23 +61,23 @@ DESCRIPTION
         guess the name of the original file, and uses the original
         name with .out appended.
  
-       As  with compression, supplying no filenames causes decom-
+       As  with compression, supplying no filenames causes decom
         pression from standard input to standard output.
  
-       bunzip2 will correctly decompress a file which is the con-
+       bunzip2 will correctly decompress a file which is the con
         catenation of two or more compressed files.  The result is
         the concatenation of the corresponding uncompressed files.
         Integrity testing (-t) of concatenated compressed files is
         also supported.
  
         You can also compress or decompress files to the  standard
-       output  by giving the -c flag.  Multiple files may be com-
+       output  by giving the -c flag.  Multiple files may be com
         pressed and decompressed like this.  The resulting outputs
         are  fed  sequentially to stdout.  Compression of multiple
-       files in this manner generates a stream containing  multi-
+       files in this manner generates a stream containing  multi
         ple compressed file representations.  Such a stream can be
         decompressed correctly only  by  bzip2  version  0.9.0  or
-       later.   Earlier  versions of bzip2 will stop after decom-
+       later.   Earlier  versions of bzip2 will stop after decom
         pressing the first file in the stream.
  
         bzcat (or bzip2 -dc) decompresses all specified  files  to
@@ -99,7 +98,7 @@ DESCRIPTION
  
         As a self-check for your  protection,  bzip2  uses  32-bit
         CRCs  to make sure that the decompressed version of a file
-       is identical to the original.  This guards against corrup-
+       is identical to the original.  This guards against corrup
         tion  of  the compressed data, and against undetected bugs
         in bzip2 (hopefully very unlikely).  The chances  of  data
         corruption  going  undetected  is  microscopic,  about one
@@ -127,8 +126,8 @@ OPTIONS
                and forces bzip2 to decompress.
  
         -z --compress
-              The  complement  to -d: forces compression, regard-
-              less of the invokation name.
+              The   complement   to   -d:   forces   compression,
+              regardless of the invocation name.
  
         -t --test
                Check integrity of the specified file(s), but don't
@@ -141,6 +140,11 @@ OPTIONS
                forces bzip2 to break hard links to files, which it
                otherwise wouldn't do.
  
+              bzip2  normally  declines to decompress files which
+              don't have the  correct  magic  header  bytes.   If
+              forced  (-f),  however,  it  will  pass  such files
+              through unmodified.  This is how GNU gzip  behaves.
+
         -k --keep
                Keep  (don't delete) input files during compression
                or decompression.
@@ -167,7 +171,7 @@ OPTIONS
  
         -v --verbose
                Verbose mode -- show the compression ratio for each
-              file  processed.   Further  -v's  increase the ver-
+              file  processed.   Further  -v's  increase the ver
                bosity level, spewing out lots of information which
                is primarily of interest for diagnostic purposes.
  
@@ -175,20 +179,24 @@ OPTIONS
                Display  the  software  version,  license terms and
                conditions.
  
-       -1 to -9
+       -1 (or --fast) to -9 (or --best)
                Set the block size to 100 k, 200 k ..  900  k  when
                compressing.   Has  no  effect  when decompressing.
-              See MEMORY MANAGEMENT below.
+              See MEMORY MANAGEMENT below.  The --fast and --best
+              aliases  are  primarily for GNU gzip compatibility.
+              In particular, --fast doesn't make things  signifi
+              cantly  faster.   And  --best  merely  selects  the
+              default behaviour.
  
         --     Treats all subsequent arguments as file names, even
-              if they start with a dash.  This is so you can han-
+              if they start with a dash.  This is so you can han
                dle files with names beginning  with  a  dash,  for
                example: bzip2 -- -myfilename.
  
         --repetitive-fast --repetitive-best
                These  flags  are  redundant  in versions 0.9.5 and
                above.  They provided some coarse control over  the
-              behaviour  of the sorting algorithm in earlier ver-
+              behaviour  of the sorting algorithm in earlier ver
                sions, which was sometimes useful.  0.9.5 and above
                have  an  improved  algorithm  which  renders these
                flags irrelevant.
@@ -199,7 +207,7 @@ MEMORY MANAGEMENT
         affects  both  the  compression  ratio  achieved,  and the
         amount of memory needed for compression and decompression.
         The  flags  -1  through  -9  specify  the block size to be
-       100,000 bytes through 900,000 bytes (the default)  respec-
+       100,000 bytes through 900,000 bytes (the default)  respec
         tively.   At  decompression  time, the block size used for
         compression is read from  the  header  of  the  compressed
         file, and bunzip2 then allocates itself just enough memory
@@ -227,13 +235,13 @@ MEMORY MANAGEMENT
         bunzip2 will require about 3700 kbytes to decompress.   To
         support decompression of any file on a 4 megabyte machine,
         bunzip2 has an option to  decompress  using  approximately
-       half this amount of memory, about 2300 kbytes.  Decompres-
+       half this amount of memory, about 2300 kbytes.  Decompres
         sion speed is also halved, so you should use  this  option
         only where necessary.  The relevant flag is -s.
  
-       In general, try and use the largest block size memory con-
+       In general, try and use the largest block size memory con
         straints  allow,  since  that  maximises  the  compression
-       achieved.   Compression and decompression speed are virtu-
+       achieved.   Compression and decompression speed are virtu
         ally unaffected by block size.
  
         Another significant point applies to files which fit in  a
@@ -249,11 +257,11 @@ MEMORY MANAGEMENT
  
         Here  is a table which summarises the maximum memory usage
         for different block sizes.  Also  recorded  is  the  total
-       compressed  size for 14 files of the Calgary Text Compres-
+       compressed  size for 14 files of the Calgary Text Compres
         sion Corpus totalling 3,141,622 bytes.  This column  gives
         some  feel  for  how  compression  varies with block size.
         These figures tend to understate the advantage  of  larger
-       block  sizes  for  larger files, since the Corpus is domi-
+       block  sizes  for  larger files, since the Corpus is domi
         nated by smaller files.
  
                    Compress   Decompress   Decompress   Corpus
@@ -272,7 +280,7 @@ MEMORY MANAGEMENT
  
  RECOVERING DATA FROM DAMAGED FILES
         bzip2 compresses files in blocks, usually 900kbytes  long.
-       Each block is handled independently.  If a media or trans-
+       Each block is handled independently.  If a media or trans
         mission error causes a multi-block  .bz2  file  to  become
         damaged,  it  may  be  possible  to  recover data from the
         undamaged blocks in the file.
@@ -289,19 +297,19 @@ RECOVERING DATA FROM DAMAGED FILES
         the integrity of the resulting files, and decompress those
         which are undamaged.
  
-       bzip2recover takes a single argument, the name of the dam-
-       aged file, and writes a number of files "rec0001file.bz2",
-       "rec0002file.bz2", etc, containing the  extracted  blocks.
-       The  output  filenames  are  designed  so  that the use of
-       wildcards in subsequent processing -- for example,  "bzip2
-       -dc   rec*file.bz2 > recovered_data" -- lists the files in
-       the correct order.
+       bzip2recover takes a single argument, the name of the dam
+       aged    file,    and    writes    a    number   of   files
+       "rec00001file.bz2",  "rec00002file.bz2",  etc,  containing
+       the   extracted   blocks.   The   output   filenames   are
+       designed  so  that the use of wildcards in subsequent pro
+       cessing  -- for example, "bzip2 -dc  rec*file.bz2 > recov
+       ered_data" -- processes the files in the correct order.
  
         bzip2recover should be of most use dealing with large .bz2
         files,  as  these will contain many blocks.  It is clearly
         futile to use it on damaged single-block  files,  since  a
-       damaged  block  cannot  be recovered.  If you wish to min-
-       imise any potential data loss through media  or  transmis-
+       damaged  block  cannot  be recovered.  If you wish to min
+       imise any potential data loss through media  or  transmis
         sion errors, you might consider compressing with a smaller
         block size.
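
Note on the hunk above: the recovered-block file names gain a fifth digit ("rec00001file.bz2"). Zero-padding to a fixed width is what makes a shell wildcard list the pieces in numeric order. The sketch below illustrates this, assuming a "%05d" format. bzip2recover's actual formatting code is not shown in this section, and both function names are hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: build "recNNNNN<base>" with a 5-digit block number. */
static int recovered_name(char *buf, size_t len, int block_no,
                          const char *base)
{
   assert(buf != NULL && base != NULL && block_no > 0);
   return snprintf(buf, len, "rec%05d%s", block_no, base);
}

/* Nonzero if a sorts strictly before b under byte-wise comparison,
   which is how a glob orders names in the C locale. */
static int sorts_before(const char *a, const char *b)
{
   return strcmp(a, b) < 0;
}
```

With padding, block 2 sorts before block 10. Without it, "rec10file.bz2" would sort before "rec2file.bz2" and the blocks would be concatenated out of order.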
  
@@ -315,19 +323,19 @@ PERFORMANCE NOTES
         better  than previous versions in this respect.  The ratio
         between worst-case and average-case compression time is in
         the  region  of  10:1.  For previous versions, this figure
-       was more like 100:1.  You can use the -vvvv option to mon-
+       was more like 100:1.  You can use the -vvvv option to mon
         itor progress in great detail, if you want.
  
         Decompression speed is unaffected by these phenomena.
  
         bzip2  usually  allocates  several  megabytes of memory to
-       operate in, and then charges all over it in a fairly  ran-
-       dom  fashion.   This means that performance, both for com-
+       operate in, and then charges all over it in a fairly  ran
+       dom  fashion.   This means that performance, both for com
         pressing and decompressing, is largely determined  by  the
         speed  at  which  your  machine  can service cache misses.
         Because of this, small changes to the code to  reduce  the
         miss  rate  have  been observed to give disproportionately
-       large performance improvements.  I imagine bzip2 will per-
+       large performance improvements.  I imagine bzip2 will per
         form best on machines with very large caches.
  
  
@@ -337,40 +345,46 @@ CAVEATS
         but  the  details  of  what  the problem is sometimes seem
         rather misleading.
  
-       This manual page pertains to version 1.0 of bzip2.  Com-
+       This manual page pertains to version 1.0.2 of bzip2.  Com
         pressed  data created by this version is entirely forwards
         and  backwards  compatible  with   the   previous   public
-       releases,  versions  0.1pl2, 0.9.0 and 0.9.5, but with the
-       following exception: 0.9.0 and above can correctly  decom-
-       press multiple concatenated compressed files.  0.1pl2 can-
-       not do this; it will stop  after  decompressing  just  the
-       first file in the stream.
-
-       bzip2recover  uses  32-bit integers to represent bit posi-
-       tions in compressed files, so it cannot handle  compressed
-       files  more than 512 megabytes long.  This could easily be
-       fixed.
+       releases,  versions 0.1pl2, 0.9.0, 0.9.5, 1.0.0 and 1.0.1,
+       but with the following exception: 0.9.0 and above can cor-
+       rectly  decompress multiple concatenated compressed files.
+       0.1pl2 cannot do this; it will  stop  after  decompressing
+       just the first file in the stream.
+
+       bzip2recover  versions  prior  to  this  one,  1.0.2, used
+       32-bit integers to represent bit positions  in  compressed
+       files,  so  it could not handle compressed files more than
+       512 megabytes long.  Version 1.0.2 and above  uses  64-bit
+       ints  on  some platforms which support them (GNU supported
+       targets,  and  Windows).   To  establish  whether  or  not
+       bzip2recover  was  built  with  such  a limitation, run it
+       without arguments.  In any event you can build yourself an
+       unlimited version if you can recompile it with MaybeUInt64
+       set to be an unsigned 64-bit integer.
  
  
  AUTHOR
         Julian Seward, jseward@acm.org.
  
-       http://sourceware.cygnus.com/bzip2
-       http://www.muraroa.demon.co.uk
+       http://sources.redhat.com/bzip2
  
-       The ideas embodied in bzip2 are due to (at least) the fol-
+       The ideas embodied in bzip2 are due to (at least) the fol-
         lowing  people: Michael Burrows and David Wheeler (for the
         block sorting transformation), David Wheeler  (again,  for
-       the Huffman coder), Peter Fenwick (for the structured cod-
+       the Huffman coder), Peter Fenwick (for the structured cod-
         ing model in the original bzip, and many refinements), and
         Alistair  Moffat,  Radford  Neal  and  Ian Witten (for the
         arithmetic  coder  in  the  original  bzip).   I  am  much
-       indebted for their help, support and advice.  See the man-
+       indebted for their help, support and advice.  See the man-
         ual in the source distribution for pointers to sources  of
         documentation.  Christian von Roques encouraged me to look
-       for faster sorting algorithms, so as to speed up  compres-
+       for faster sorting algorithms, so as to speed up  compres-
         sion.  Bela Lubkin encouraged me to improve the worst-case
-       compression performance.  Many people sent patches, helped
-       with  portability problems, lent machines, gave advice and
-       were generally helpful.
+       compression performance.  The bz* scripts are derived from
+       those  of GNU gzip.  Many people sent patches, helped with
+       portability problems, lent machines, gave advice and  were
+       generally helpful.
  
diff --git a/bzip2recover.c b/bzip2recover.c
index ba3d175632a77de125e5d2c2a346f1efb1cc27c2..286873b8c07ba8e055950f1245406317a3c381dd 100644
--- a/bzip2recover.c
+++ b/bzip2recover.c
@@ -9,7 +9,7 @@
    salvage from damaged files created by the accompanying
    bzip2-1.0 program.
  
-  Copyright (C) 1996-2000 Julian R Seward.  All rights reserved.
+  Copyright (C) 1996-2002 Julian R Seward.  All rights reserved.
  
    Redistribution and use in source and binary forms, with or without
    modification, are permitted provided that the following conditions
@@ -57,6 +57,29 @@
  #include <stdlib.h>
  #include <string.h>
  
+
+/* This program records bit locations in the file to be recovered.
+   That means that if 64-bit ints are not supported, we will not
+   be able to recover .bz2 files over 512MB (2^32 bits) long.
+   On GNU supported platforms, we take advantage of the 64-bit
+   int support to circumvent this problem.  Ditto MSVC.
+
+   This change occurred in version 1.0.2; all prior versions have
+   the 512MB limitation.
+*/
+#ifdef __GNUC__
+   typedef  unsigned long long int  MaybeUInt64;
+#  define MaybeUInt64_FMT "%Lu"
+#else
+#ifdef _MSC_VER
+   typedef  unsigned __int64  MaybeUInt64;
+#  define MaybeUInt64_FMT "%I64u"
+#else
+   typedef  unsigned int   MaybeUInt64;
+#  define MaybeUInt64_FMT "%u"
+#endif
+#endif
+
  typedef  unsigned int   UInt32;
  typedef  int            Int32;
  typedef  unsigned char  UChar;
@@ -66,13 +89,25 @@ typedef  unsigned char  Bool;
  #define False   ((Bool)0)
  
  
-Char inFileName[2000];
-Char outFileName[2000];
-Char progName[2000];
+#define BZ_MAX_FILENAME 2000
+
+Char inFileName[BZ_MAX_FILENAME];
+Char outFileName[BZ_MAX_FILENAME];
+Char progName[BZ_MAX_FILENAME];
+
+MaybeUInt64 bytesOut = 0;
+MaybeUInt64 bytesIn  = 0;
  
-UInt32 bytesOut = 0;
-UInt32 bytesIn  = 0;
  
+/*---------------------------------------------------*/
+/*--- Header bytes                                ---*/
+/*---------------------------------------------------*/
+
+#define BZ_HDR_B 0x42                         /* 'B' */
+#define BZ_HDR_Z 0x5a                         /* 'Z' */
+#define BZ_HDR_h 0x68                         /* 'h' */
+#define BZ_HDR_0 0x30                         /* '0' */
+ 
  
  /*---------------------------------------------------*/
  /*--- I/O errors                                  ---*/
@@ -116,6 +151,23 @@ void mallocFail ( Int32 n )
  }
  
  
+/*---------------------------------------------*/
+void tooManyBlocks ( Int32 max_handled_blocks )
+{
+   fprintf ( stderr,
+             "%s: `%s' appears to contain more than %d blocks\n",
+            progName, inFileName, max_handled_blocks );
+   fprintf ( stderr,
+             "%s: and cannot be handled.  To fix, increase\n",
+             progName );
+   fprintf ( stderr, 
+             "%s: BZ_MAX_HANDLED_BLOCKS in bzip2recover.c, and recompile.\n",
+             progName );
+   exit ( 1 );
+}
+
+
+
  /*---------------------------------------------------*/
  /*--- Bit stream I/O                              ---*/
  /*---------------------------------------------------*/
@@ -254,27 +306,37 @@ Bool endsInBz2 ( Char* name )
  /*---                                             ---*/
  /*---------------------------------------------------*/
  
+/* This logic isn't really right when it comes to Cygwin. */
+#ifdef _WIN32
+#  define  BZ_SPLIT_SYM  '\\'  /* path splitter on Windows platform */
+#else
+#  define  BZ_SPLIT_SYM  '/'   /* path splitter on Unix platform */
+#endif
+
  #define BLOCK_HEADER_HI  0x00003141UL
  #define BLOCK_HEADER_LO  0x59265359UL
  
  #define BLOCK_ENDMARK_HI 0x00001772UL
  #define BLOCK_ENDMARK_LO 0x45385090UL
  
+/* Increase if necessary.  However, a .bz2 file with > 50000 blocks
+   would have an uncompressed size of at least 40GB, so the chances
+   are low you'll need to up this.
+*/
+#define BZ_MAX_HANDLED_BLOCKS 50000
  
-UInt32 bStart[20000];
-UInt32 bEnd[20000];
-UInt32 rbStart[20000];
-UInt32 rbEnd[20000];
+MaybeUInt64 bStart [BZ_MAX_HANDLED_BLOCKS];
+MaybeUInt64 bEnd   [BZ_MAX_HANDLED_BLOCKS];
+MaybeUInt64 rbStart[BZ_MAX_HANDLED_BLOCKS];
+MaybeUInt64 rbEnd  [BZ_MAX_HANDLED_BLOCKS];
  
  Int32 main ( Int32 argc, Char** argv )
  {
     FILE*       inFile;
     FILE*       outFile;
     BitStream*  bsIn, *bsWr;
-   Int32       currBlock, b, wrBlock;
-   UInt32      bitsRead;
-   Int32       rbCtr;
-
+   Int32       b, wrBlock, currBlock, rbCtr;
+   MaybeUInt64 bitsRead;
  
     UInt32      buffHi, buffLo, blockCRC;
     Char*       p;
@@ -282,11 +344,37 @@ Int32 main ( Int32 argc, Char** argv )
     strcpy ( progName, argv[0] );
     inFileName[0] = outFileName[0] = 0;
  
-   fprintf ( stderr, "bzip2recover 1.0: extracts blocks from damaged .bz2 files.\n" );
+   fprintf ( stderr, 
+             "bzip2recover 1.0.2: extracts blocks from damaged .bz2 files.\n" );
  
     if (argc != 2) {
        fprintf ( stderr, "%s: usage is `%s damaged_file_name'.\n",
                          progName, progName );
+      switch (sizeof(MaybeUInt64)) {
+         case 8:
+            fprintf(stderr, 
+                    "\trestrictions on size of recovered file: None\n");
+            break;
+         case 4:
+            fprintf(stderr, 
+                    "\trestrictions on size of recovered file: 512 MB\n");
+            fprintf(stderr, 
+                    "\tto circumvent, recompile with MaybeUInt64 as an\n"
+                    "\tunsigned 64-bit int.\n");
+            break;
+         default:
+            fprintf(stderr, 
+                    "\tsizeof(MaybeUInt64) is not 4 or 8 -- "
+                    "configuration error.\n");
+            break;
+      }
+      exit(1);
+   }
+
+   if (strlen(argv[1]) >= BZ_MAX_FILENAME-20) {
+      fprintf ( stderr, 
+                "%s: supplied filename is suspiciously (>= %d chars) long.  Bye!\n",
+                progName, (int)strlen(argv[1]) );
        exit(1);
     }
  
@@ -316,7 +404,8 @@ Int32 main ( Int32 argc, Char** argv )
              (bitsRead - bStart[currBlock]) >= 40) {
              bEnd[currBlock] = bitsRead-1;
              if (currBlock > 0)
-               fprintf ( stderr, "   block %d runs from %d to %d (incomplete)\n",
+               fprintf ( stderr, "   block %d runs from " MaybeUInt64_FMT 
+                                 " to " MaybeUInt64_FMT " (incomplete)\n",
                           currBlock,  bStart[currBlock], bEnd[currBlock] );
           } else
              currBlock--;
@@ -330,17 +419,22 @@ Int32 main ( Int32 argc, Char** argv )
             ( (buffHi & 0x0000ffff) == BLOCK_ENDMARK_HI 
               && buffLo == BLOCK_ENDMARK_LO)
           ) {
-         if (bitsRead > 49)
-            bEnd[currBlock] = bitsRead-49; else
+         if (bitsRead > 49) {
+            bEnd[currBlock] = bitsRead-49;
+         } else {
              bEnd[currBlock] = 0;
+         }
           if (currBlock > 0 &&
              (bEnd[currBlock] - bStart[currBlock]) >= 130) {
-            fprintf ( stderr, "   block %d runs from %d to %d\n",
+            fprintf ( stderr, "   block %d runs from " MaybeUInt64_FMT 
+                              " to " MaybeUInt64_FMT "\n",
                        rbCtr+1,  bStart[currBlock], bEnd[currBlock] );
              rbStart[rbCtr] = bStart[currBlock];
              rbEnd[rbCtr] = bEnd[currBlock];
              rbCtr++;
           }
+         if (currBlock >= BZ_MAX_HANDLED_BLOCKS)
+            tooManyBlocks(BZ_MAX_HANDLED_BLOCKS);
           currBlock++;
  
           bStart[currBlock] = bitsRead;
@@ -400,10 +494,25 @@ Int32 main ( Int32 argc, Char** argv )
           wrBlock++;
        } else
        if (bitsRead == rbStart[wrBlock]) {
-         outFileName[0] = 0;
-         sprintf ( outFileName, "rec%4d", wrBlock+1 );
-         for (p = outFileName; *p != 0; p++) if (*p == ' ') *p = '0';
-         strcat ( outFileName, inFileName );
+         /* Create the output file name, correctly handling leading paths. 
+            (31.10.2001 by Sergey E. Kusikov) */
+         Char* split;
+         Int32 ofs, k;
+         for (k = 0; k < BZ_MAX_FILENAME; k++) 
+            outFileName[k] = 0;
+         strcpy (outFileName, inFileName);
+         split = strrchr (outFileName, BZ_SPLIT_SYM);
+         if (split == NULL) {
+            split = outFileName;
+         } else {
+            ++split;
+         }
+         /* Now split points to the start of the basename. */
+         ofs  = split - outFileName;
+         sprintf (split, "rec%5d", wrBlock+1);
+         for (p = split; *p != 0; p++) if (*p == ' ') *p = '0';
+         strcat (outFileName, inFileName + ofs);
+
           if ( !endsInBz2(outFileName)) strcat ( outFileName, ".bz2" );
  
           fprintf ( stderr, "   writing block %d to `%s' ...\n",
@@ -416,8 +525,10 @@ Int32 main ( Int32 argc, Char** argv )
              exit(1);
           }
           bsWr = bsOpenWriteStream ( outFile );
-         bsPutUChar ( bsWr, 'B' ); bsPutUChar ( bsWr, 'Z' );
-         bsPutUChar ( bsWr, 'h' ); bsPutUChar ( bsWr, '9' );
+         bsPutUChar ( bsWr, BZ_HDR_B );    
+         bsPutUChar ( bsWr, BZ_HDR_Z );    
+         bsPutUChar ( bsWr, BZ_HDR_h );    
+         bsPutUChar ( bsWr, BZ_HDR_0 + 9 );
           bsPutUChar ( bsWr, 0x31 ); bsPutUChar ( bsWr, 0x41 );
           bsPutUChar ( bsWr, 0x59 ); bsPutUChar ( bsWr, 0x26 );
           bsPutUChar ( bsWr, 0x53 ); bsPutUChar ( bsWr, 0x59 );
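
Reviewer note on the output-name hunk above (@@ -400,10 +494,25 @@): the patch
now inserts the "recNNNNN" prefix in front of the basename rather than in
front of the whole path, so recovered blocks land next to the damaged file.
The following is a minimal sketch of that naming rule, not the patch's code.
The helper name make_rec_name and the constant-free signature are
hypothetical, and it swaps the sprintf/strcat sequence and the
space-to-zero loop for a single bounded snprintf with "%05d".

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not part of bzip2recover.c).
   Writes "<dir>rec<NNNNN><basename>" into out, appending ".bz2" when the
   basename does not already end in it.  splitSym is the path separator,
   '/' on Unix or '\\' on Windows, as with BZ_SPLIT_SYM in the patch.
   Returns 0 on success, -1 if the name does not fit in outSize bytes. */
static int make_rec_name(char *out, size_t outSize,
                         const char *inName, int blockNo, char splitSym)
{
   const char *base = strrchr(inName, splitSym);
   size_t dirLen, baseLen;
   int hasSuffix, n;

   /* base points at the start of the basename, as "split" does above. */
   base = (base == NULL) ? inName : base + 1;
   dirLen = (size_t)(base - inName);
   baseLen = strlen(base);
   hasSuffix = baseLen >= 4 && strcmp(base + baseLen - 4, ".bz2") == 0;

   /* "%05d" zero-pads to five digits, equivalent to "%5d" followed by
      replacing spaces with '0' for non-negative block numbers. */
   n = snprintf(out, outSize, "%.*srec%05d%s%s",
                (int)dirLen, inName, blockNo, base,
                hasSuffix ? "" : ".bz2");
   return (n < 0 || (size_t)n >= outSize) ? -1 : 0;
}
```

Bounding the write also covers the case the new strlen(argv[1]) check guards
against, at the cost of requiring a C99 snprintf.
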
diff --git a/bzlib.c b/bzlib.c
index 4a06d9f14b919955fb0c5edf994fc430d8884435..7d1cb275f5b90b7ddaa93d37c736b5abfd674766 100644
--- a/bzlib.c
+++ b/bzlib.c
@@ -8,7 +8,7 @@
    This file is a part of bzip2 and/or libbzip2, a program and
    library for lossless, block-sorting data compression.
  
-  Copyright (C) 1996-2000 Julian R Seward.  All rights reserved.
+  Copyright (C) 1996-2002 Julian R Seward.  All rights reserved.
  
    Redistribution and use in source and binary forms, with or without
    modification, are permitted provided that the following conditions
@@ -93,10 +93,39 @@ void BZ2_bz__AssertH__fail ( int errcode )
        "component, you should also report this bug to the author(s)\n"
        "of that program.  Please make an effort to report this bug;\n"
        "timely and accurate bug reports eventually lead to higher\n"
-      "quality software.  Thanks.  Julian Seward, 21 March 2000.\n\n",
+      "quality software.  Thanks.  Julian Seward, 30 December 2001.\n\n",
        errcode,
        BZ2_bzlibVersion()
     );
+
+   if (errcode == 1007) {
+   fprintf(stderr,
+      "\n*** A special note about internal error number 1007 ***\n"
+      "\n"
+      "Experience suggests that a common cause of internal error 1007\n"
+      "is unreliable memory or other hardware.  The 1007 assertion\n"
+      "just happens to cross-check the results of huge numbers of\n"
+      "memory reads/writes, and so acts (unintendedly) as a stress\n"
+      "test of your memory system.\n"
+      "\n"
+      "I suggest the following: try compressing the file again,\n"
+      "possibly monitoring progress in detail with the -vv flag.\n"
+      "\n"
+      "* If the error cannot be reproduced, and/or happens at different\n"
+      "  points in compression, you may have a flaky memory system.\n"
+      "  Try a memory-test program.  I have used Memtest86\n"
+      "  (www.memtest86.com).  At the time of writing it is free (GPLd).\n"
+      "  Memtest86 tests memory much more thoroughly than your BIOS's\n"
+      "  power-on test, and may find failures that the BIOS doesn't.\n"
+      "\n"
+      "* If the error can be repeatably reproduced, this is a bug in\n"
+      "  bzip2, and I would very much like to hear about it.  Please\n"
+      "  let me know, and, ideally, save a copy of the file causing the\n"
+      "  problem -- without which I will be unable to investigate it.\n"
+      "\n"
+   );
+   }
+
     exit(3);
  }
  #endif
@@ -1402,7 +1431,7 @@ BZFILE * bzopen_or_bzdopen
           smallMode = 1; break;
        default:
           if (isdigit((int)(*mode))) {
-            blockSize100k = *mode-'0';
+            blockSize100k = *mode-BZ_HDR_0;
           }
        }
        mode++;
diff --git a/bzlib.h b/bzlib.h
index c9447a2958767a1c2e9efbbb1c7c99542fdf87ba..9ac43a169da5ee0d09860f7fdae936faea019e85 100644
--- a/bzlib.h
+++ b/bzlib.h
@@ -8,7 +8,7 @@
    This file is a part of bzip2 and/or libbzip2, a program and
    library for lossless, block-sorting data compression.
  
-  Copyright (C) 1996-2000 Julian R Seward.  All rights reserved.
+  Copyright (C) 1996-2002 Julian R Seward.  All rights reserved.
  
    Redistribution and use in source and binary forms, with or without
    modification, are permitted provided that the following conditions
@@ -110,8 +110,10 @@ typedef
  #define BZ_EXPORT
  #endif
  
+/* Need a definition for FILE */
+#include <stdio.h>
+
  #ifdef _WIN32
-#   include <stdio.h>
  #   include <windows.h>
  #   ifdef small
        /* windows.h define small to char */
diff --git a/bzlib_private.h b/bzlib_private.h
index fb51c7a1d4b42dccc85a73591fd6c60a6f55c540..ff973c3bfd0314eeba4d2dd5f620ab880144f90f 100644
--- a/bzlib_private.h
+++ b/bzlib_private.h
@@ -8,7 +8,7 @@
    This file is a part of bzip2 and/or libbzip2, a program and
    library for lossless, block-sorting data compression.
  
-  Copyright (C) 1996-2000 Julian R Seward.  All rights reserved.
+  Copyright (C) 1996-2002 Julian R Seward.  All rights reserved.
  
    Redistribution and use in source and binary forms, with or without
    modification, are permitted provided that the following conditions
@@ -76,7 +76,7 @@
  
  /*-- General stuff. --*/
  
-#define BZ_VERSION  "1.0.1, 23-June-2000"
+#define BZ_VERSION  "1.0.2, 30-Dec-2001"
  
  typedef char            Char;
  typedef unsigned char   Bool;
@@ -137,6 +137,13 @@ extern void bz_internal_error ( int errcode );
  #define BZFREE(ppp)  (strm->bzfree)(strm->opaque,(ppp))
  
  
+/*-- Header bytes. --*/
+
+#define BZ_HDR_B 0x42   /* 'B' */
+#define BZ_HDR_Z 0x5a   /* 'Z' */
+#define BZ_HDR_h 0x68   /* 'h' */
+#define BZ_HDR_0 0x30   /* '0' */
+  
  /*-- Constants for the back end. --*/
  
  #define BZ_MAX_ALPHA_SIZE 258
diff --git a/bzmore b/bzmore
new file mode 100644
index 0000000..d314043
--- /dev/null
+++ b/bzmore
@@ -0,0 +1,61 @@
+#!/bin/sh
+
+# Bzmore wrapped for bzip2, 
+# adapted from zmore by Philippe Troin <phil@fifi.org> for Debian GNU/Linux.
+
+PATH="/usr/bin:$PATH"; export PATH
+
+prog=`echo $0 | sed 's|.*/||'`
+case "$prog" in
+       *less)  more=less       ;;
+       *)      more=more       ;;
+esac
+
+if test "`echo -n a`" = "-n a"; then
+  # looks like a SysV system:
+  n1=''; n2='\c'
+else
+  n1='-n'; n2=''
+fi
+oldtty=`stty -g 2>/dev/null`
+if stty -cbreak 2>/dev/null; then
+  cb='cbreak'; ncb='-cbreak'
+else
+  # 'stty min 1' resets eof to ^a on both SunOS and SysV!
+  cb='min 1 -icanon'; ncb='icanon eof ^d'
+fi
+if test $? -eq 0 -a -n "$oldtty"; then
+   trap 'stty $oldtty 2>/dev/null; exit' 0 2 3 5 10 13 15
+else
+   trap 'stty $ncb echo 2>/dev/null; exit' 0 2 3 5 10 13 15
+fi
+
+if test $# = 0; then
+    if test -t 0; then
+       echo usage: $prog files...
+    else
+       bzip2 -cdfq | eval $more
+    fi
+else
+    FIRST=1
+    for FILE
+    do
+       if test $FIRST -eq 0; then
+               echo $n1 "--More--(Next file: $FILE)$n2"
+               stty $cb -echo 2>/dev/null
+               ANS=`dd bs=1 count=1 2>/dev/null` 
+               stty $ncb echo 2>/dev/null
+               echo " "
+               if test "$ANS" = 'e' -o "$ANS" = 'q'; then
+                       exit
+               fi
+       fi
+       if test "$ANS" != 's'; then
+               echo "------> $FILE <------"
+               bzip2 -cdfq "$FILE" | eval $more
+       fi
+       if test -t 1; then
+               FIRST=0
+       fi
+    done
+fi
diff --git a/bzmore.1 b/bzmore.1
new file mode 100644
index 0000000..b437d3b
--- /dev/null
+++ b/bzmore.1
@@ -0,0 +1,152 @@
+.\"Shamelessly copied from zmore.1 by Philippe Troin <phil@fifi.org>
+.\"for Debian GNU/Linux
+.TH BZMORE 1
+.SH NAME
+bzmore, bzless \- file perusal filter for crt viewing of bzip2 compressed text
+.SH SYNOPSIS
+.B bzmore
+[ name ...  ]
+.br
+.B bzless
+[ name ...  ]
+.SH NOTE
+In the following description,
+.I bzless
+and
+.I less
+can be used interchangeably with
+.I bzmore
+and
+.I more.
+.SH DESCRIPTION
+.I  Bzmore
+is a filter which allows examination of compressed or plain text files
+one screenful at a time on a soft-copy terminal.
+.I bzmore
+works on files compressed with
+.I bzip2
+and also on uncompressed files.
+If a file does not exist,
+.I bzmore
+looks for a file of the same name with the addition of a .bz2 suffix.
+.PP
+.I Bzmore
+normally pauses after each screenful, printing --More--
+at the bottom of the screen.
+If the user then types a carriage return, one more line is displayed.
+If the user hits a space,
+another screenful is displayed.  Other possibilities are enumerated later.
+.PP
+.I Bzmore
+looks in the file
+.I /etc/termcap
+to determine terminal characteristics,
+and to determine the default window size.
+On a terminal capable of displaying 24 lines,
+the default window size is 22 lines.
+Other sequences which may be typed when
+.I bzmore
+pauses, and their effects, are as follows (\fIi\fP is an optional integer
+argument, defaulting to 1) :
+.PP
+.IP \fIi\|\fP<space>
+display
+.I i
+more lines, (or another screenful if no argument is given)
+.PP
+.IP ^D
+display 11 more lines (a ``scroll'').
+If
+.I i
+is given, then the scroll size is set to \fIi\|\fP.
+.PP
+.IP d
+same as ^D (control-D)
+.PP
+.IP \fIi\|\fPz
+same as typing a space except that \fIi\|\fP, if present, becomes the new
+window size.  Note that the window size reverts back to the default at the
+end of the current file.
+.PP
+.IP \fIi\|\fPs
+skip \fIi\|\fP lines and print a screenful of lines
+.PP
+.IP \fIi\|\fPf
+skip \fIi\fP screenfuls and print a screenful of lines
+.PP
+.IP "q or Q"
+quit reading the current file; go on to the next (if any)
+.PP
+.IP "e or q"
+When the prompt --More--(Next file: 
+.IR file )
+is printed, this command causes bzmore to exit.
+.PP
+.IP s
+When the prompt --More--(Next file: 
+.IR file )
+is printed, this command causes bzmore to skip the next file and continue.
+.PP 
+.IP =
+Display the current line number.
+.PP
+.IP \fIi\|\fP/expr
+search for the \fIi\|\fP-th occurrence of the regular expression \fIexpr.\fP
+If the pattern is not found,
+.I bzmore
+goes on to the next file (if any).
+Otherwise, a screenful is displayed, starting two lines before the place
+where the expression was found.
+The user's erase and kill characters may be used to edit the regular
+expression.
+Erasing back past the first column cancels the search command.
+.PP
+.IP \fIi\|\fPn
+search for the \fIi\|\fP-th occurrence of the last regular expression entered.
+.PP
+.IP !command
+invoke a shell with \fIcommand\|\fP. 
+The character `!' in "command" is replaced with the
+previous shell command.  The sequence "\\!" is replaced by "!".
+.PP
+.IP ":q or :Q"
+quit reading the current file; go on to the next (if any)
+(same as q or Q).
+.PP
+.IP .
+(dot) repeat the previous command.
+.PP
+The commands take effect immediately, i.e., it is not necessary to
+type a carriage return.
+Up to the time when the command character itself is given,
+the user may hit the line kill character to cancel the numerical
+argument being formed.
+In addition, the user may hit the erase character to redisplay the
+--More-- message.
+.PP
+At any time when output is being sent to the terminal, the user can
+hit the quit key (normally control\-\\).
+.I Bzmore
+will stop sending output, and will display the usual --More--
+prompt.
+The user may then enter one of the above commands in the normal manner.
+Unfortunately, some output is lost when this is done, due to the
+fact that any characters waiting in the terminal's output queue
+are flushed when the quit signal occurs.
+.PP
+The terminal is set to
+.I noecho
+mode by this program so that the output can be continuous.
+What you type will thus not show on your terminal, except for the / and !
+commands.
+.PP
+If the standard output is not a teletype, then
+.I bzmore
+acts just like
+.I bzcat,
+except that a header is printed before each file.
+.SH FILES
+.DT
+/etc/termcap           Terminal data base
+.SH "SEE ALSO"
+more(1), less(1), bzip2(1), bzdiff(1), bzgrep(1)
diff --git a/compress.c b/compress.c
index cc5e31d6f0ec654641e038277d5d89581f1f0cbf..56501c1155335fcdfa8d151a9454155c4f98306b 100644
--- a/compress.c
+++ b/compress.c
@@ -8,7 +8,7 @@
    This file is a part of bzip2 and/or libbzip2, a program and
    library for lossless, block-sorting data compression.
  
-  Copyright (C) 1996-2000 Julian R Seward.  All rights reserved.
+  Copyright (C) 1996-2002 Julian R Seward.  All rights reserved.
  
    Redistribution and use in source and binary forms, with or without
    modification, are permitted provided that the following conditions
@@ -663,10 +663,10 @@ void BZ2_compressBlock ( EState* s, Bool is_last_block )
     /*-- If this is the first block, create the stream header. --*/
     if (s->blockNo == 1) {
        BZ2_bsInitWrite ( s );
-      bsPutUChar ( s, 'B' );
-      bsPutUChar ( s, 'Z' );
-      bsPutUChar ( s, 'h' );
-      bsPutUChar ( s, (UChar)('0' + s->blockSize100k) );
+      bsPutUChar ( s, BZ_HDR_B );
+      bsPutUChar ( s, BZ_HDR_Z );
+      bsPutUChar ( s, BZ_HDR_h );
+      bsPutUChar ( s, (UChar)(BZ_HDR_0 + s->blockSize100k) );
     }
  
     if (s->nblock > 0) {
diff --git a/crctable.c b/crctable.c
index 61c040c4fcfc9c337c39a2e1e0cda0e3dbd8776e..b16746ae19423480b225756973542d531957ab3d 100644
--- a/crctable.c
+++ b/crctable.c
@@ -8,7 +8,7 @@
    This file is a part of bzip2 and/or libbzip2, a program and
    library for lossless, block-sorting data compression.
  
-  Copyright (C) 1996-2000 Julian R Seward.  All rights reserved.
+  Copyright (C) 1996-2002 Julian R Seward.  All rights reserved.
  
    Redistribution and use in source and binary forms, with or without
    modification, are permitted provided that the following conditions
diff --git a/decompress.c b/decompress.c
index cdced188906179731a341da83ede4d974b4789b8..e9213473acf452ed40f4123db0987824ca1afe64 100644
--- a/decompress.c
+++ b/decompress.c
@@ -8,7 +8,7 @@
    This file is a part of bzip2 and/or libbzip2, a program and
    library for lossless, block-sorting data compression.
  
-  Copyright (C) 1996-2000 Julian R Seward.  All rights reserved.
+  Copyright (C) 1996-2002 Julian R Seward.  All rights reserved.
  
    Redistribution and use in source and binary forms, with or without
    modification, are permitted provided that the following conditions
@@ -235,18 +235,18 @@ Int32 BZ2_decompress ( DState* s )
     switch (s->state) {
  
        GET_UCHAR(BZ_X_MAGIC_1, uc);
-      if (uc != 'B') RETURN(BZ_DATA_ERROR_MAGIC);
+      if (uc != BZ_HDR_B) RETURN(BZ_DATA_ERROR_MAGIC);
  
        GET_UCHAR(BZ_X_MAGIC_2, uc);
-      if (uc != 'Z') RETURN(BZ_DATA_ERROR_MAGIC);
+      if (uc != BZ_HDR_Z) RETURN(BZ_DATA_ERROR_MAGIC);
  
        GET_UCHAR(BZ_X_MAGIC_3, uc)
-      if (uc != 'h') RETURN(BZ_DATA_ERROR_MAGIC);
+      if (uc != BZ_HDR_h) RETURN(BZ_DATA_ERROR_MAGIC);
  
        GET_BITS(BZ_X_MAGIC_4, s->blockSize100k, 8)
-      if (s->blockSize100k < '1' || 
-          s->blockSize100k > '9') RETURN(BZ_DATA_ERROR_MAGIC);
-      s->blockSize100k -= '0';
+      if (s->blockSize100k < (BZ_HDR_0 + 1) || 
+          s->blockSize100k > (BZ_HDR_0 + 9)) RETURN(BZ_DATA_ERROR_MAGIC);
+      s->blockSize100k -= BZ_HDR_0;
  
        if (s->smallDecompress) {
           s->ll16 = BZALLOC( s->blockSize100k * 100000 * sizeof(UInt16) );
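
Reviewer note on the BZ_HDR_* substitution in the hunk above: the stream
header check now compares against fixed byte values instead of character
literals, presumably so the magic bytes stay ASCII whatever the compiler's
execution character set is. The check can be sketched in isolation as
follows. This is only an illustration under stated assumptions: the helper
parse_stream_header and the HDR_* names are hypothetical, and the real
decoder reads the bytes incrementally through GET_UCHAR/GET_BITS rather
than from a flat buffer.

```c
#include <stddef.h>

/* Local copies of the header byte values introduced by the patch
   (named HDR_* here to make clear they are not the library's macros). */
#define HDR_B 0x42   /* 'B' in ASCII */
#define HDR_Z 0x5a   /* 'Z' in ASCII */
#define HDR_h 0x68   /* 'h' in ASCII */
#define HDR_0 0x30   /* '0' in ASCII */

/* Hypothetical helper: validates the 4-byte stream header "BZh1".."BZh9".
   Returns the block size in units of 100k (1..9), or -1 if the buffer is
   too short or the magic bytes do not match. */
static int parse_stream_header(const unsigned char *buf, size_t len)
{
   if (len < 4) return -1;
   if (buf[0] != HDR_B || buf[1] != HDR_Z || buf[2] != HDR_h) return -1;
   if (buf[3] < HDR_0 + 1 || buf[3] > HDR_0 + 9) return -1;
   return buf[3] - HDR_0;
}
```

A caller holding the first four bytes of a file could use the return value
the way BZ2_decompress uses s->blockSize100k after subtracting BZ_HDR_0.
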
diff --git a/dlltest.c b/dlltest.c
index f79279cef846649a97425a0e0b3eb890da8929aa..2d7dcca4cb595f3be13f21bbe5a65c1d255cb1dd 100644
--- a/dlltest.c
+++ b/dlltest.c
@@ -19,7 +19,7 @@
  \r
  #ifdef _WIN32\r
  \r
-#define BZ2_LIBNAME "libbz2-1.0.0.DLL" \r
+#define BZ2_LIBNAME "libbz2-1.0.2.DLL" \r
  \r
  #include <windows.h>\r
  static int BZ2DLLLoaded = 0;\r
@@ -130,8 +130,8 @@ int main(int argc,char *argv[])
           }else{\r
              fp_w = stdout;\r
           }\r
-         if((BZ2fp_r == NULL && (BZ2fp_r = BZ2_bzdopen(fileno(stdin),"rb"))==NULL)\r
-            || (BZ2fp_r != NULL && (BZ2fp_r = BZ2_bzopen(fn_r,"rb"))==NULL)){\r
+         if((fn_r == NULL && (BZ2fp_r = BZ2_bzdopen(fileno(stdin),"rb"))==NULL)\r
+            || (fn_r != NULL && (BZ2fp_r = BZ2_bzopen(fn_r,"rb"))==NULL)){\r
              printf("can't bz2openstream\n");\r
              exit(1);\r
           }\r
diff --git a/huffman.c b/huffman.c
index 9b446c4b36d15356ad86fbe24951e048caf965ef..293095c170c0d1474760088a528da55e591ca335 100644
--- a/huffman.c
+++ b/huffman.c
@@ -8,7 +8,7 @@
    This file is a part of bzip2 and/or libbzip2, a program and
    library for lossless, block-sorting data compression.
  
-  Copyright (C) 1996-2000 Julian R Seward.  All rights reserved.
+  Copyright (C) 1996-2002 Julian R Seward.  All rights reserved.
  
    Redistribution and use in source and binary forms, with or without
    modification, are permitted provided that the following conditions
diff --git a/makefile.msc b/makefile.msc
index 3fe42324aca188b1482c9af84cdb41574c1121d7..799a18a5f1a7b084274ce1cd63997aeb4f9c2688 100644
--- a/makefile.msc
+++ b/makefile.msc
@@ -4,7 +4,7 @@
  # Fixed up by JRS for bzip2-0.9.5d release.\r
  \r
  CC=cl\r
-CFLAGS= -DWIN32 -MD -Ox -D_FILE_OFFSET_BITS=64\r
+CFLAGS= -DWIN32 -MD -Ox -D_FILE_OFFSET_BITS=64 -nologo\r
  \r
  OBJS= blocksort.obj  \\r
        huffman.obj    \\r
diff --git a/manual.texi b/manual.texi
index 336776ab805a083f53ebd9c11a014a20823da701..5bc27d5f9f90abcef0788b9a47eac0511abb9a1c 100644
--- a/manual.texi
+++ b/manual.texi
@@ -2,10 +2,10 @@
  @setfilename bzip2.info
  
  @ignore
-This file documents bzip2 version 1.0, and associated library
+This file documents bzip2 version 1.0.2, and associated library
  libbzip2, written by Julian Seward (jseward@acm.org).
  
-Copyright (C) 1996-2000 Julian R Seward
+Copyright (C) 1996-2002 Julian R Seward
  
  Permission is granted to make and distribute verbatim copies of
  this manual provided the copyright notice and this permission notice
@@ -30,8 +30,8 @@ END-INFO-DIR-ENTRY
  @titlepage
  @title bzip2 and libbzip2
  @subtitle a program and library for data compression
-@subtitle copyright (C) 1996-2000 Julian Seward
-@subtitle version 1.0 of 21 March 2000
+@subtitle copyright (C) 1996-2002 Julian Seward
+@subtitle version 1.0.2 of 30 December 2001
  @author Julian Seward
  
  @end titlepage
@@ -40,11 +40,17 @@ END-INFO-DIR-ENTRY
  @parskip 2mm
  
  @end iftex
-@node Top, Overview, (dir), (dir)
+@node Top,,, (dir)
+
+The following text is the License for this software.  You should
+find it identical to that contained in the file LICENSE in the 
+source distribution.
+
+@bf{------------------ START OF THE LICENSE ------------------}
  
  This program, @code{bzip2}, 
  and associated library @code{libbzip2}, are
-Copyright (C) 1996-2000 Julian R Seward.  All rights reserved.
+Copyright (C) 1996-2002 Julian R Seward.  All rights reserved.
  
  Redistribution and use in source and binary forms, with or without
  modification, are permitted provided that the following conditions
@@ -82,13 +88,15 @@ Julian Seward, Cambridge, UK.
  
  @code{jseward@@acm.org}
  
-@code{http://sourceware.cygnus.com/bzip2}
+@code{bzip2}/@code{libbzip2} version 1.0.2 of 30 December 2001.
  
-@code{http://www.cacheprof.org}
+@bf{------------------ END OF THE LICENSE ------------------}
  
-@code{http://www.muraroa.demon.co.uk}
+Web sites:
  
-@code{bzip2}/@code{libbzip2} version 1.0 of 21 March 2000.
+@code{http://sources.redhat.com/bzip2}
+
+@code{http://www.cacheprof.org}
  
  PATENTS: To the best of my knowledge, @code{bzip2} does not use any patented
  algorithms.  However, I do not have the resources available to carry out
@@ -101,7 +109,6 @@ above statement.
  
  
  
-@node Overview, Implementation, Top, Top
  @chapter Introduction
  
  @code{bzip2}  compresses  files  using the Burrows-Wheeler 
@@ -134,7 +141,7 @@ and nothing else.
  @unnumberedsubsubsec NAME
  @itemize
  @item @code{bzip2}, @code{bunzip2}
-- a block-sorting file compressor, v1.0
+- a block-sorting file compressor, v1.0.2
  @item @code{bzcat} 
  - decompresses files to stdout
  @item @code{bzip2recover}
@@ -264,6 +271,11 @@ This really performs a trial decompression and throws away the result.
  Force overwrite of output files.  Normally, @code{bzip2} will not overwrite
  existing output files.  Also forces @code{bzip2} to break hard links
  to files, which it otherwise wouldn't do.
+
+@code{bzip2} normally declines to decompress files which don't have the
+correct magic header bytes.  If forced (@code{-f}), however, it will
+pass such files through unmodified.  This is how GNU @code{gzip}
+behaves.
  @item -k --keep
  Keep (don't delete) input files during compression
  or decompression.
@@ -286,9 +298,13 @@ Further @code{-v}'s increase the verbosity level, spewing out lots of
  information which is primarily of interest for diagnostic purposes.
  @item -L --license -V --version
  Display the software version, license terms and conditions.
-@item -1 to -9
+@item -1 (or --fast) to -9 (or --best)
  Set the block size to 100 k, 200 k ..  900 k when compressing.  Has no
  effect when decompressing.  See MEMORY MANAGEMENT below.
+The @code{--fast} and @code{--best} aliases are primarily for GNU
+@code{gzip} compatibility.  In particular, @code{--fast} doesn't make
+things significantly faster.  And @code{--best} merely selects the
+default behaviour.
  @item --
  Treats all subsequent arguments as file names, even if they start
  with a dash.  This is so you can handle files with names beginning
@@ -389,21 +405,19 @@ integrity of the resulting files, and decompress those which are
  undamaged.
  
  @code{bzip2recover} 
-takes a single argument, the name of the damaged file, 
-and writes a number of files @code{rec0001file.bz2},
-       @code{rec0002file.bz2}, etc, containing the  extracted  blocks.
-       The  output  filenames  are  designed  so  that the use of
-       wildcards in subsequent processing -- for example,  
-@code{bzip2 -dc  rec*file.bz2 > recovered_data} -- lists the files in
-       the correct order.
+takes a single argument, the name of the damaged file, and writes a
+number of files @code{rec00001file.bz2}, @code{rec00002file.bz2}, etc,
+containing the extracted blocks.  The output filenames are designed so
+that the use of wildcards in subsequent processing -- for example,
+@code{bzip2 -dc rec*file.bz2 > recovered_data} -- processes the files in
+the correct order.
  
  @code{bzip2recover} should be of most use dealing with large @code{.bz2}
-       files,  as  these will contain many blocks.  It is clearly
-       futile to use it on damaged single-block  files,  since  a
-       damaged  block  cannot  be recovered.  If you wish to minimise 
-any potential data loss through media  or  transmission errors, 
-you might consider compressing with a smaller
-       block size.
+files, as these will contain many blocks.  It is clearly futile to use
+it on damaged single-block files, since a damaged block cannot be
+recovered.  If you wish to minimise any potential data loss through
+media or transmission errors, you might consider compressing with a
+smaller block size.
  
  
  @unnumberedsubsubsec PERFORMANCE NOTES
@@ -435,22 +449,31 @@ I/O error messages are not as helpful as they could be.  @code{bzip2}
  tries hard to detect I/O errors and exit cleanly, but the details of
  what the problem is sometimes seem rather misleading.
  
-This manual page pertains to version 1.0 of @code{bzip2}.  Compressed
+This manual page pertains to version 1.0.2 of @code{bzip2}.  Compressed
  data created by this version is entirely forwards and backwards
-compatible with the previous public releases, versions 0.1pl2, 0.9.0 and
-0.9.5, but with the following exception: 0.9.0 and above can correctly
-decompress multiple concatenated compressed files.  0.1pl2 cannot do
-this; it will stop after decompressing just the first file in the
-stream.
+compatible with the previous public releases, versions 0.1pl2, 0.9.0,
+0.9.5, 1.0.0 and 1.0.1, but with the following exception: 0.9.0 and
+above can correctly decompress multiple concatenated compressed files.
+0.1pl2 cannot do this; it will stop after decompressing just the first
+file in the stream.
+
+@code{bzip2recover} versions prior to this one, 1.0.2, used 32-bit
+integers to represent bit positions in compressed files, so it could not
+handle compressed files more than 512 megabytes long.  Version 1.0.2 and
+above uses 64-bit ints on some platforms which support them (GNU
+supported targets, and Windows).  To establish whether or not
+@code{bzip2recover} was built with such a limitation, run it without
+arguments.  In any event you can build yourself an unlimited version if
+you can recompile it with @code{MaybeUInt64} set to be an unsigned
+64-bit integer.
  
-@code{bzip2recover} uses 32-bit integers to represent bit positions in
-compressed files, so it cannot handle compressed files more than 512
-megabytes long.  This could easily be fixed.
  
  
  @unnumberedsubsubsec AUTHOR
  Julian Seward, @code{jseward@@acm.org}.
  
+@code{http://sources.redhat.com/bzip2}
+
  The ideas embodied in @code{bzip2} are due to (at least) the following
  people: Michael Burrows and David Wheeler (for the block sorting
  transformation), David Wheeler (again, for the Huffman coder), Peter
@@ -461,8 +484,9 @@ indebted for their help, support and advice.  See the manual in the
  source distribution for pointers to sources of documentation.  Christian
  von Roques encouraged me to look for faster sorting algorithms, so as to
  speed up compression.  Bela Lubkin encouraged me to improve the
-worst-case compression performance.  Many people sent patches, helped
-with portability problems, lent machines, gave advice and were generally
+worst-case compression performance.  The @code{bz*} scripts are derived
+from those of GNU @code{gzip}.  Many people sent patches, helped with
+portability problems, lent machines, gave advice and were generally
  helpful.
  
  @end quotation
@@ -1769,16 +1793,20 @@ was compiled with @code{BZ_NO_STDIO} set.
  For a normal compile, an assertion failure yields the message
  @example
     bzip2/libbzip2: internal error number N.
-   This is a bug in bzip2/libbzip2, 1.0 of 21-Mar-2000.
+   This is a bug in bzip2/libbzip2, 1.0.2, 30-Dec-2001.
     Please report it to me at: jseward@@acm.org.  If this happened
     when you were using some program which uses libbzip2 as a
     component, you should also report this bug to the author(s)
     of that program.  Please make an effort to report this bug;
     timely and accurate bug reports eventually lead to higher
-   quality software.  Thanks.  Julian Seward, 21 March 2000.
+   quality software.  Thanks.  Julian Seward, 30 December 2001.
  @end example
-where @code{N} is some error code number.  @code{exit(3)}
-is then called.
+where @code{N} is some error code number.  If @code{N == 1007}, it also
+prints some extra text advising the reader that unreliable memory is
+often associated with internal error 1007.  (This is a
+frequently-observed-phenomenon with versions 1.0.0/1.0.1).
+
+@code{exit(3)} is then called.
  
  For a @code{stdio}-free library, assertion failures result
  in a call to a function declared as:
@@ -2056,10 +2084,10 @@ Maybe this isn't what you want.
  If you want a compressor and/or library which is faster, uses less
  memory but gets pretty good compression, and has minimal latency,
  consider Jean-loup
-Gailly's and Mark Adler's work, @code{zlib-1.1.2} and
+Gailly's and Mark Adler's work, @code{zlib-1.1.3} and
  @code{gzip-1.2.4}.  Look for them at
  
-@code{http://www.cdrom.com/pub/infozip/zlib} and
+@code{http://www.zlib.org} and
  @code{http://www.gzip.org} respectively.
  
  For something faster and lighter still, you might try Markus F X J
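
The 512-megabyte figure in the bzip2recover paragraph of the manual.texi diff above follows from simple arithmetic: a 32-bit unsigned counter can address 2^32 distinct bit positions, and 2^32 bits / 8 = 2^29 bytes = 512 megabytes. The following is a minimal sketch of that calculation, not code from bzip2recover.c; the helper name max_bytes_for_bit_counter is hypothetical and chosen only for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not part of bzip2recover.c): the number of
   bytes whose individual bit positions can all be indexed by an
   unsigned counter that is width_bits wide.  A counter of that width
   distinguishes 2^width_bits bit positions, and there are 8 = 2^3
   bits per byte, so the answer is 2^(width_bits - 3). */
static uint64_t max_bytes_for_bit_counter(unsigned width_bits)
{
    /* Keep the shift amount in range for a 64-bit result. */
    assert(width_bits >= 3 && width_bits <= 64);
    return UINT64_C(1) << (width_bits - 3);
}
```

Calling max_bytes_for_bit_counter(32) gives 536870912, which is 512 * 1024 * 1024 bytes. With a 64-bit counter the same formula gives 2^61 bytes, which is why the manual describes a build with MaybeUInt64 set to an unsigned 64-bit integer as unlimited for practical purposes.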
diff --git a/mk251.c b/mk251.c

new file mode 100644 (file)

index 0000000..205778a
--- /dev/null
+++ b/mk251.c
@@ -0,0 +1,16 @@
+
+/* Spew out a long sequence of the byte 251.  When fed to bzip2
+   versions 1.0.0 or 1.0.1, causes it to die with internal error
+   1007 in blocksort.c.  This assertion misses an extremely rare
+   case, which is fixed in this version (1.0.2) and above.
+*/
+
+#include <stdio.h>
+
+int main ()
+{
+   int i;
+   for (i = 0; i < 48500000 ; i++)
+     putchar(251);
+   return 0;
+}
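
The generator added in mk251.c above emits its input one byte at a time with putchar. The same idea can be sketched as a reusable routine that writes in fixed-size chunks and reports write failures. This is an illustrative variant, not part of the commit; the helper name emit_repeated_byte and the 4096-byte chunk size are assumptions made here.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, not part of the bzip2 sources: write `count`
   copies of the byte `value` to `out`, one chunk at a time.
   Returns 0 on success, or -1 if any write comes up short. */
static int emit_repeated_byte(FILE *out, unsigned char value, size_t count)
{
    unsigned char buf[4096];   /* chunk size is an arbitrary choice */

    assert(out != NULL);
    memset(buf, value, sizeof buf);

    while (count > 0) {
        /* Write a full chunk, or whatever remains if that is less. */
        size_t n = count < sizeof buf ? count : sizeof buf;
        if (fwrite(buf, 1, n, out) != n)
            return -1;
        count -= n;
    }
    return 0;
}
```

Under these assumptions, emit_repeated_byte(stdout, 251, 48500000) produces the same byte stream as mk251.c: 48,500,000 copies of the byte 251.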
diff --git a/randtable.c b/randtable.c

index 983089d4684dcc1d3d44bf422975bbc02bd80ccf..5c922e94f3c55ef29625addd82657c54b4183847 100644 (file)
--- a/randtable.c
+++ b/randtable.c
@@ -8,7 +8,7 @@
    This file is a part of bzip2 and/or libbzip2, a program and
    library for lossless, block-sorting data compression.
  
-  Copyright (C) 1996-2000 Julian R Seward.  All rights reserved.
+  Copyright (C) 1996-2002 Julian R Seward.  All rights reserved.
  
    Redistribution and use in source and binary forms, with or without
    modification, are permitted provided that the following conditions
diff --git a/words3 b/words3

index 8486a84c8b39e690a79cc390ae8af81435ce721e..7a6b462443cf32b01e2b51964ef33b7aff77ecad 100644 (file)
--- a/words3
+++ b/words3
@@ -15,8 +15,8 @@ not actually execute them.
  
  Instructions for use are in the preformatted manual page, in the file
  bzip2.txt.  For more detailed documentation, read the full manual.  
-It is available in Postscript form (manual.ps) and HTML form
-(manual_toc.html).
+It is available in Postscript form (manual.ps), PDF form (manual.pdf),
+and HTML form (manual_toc.html).
  
  You can also do "bzip2 --help" to see some helpful information. 
  "bzip2 -L" displays the software license.
Files changed in this commit:

CHANGES
LICENSE
Makefile
Makefile-libbz2_so
README
README.COMPILATION.PROBLEMS
blocksort.c
bzdiff [new file with mode: 0644]
bzdiff.1 [new file with mode: 0644]
bzgrep [new file with mode: 0644]
bzgrep.1 [new file with mode: 0644]
bzip2.1
bzip2.1.preformatted
bzip2.c
bzip2.txt
bzip2recover.c
bzlib.c
bzlib.h
bzlib_private.h
bzmore [new file with mode: 0644]
bzmore.1 [new file with mode: 0644]
compress.c
crctable.c
decompress.c
dlltest.c
huffman.c
makefile.msc
manual.texi
mk251.c [new file with mode: 0644]
randtable.c
words3