Bug 13534 - ar mishandles files bigger than 2GB
Summary: ar mishandles files bigger than 2GB
Status: RESOLVED FIXED
Alias: None
Product: binutils
Classification: Unclassified
Component: binutils
Version: 2.24
Importance: P2 normal
Target Milestone: ---
Assignee: Not yet assigned to anyone
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2011-12-21 13:10 UTC by Francois Gouget
Modified: 2012-01-20 14:45 UTC

See Also:
Host:
Target:
Build:
Last reconfirmed:


Attachments
bfd: Fix writing the size of 2+GB elements in the archive. (1.32 KB, patch)
2011-12-21 23:07 UTC, Francois Gouget
bfd: Refuse to create an invalid archive when an archive element is too big. (1.53 KB, patch)
2011-12-21 23:35 UTC, Francois Gouget
bfd: Fix parsing the size of archive elements larger than 2GB. (500 bytes, text/x-c++)
2011-12-21 23:35 UTC, Francois Gouget
bfd: Always use bfd_size_type to manipulate the size of an archive element. (1.09 KB, patch)
2011-12-21 23:36 UTC, Francois Gouget
ar: Fix handling of archive elements larger than 2GB. (710 bytes, patch)
2011-12-21 23:36 UTC, Francois Gouget

Description Francois Gouget 2011-12-21 13:10:28 UTC
Here is how to reproduce the problem:

$ dd if=/dev/zero of=file2G bs=1M count=2049
2049+0 records in
2049+0 records out
2148532224 bytes (2.1 GB) copied, 47.9032 s, 44.9 MB/s
$ ar q ar2G.ar file2G
$ od -a ar2G.ar
0000000   !   <   a   r   c   h   >  nl   f   i   l   e   2   G   /  sp
0000020  sp  sp  sp  sp  sp  sp  sp  sp   1   3   2   4   4   6   6   2
0000040   8   0  sp  sp   1   0   0   0  sp  sp   1   0   0   0  sp  sp
0000060   1   0   0   6   4   4  sp  sp   -   2   1   4   6   4   3   5
0000100   0   7   `  nl nul nul nul nul nul nul nul nul nul nul nul nul
^C

Note that the archive claims that the 'file2G' size is negative: 
-214643507. This results in an invalid archive that cannot be extracted:

$ ar xv ar2G.ar
x - file2G
ar: ar2G.ar is not a valid archive

As a consequence of this it is impossible to generate Debian packages 
bigger than 2GB (for instance for applications that have a large 
dataset).

Obviously the file size was stored into a signed 32bit variable. Reading 
the source code shows that it was actually a long which means there will 
be further issues if such an archive is moved from a 32bit system to a 
64bit one.

More precisely, the archive file format is a linked list of element 
headers and relies on these having accurate size information to find the 
position of the next element header. Since the archive format allocates 
10 characters for the file size, it should be able to handle files up to 
10GB. However:

 * Files between 2GiB and 4GiB
   The file size is stored as being negative. The archive cannot be 
   extracted by either the 32bit ar or the 64bit one.

 * Files between 4GiB and 10GB
   Only the low 32 bits are taken into account, so ar will write a size 
   of 0.1GiB for a 4.1GiB file. As a result, during extraction ar will 
   think there is an archive element header in the middle of the file, 
   resulting in an error (if not worse).
   There are also sign issues between 6GiB and 8GiB.

 * Files bigger than 10GB
   ar will silently truncate the file size to its first 10 decimal 
   digits. Decoding will fail for the same reason as above.

Even 64bit systems are not immune to these issues due to the file sizes 
being stored in 'unsigned int' variables in various places.
Comment 1 Francois Gouget 2011-12-21 23:07:28 UTC
Created attachment 6123
bfd: Fix writing the size of 2+GB elements in the archive.
Comment 2 Francois Gouget 2011-12-21 23:35:17 UTC
Created attachment 6124
bfd: Refuse to create an invalid archive when an archive element is too big.
Comment 3 Francois Gouget 2011-12-21 23:35:44 UTC
Created attachment 6125
bfd: Fix parsing the size of archive elements larger than 2GB.
Comment 4 Francois Gouget 2011-12-21 23:36:16 UTC
Created attachment 6126
bfd: Always use bfd_size_type to manipulate the size of an archive element.
Comment 5 Francois Gouget 2011-12-21 23:36:40 UTC
Created attachment 6127
ar: Fix handling of archive elements larger than 2GB.
Comment 6 Francois Gouget 2011-12-21 23:54:57 UTC
I attached a set of 5 patches that fix this issue (at least for me). I hope they're ok. If not let me know.
Comment 7 Sourceware Commits 2012-01-20 14:43:02 UTC
CVSROOT:	/cvs/src
Module name:	src
Changes by:	nickc@sourceware.org	2012-01-20 14:42:57

Modified files:
	bfd            : ChangeLog archive.c archive64.c bfdio.c 
	                 libbfd-in.h libbfd.h 

Log message:
	PR binutils/13534
	* archive.c (_bfd_ar_sizepad): New function. Correctly install and
	pad the size field in an archive header.
	(_bfd_generic_read_ar_hdr_mag): Use the correct type and scan
	function for the archive size field.
	(bfd_generic_openr_next_archived_file): Likewise.
	(do_slurp_coff_armap): Likewise.
	(_bfd_write_archive_contents): Likewise.
	(_bfd_bsd44_write_ar_hdr): Use the new function.
	(bfd_ar_hdr_from_filesystem): Likewise.
	(_bfd_write_archive_contents): Likewise.
	(bsd_write_armap): Likewise.
	(coff_write_armap): Likewise.
	* archive64.c (bfd_elf64_archive_write_armap): Likewise.
	* bfdio.c (bfd_bread): Use correct type for archive element
	sizes.
	* ar.c (open_inarch): Likewise.
	(extract_file): Likewise.
	* libbfd-in.h (struct areltdata): Use correct types for
	parsed_size and extra_size fields.
	Prototype _bfd_ar_sizepad function.
	* libbfd.h: Regenerate.

Patches:
http://sourceware.org/cgi-bin/cvsweb.cgi/src/bfd/ChangeLog.diff?cvsroot=src&r1=1.5591&r2=1.5592
http://sourceware.org/cgi-bin/cvsweb.cgi/src/bfd/archive.c.diff?cvsroot=src&r1=1.80&r2=1.81
http://sourceware.org/cgi-bin/cvsweb.cgi/src/bfd/archive64.c.diff?cvsroot=src&r1=1.14&r2=1.15
http://sourceware.org/cgi-bin/cvsweb.cgi/src/bfd/bfdio.c.diff?cvsroot=src&r1=1.31&r2=1.32
http://sourceware.org/cgi-bin/cvsweb.cgi/src/bfd/libbfd-in.h.diff?cvsroot=src&r1=1.95&r2=1.96
http://sourceware.org/cgi-bin/cvsweb.cgi/src/bfd/libbfd.h.diff?cvsroot=src&r1=1.265&r2=1.266
Comment 8 Nick Clifton 2012-01-20 14:45:18 UTC
Hi Francois,

  Thanks for reporting this problem, and supplying a patch to fix it.

  I have checked in your patch with one minor change - I made _bfd_ar_sizepad a boolean function - and one slightly more important change - I created a changelog entry.

Cheers
  Nick