Bug 17819 - gold crashes when generating build-id for a large .so
Summary: gold crashes when generating build-id for a large .so
Status: RESOLVED FIXED
Alias: None
Product: binutils
Classification: Unclassified
Component: gold (show other bugs)
Version: 2.24
: P2 normal
Target Milestone: ---
Assignee: Cary Coutant
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2015-01-08 20:00 UTC by Guillaume Morin
Modified: 2015-06-02 16:48 UTC (History)
2 users (show)

See Also:
Host:
Target:
Build:
Last reconfirmed:


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Guillaume Morin 2015-01-08 20:00:21 UTC
Tested against Debian 2.24 binutils.  But it seems the bug is still tehre in 2.25 and 2.26

This happens when I generate a large so file with ld.gold and passing
--build-id.  This does not seem to happen with wheezy but I did not
investigate too much.

The sequence of event is pretty simple.  gold creates a md5 task for its
workqueue by binding a pointer to mmap'ed address of the output file.
Unfortunately, the file is resized and re-mmaped to a different
address later so when the hash_task is run, the src pointer pointing to
unmapped memory and generating a crash.

This is a full gdb session showing the crash:

The OutputFile object is opened with a size of 46793152:

Breakpoint 3, gold::Output_file::open (this=0xd31bed0,
file_size=46793152) at ../../gold/output.cc:5001
(gdb) print * this
$3 = {name_ = 0x7fffffffde6a "/usr/scratch/tmp/a.out", o_ = -1,
file_size_ = 0, base_ = 0x0, map_is_anonymous_ = false,
map_is_allocated_ = false, is_temporary_ = false}
(gdb) bt
#0  gold::Output_file::open (this=0xd31bed0, file_size=46793152) at
../../gold/output.cc:5001
#1  0x00000000004be203 in gold::Layout_task_runner::run (this=0xab83ec0,
workqueue=0x7fffffff6fe0, task=0xab83fc0) at ../../gold/layout.cc:395
#2  0x000000000048ada1 in gold::Task_function::run (this=0xab83fc0,
workqueue=0x7fffffff6fe0) at ../../gold/workqueue.h:178
#3  0x00000000005d6242 in gold::Workqueue::find_and_run_task
(this=0x7fffffff6fe0, thread_number=0) at ../../gold/workqueue.cc:319
#4  0x00000000005d688c in gold::Workqueue::process (this=0x7fffffff6fe0,
thread_number=0) at ../../gold/workqueue.cc:495
#5  0x0000000000404a00 in main (argc=136, argv=0x7fffffffd6b8) at
../../gold/main.cc:252

And mapped right after:

Breakpoint 4, gold::Output_file::map (this=0xd31bed0) at
../../gold/output.cc:5168
(gdb) bt
#0  gold::Output_file::map (this=0xd31bed0) at ../../gold/output.cc:5168
#1  0x000000000052205c in gold::Output_file::open (this=0xd31bed0,
file_size=46793152) at ../../gold/output.cc:5050
#2  0x00000000004be203 in gold::Layout_task_runner::run (this=0xab83ec0,
workqueue=0x7fffffff6fe0, task=0xab83fc0) at ../../gold/layout.cc:395
#3  0x000000000048ada1 in gold::Task_function::run (this=0xab83fc0,
workqueue=0x7fffffff6fe0) at ../../gold/workqueue.h:178
#4  0x00000000005d6242 in gold::Workqueue::find_and_run_task
(this=0x7fffffff6fe0, thread_number=0) at ../../gold/workqueue.cc:319
#5  0x00000000005d688c in gold::Workqueue::process (this=0x7fffffff6fe0,
thread_number=0) at ../../gold/workqueue.cc:495
#6  0x0000000000404a00 in main (argc=136, argv=0x7fffffffd6b8) at
../../gold/main.cc:252
(gdb) p file_size_ 
$4 = 46793152
(gdb) p base_
$5 = (unsigned char *) 0x0
(gdb) fin
Run till exit from #0  gold::Output_file::map (this=0xd31bed0) at
../../gold/output.cc:5168
0x000000000052205c in gold::Output_file::open (this=0xd31bed0,
file_size=46793152) at ../../gold/output.cc:5050
5050	in ../../gold/output.cc
(gdb) p base_
$6 = (unsigned char *) 0x7fff6a983000 ""
(gdb)

A bit later, the workqueue item for md5'ing this file is built:

Breakpoint 1, gold::Layout::queue_build_id_tasks (this=0x7fffffff7590,
workqueue=0x7fffffff6fe0, build_id_blocker=0xd286490, of=0xd31bed0) at
../../gold/layout.cc:5384
(gdb) bt
#0  gold::Layout::queue_build_id_tasks (this=0x7fffffff7590,
workqueue=0x7fffffff6fe0, build_id_blocker=0xd286490, of=0xd31bed0) at
../../gold/layout.cc:5384
#1  0x000000000048a7af in gold::queue_final_tasks (options=...,
input_objects=0x7fffffff7100, symtab=0x7fffffff7330,
layout=0x7fffffff7590, workqueue=0x7fffffff6fe0, of=0xd31bed0)
at ../../gold/gold.cc:880
#2  0x00000000004be2a6 in gold::Layout_task_runner::run
(this=0xab83ec0, workqueue=0x7fffffff6fe0, task=0xab83fc0) at
../../gold/layout.cc:416
#3  0x000000000048ada1 in gold::Task_function::run (this=0xab83fc0,
workqueue=0x7fffffff6fe0) at ../../gold/workqueue.h:178
#4  0x00000000005d6242 in gold::Workqueue::find_and_run_task
(this=0x7fffffff6fe0, thread_number=0) at
../../gold/workqueue.cc:319
#5  0x00000000005d688c in gold::Workqueue::process
(this=0x7fffffff6fe0, thread_number=0) at
../../gold/workqueue.cc:495
#6  0x0000000000404a00 in main (argc=136, argv=0x7fffffffd6b8) at
../../gold/main.cc:252
(gdb) print *of
$8 = {name_ = 0x7fffffffde6a "/usr/scratch/tmp/a.out", o_ = 660,
file_size_ = 46793152, base_ = 0x7fff6a983000 "", map_is_anonymous_ =
false, map_is_allocated_ = false, 
  is_temporary_ = false}
(gdb) print src
$9 = (const unsigned char *) 0x7fff6a983000 ""
(gdb) print src_offset 
$10 = 0

So far so good it seems, but the problem is the file is resized after that:

Breakpoint 5, gold::Output_file::resize (this=0xd31bed0,
file_size=241002406) at ../../gold/output.cc:5061
(gdb) bt
#0  gold::Output_file::resize (this=0xd31bed0, file_size=241002406) at
../../gold/output.cc:5061
#1  0x00000000004c95f8 in
gold::Layout::write_sections_after_input_sections (this=0x7fffffff7590,
of=0xd31bed0) at ../../gold/layout.cc:5336
#2  0x00000000004ca35b in gold::Write_after_input_sections_task::run
(this=0xd2864c0) at ../../gold/layout.cc:5629
#3  0x00000000005d6242 in gold::Workqueue::find_and_run_task
(this=0x7fffffff6fe0, thread_number=0) at ../../gold/workqueue.cc:319
#4  0x00000000005d688c in gold::Workqueue::process (this=0x7fffffff6fe0,
thread_number=0) at ../../gold/workqueue.cc:495
#5  0x0000000000404a00 in main (argc=136, argv=0x7fffffffd6b8) at
../../gold/main.cc:252
(gdb) print file_size_
$11 = 46793152
(gdb) fin
Run till exit from #0  gold::Output_file::resize (this=0xd31bed0,
file_size=241002406) at ../../gold/output.cc:5086
gold::Layout::write_sections_after_input_sections (this=0x7fffffff7590,
of=0xd31bed0) at ../../gold/layout.cc:5337
(gdb) print of->file_size_
$12 = 241002406
(gdb) print of->base_
$13 = (unsigned char *) 0x7fff23e91000 "\177ELF\002\001\001"

But at that point, the old address is still attached to the work queue,
triggering a crash when trying to run the md5 task:

Program received signal SIGSEGV, Segmentation fault.
0x000000000063d250 in md5_process_block (buffer=<optimized out>,
len=len@entry=2097152, ctx=ctx@entry=0x7fffffff6ca0) at
../../libiberty/md5.c:336
(gdb) x /i $rip
=> 0x63d250 <md5_process_block+128>:	mov    0x0(%rbp),%r11d
(gdb) p $rbp
$14 = (void *) 0x7fff6a983000
(gdb)

This can be worked around by using other --build-id options
Comment 1 Cary Coutant 2015-01-08 21:08:45 UTC
Thanks for the excellent analysis!

I'm guessing you're also using --compress-debug-sections. Turning that off would also be another workaround.
Comment 2 Guillaume Morin 2015-01-08 21:28:50 UTC
Cary, you are correct, I am using --compress-debug-sections here.
Comment 3 jim@meyering.net 2015-05-07 18:49:42 UTC
I have encountered this problem, too.
Is a patch coming soon, or should I pursue a workaround?

Thanks for tending binutils!
Comment 4 Cary Coutant 2015-05-08 16:48:56 UTC
> I have encountered this problem, too.
> Is a patch coming soon, or should I pursue a workaround?

I've been working on fixing this, but it's going to take some time.

Workarounds are to use a different --build-id option (anything but "tree"), or to turn off --compress-debug-sections. (You can always use objcopy to compress the debug sections after the link if you need that.)
Comment 5 jim@meyering.net 2015-05-08 16:57:46 UTC
Glad you're on it. Thanks for the info.
Comment 6 Sourceware Commits 2015-06-02 16:47:10 UTC
The master branch has been updated by Cary Coutant <ccoutant@sourceware.org>:

https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;h=9c7fe3c5c2c9e4c3571c253cf77341e3f6adf94c

commit 9c7fe3c5c2c9e4c3571c253cf77341e3f6adf94c
Author: Cary Coutant <ccoutant@gmail.com>
Date:   Tue Jun 2 09:45:24 2015 -0700

    PR 17819: Fix --build-id=tree when using --compress-debug-sections.
    
    When --build-id=tree is selected, gold would schedule a set of
    tasks to run to compute md5 hashes in parallel on chunks of the
    file. The scheduling was done before the
    Write_after_input_sections_task ran, so if we are compressing
    debug sections, the output file will change size and be remapped
    to a new address, sometimes causing the build id computation to
    crash, but even when it doesn't crash, it wouldn't include the
    debug information in the hash computation.
    
    This patch delays the scheduling of the md5 tasks until after
    Write_after_input_sections_task.
    
    gold/
            PR gold/17819
            * gold.cc (queue_final_tasks): When --build-id=tree, queue a
            separate task to schedule the build id computation.
            * layout.cc (Hash_task::Hash_task): Remove build_id_blocker,
            add Output_file and offset.
            (Hash_task::run): Get and release the input views.
            (Hash_task::is_runnable): Always return NULL (always runnable).
            (Layout::queue_build_id_tasks): Remove.
            (Layout::write_build_id): Add array_of_hashes and size_of_hashes
            parameters; use them instead of class members.
            (Build_id_task_runner::run): New function.
            (Close_task_runner::run): Pass array_of_hashes and size_of_hashes
            to write_build_id.
            * layout.h (Layout::queue_build_id_tasks): Remove.
            (Layout::write_build_id): Add array_of_hashes and size_of_hashes
            parameters.
            (Layout::array_of_hashes_): Remove.
            (Layout::size_of_array_of_hashes_): Remove.
            (Layout::input_view_): Remove.
            (Build_id_task_runner): New class.
            (Close_task_runner::Close_task_runner): Add array_of_hashes and
            size_of_hashes parameters.
            (Close_task_runner::array_of_hashes_): New data member.
            (Close_task_runner::size_of_hashes_): New data member.
            * testsuite/Makefile.am
            (flagstest_compress_debug_sections_and_build_id_tree): New test.
            * testsuite/Makefile.in: Regenerate.
Comment 7 Cary Coutant 2015-06-02 16:48:11 UTC
Fixed on trunk.