Bug 26200

Summary: ld terminated with signal 11 with LTO and -Wl,--thread-count=2
Product: binutils Reporter: Martin Liska <martin.liska>
Component: goldAssignee: Cary Coutant <ccoutant>
Status: RESOLVED FIXED    
Severity: normal CC: alexandru.dascalu94, hjl.tools, ian, nick.gasson
Priority: P2    
Version: 2.34   
Target Milestone: 2.36   
Host: Target:
Build: Last reconfirmed: 2020-07-03 00:00:00
Bug Depends on: 26827    
Bug Blocks:    
Attachments: test-case

Description Martin Liska 2020-07-03 11:53:09 UTC
Created attachment 12683 [details]
test-case

$ g++ tc.ii -flto -c && g++ -shared -flto=auto -fPIC -fuse-ld=gold -Wl,--threads -Wl,--thread-count=2 -Wl,--whole-archive Overlay.o -Wl,--no-whole-archive
collect2: fatal error: ld terminated with signal 11 [Segmentation fault], core dumped
compilation terminated.

valgrind says:

==15730== For lists of detected and suppressed errors, rerun with: -s
==15730== ERROR SUMMARY: 11 errors from 11 contexts (suppressed: 0 from 0)
==15728== Invalid read of size 1
==15728==    at 0x435110: gold::Pluginobj::get_symbol_resolution_info(gold::Symbol_table*, int, ld_plugin_symbol*, int) const (plugin.cc:1295)
==15728==    by 0x485B4DC: UnknownInlinedFun (lto-plugin.c:549)
==15728==    by 0x485B4DC: all_symbols_read_handler (lto-plugin.c:731)
==15728==    by 0x4334EE: all_symbols_read (plugin.cc:403)
==15728==    by 0x4334EE: gold::Plugin_manager::all_symbols_read(gold::Workqueue*, gold::Task*, gold::Input_objects*, gold::Symbol_table*, gold::Dirsearch*, gold::Mapfile*, gold::Task_token**) (plugin.cc:856)
==15728==    by 0x43362B: gold::Plugin_hook::run(gold::Workqueue*) (plugin.cc:1770)
==15728==    by 0x48E4E7: gold::Workqueue::find_and_run_task(int) (workqueue.cc:319)
==15728==    by 0x48E599: gold::Workqueue::process(int) (workqueue.cc:495)
==15728==    by 0x17957B: main (main.cc:252)
==15728==  Address 0x10 is not stack'd, malloc'd or (recently) free'd
==15728== 
==15728== 
==15728== Process terminating with default action of signal 11 (SIGSEGV)
==15728==  Access not within mapped region at address 0x10
==15728==    at 0x435110: gold::Pluginobj::get_symbol_resolution_info(gold::Symbol_table*, int, ld_plugin_symbol*, int) const (plugin.cc:1295)
==15728==    by 0x485B4DC: UnknownInlinedFun (lto-plugin.c:549)
==15728==    by 0x485B4DC: all_symbols_read_handler (lto-plugin.c:731)
==15728==    by 0x4334EE: all_symbols_read (plugin.cc:403)
==15728==    by 0x4334EE: gold::Plugin_manager::all_symbols_read(gold::Workqueue*, gold::Task*, gold::Input_objects*, gold::Symbol_table*, gold::Dirsearch*, gold::Mapfile*, gold::Task_token**) (plugin.cc:856)
==15728==    by 0x43362B: gold::Plugin_hook::run(gold::Workqueue*) (plugin.cc:1770)
==15728==    by 0x48E4E7: gold::Workqueue::find_and_run_task(int) (workqueue.cc:319)
==15728==    by 0x48E599: gold::Workqueue::process(int) (workqueue.cc:495)
==15728==    by 0x17957B: main (main.cc:252)
==15728==  If you believe this happened as a result of a stack
==15728==  overflow in your program's main thread (unlikely but
==15728==  possible), you can try to increase the size of the
==15728==  main thread stack using the --main-stacksize= flag.
==15728==  The main thread stack size used in this run was 8388608.
Comment 1 H.J. Lu 2020-07-03 13:09:43 UTC
all_symbols_read_handler is called more than once by gold.
Comment 2 H.J. Lu 2020-07-03 13:20:32 UTC
(gdb) bt
#0  write_resolution ()
    at /export/gnu/import/git/sources/gcc-release/lto-plugin/lto-plugin.c:530
#1  0x00007ffff7fb3175 in all_symbols_read_handler ()
    at /export/gnu/import/git/sources/gcc-release/lto-plugin/lto-plugin.c:731
#2  0x00000000008502b9 in gold::Plugin::all_symbols_read (this=0xc041d0)
    at /export/gnu/import/git/gitlab/x86-binutils/gold/plugin.cc:403
#3  0x000000000084cbb0 in gold::Plugin_manager::all_symbols_read (
    this=0xc08220, workqueue=0x7fffffff35e0, task=0xc58f50, 
    input_objects=0x7fffffff3570, symtab=0x7fffffff30e0, 
    dirpath=0x7fffffff2be0, mapfile=0x0, last_blocker=0xc58fb0)
    at /export/gnu/import/git/gitlab/x86-binutils/gold/plugin.cc:856
#4  0x000000000084e377 in gold::Plugin_hook::run (this=0xc58f50, 
    workqueue=0x7fffffff35e0)
    at /export/gnu/import/git/gitlab/x86-binutils/gold/plugin.cc:1770
#5  0x00000000008ed547 in gold::Workqueue::find_and_run_task (
    this=0x7fffffff35e0, thread_number=9)
    at /export/gnu/import/git/gitlab/x86-binutils/gold/workqueue.cc:319
#6  0x00000000008edb74 in gold::Workqueue::process (this=0x7fffffff35e0, 
    thread_number=9)
    at /export/gnu/import/git/gitlab/x86-binutils/gold/workqueue.cc:495
#7  0x00000000008ee337 in gold::Workqueue_threader_threadpool::process (
    thread_number=<optimized out>, this=<optimized out>)
    at /export/gnu/import/git/gitlab/x86-binutils/gold/workqueue-internal.h:92
#8  gold::Workqueue_thread::thread_body (arg=0xc585b0)
    at /export/gnu/import/git/gitlab/x86-binutils/gold/workqueue-threads.cc:117
#9  0x00007ffff7c1a422 in start_thread () from /lib64/libpthread.so.0
#10 0x00007ffff7b46943 in clone () from /lib64/libc.so.6
(gdb) 

There are

 for (this->current_ = this->plugins_.begin();
       this->current_ != this->plugins_.end();
       ++this->current_)
    (*this->current_)->all_symbols_read();

all_symbols_read_handler has

  write_resolution ();

  free_1 (claimed_files, num_claimed_files);

/* Free all memory that is no longer needed after writing the symbol
   resolution. */

static void
free_1 (struct plugin_file_info *files, unsigned num_files)
{

After all_symbols_read_handler is called the first time, all memory are
freed.  When all_symbols_read_handler is called the second time, ....
Comment 3 H.J. Lu 2020-07-03 14:04:31 UTC
class Plugin_manager has

  // A pointer to the current plugin.  Used while loading plugins.
  Plugin_list::iterator current_;

The same iterator is shared by all threads. It is OK to loading plugins.
since only one thread loads plugins.  But I don't quite understand how
it can work in other cases in more than one thread.
Comment 4 H.J. Lu 2020-07-03 14:16:52 UTC
This helps.  But it doesn't fix the problem.

diff --git a/gold/plugin.cc b/gold/plugin.cc
index 8963c7fa4c..f4ef613d43 100644
--- a/gold/plugin.cc
+++ b/gold/plugin.cc
@@ -755,15 +755,15 @@ Plugin_manager::claim_file(Input_file* input_file, off_t offset,
     this->objects_.push_back(elf_object);
   this->in_claim_file_handler_ = true;
 
-  for (this->current_ = this->plugins_.begin();
-       this->current_ != this->plugins_.end();
-       ++this->current_)
+  for (Plugin_list::iterator p = this->plugins_.begin();
+       p != this->plugins_.end();
+       ++p)
     {
       // If we aren't yet in replacement phase, allow plugins to claim input
       // files, otherwise notify the plugin of the new input file, if needed.
       if (!this->in_replacement_phase_)
         {
-          if ((*this->current_)->claim_file(&this->plugin_input_file_))
+          if ((*p)->claim_file(&this->plugin_input_file_))
             {
               this->any_claimed_ = true;
               this->in_claim_file_handler_ = false;
@@ -775,7 +775,7 @@ Plugin_manager::claim_file(Input_file* input_file, off_t offset,
 						: elf_object->name());
 		  this->recorder_->claimed_file(objname,
 						offset, filesize,
-						(*this->current_)->filename());
+						(*p)->filename());
 		}
 
               if (this->objects_.size() > handle
@@ -790,7 +790,7 @@ Plugin_manager::claim_file(Input_file* input_file, off_t offset,
         }
       else
         {
-          (*this->current_)->new_input(&this->plugin_input_file_);
+          (*p)->new_input(&this->plugin_input_file_);
         }
     }
 
@@ -850,10 +850,10 @@ Plugin_manager::all_symbols_read(Workqueue* workqueue, Task* task,
   layout->script_options()->set_defsym_uses_in_real_elf(symtab);
   layout->script_options()->find_defsym_defs(this->defsym_defines_set_);
 
-  for (this->current_ = this->plugins_.begin();
-       this->current_ != this->plugins_.end();
-       ++this->current_)
-    (*this->current_)->all_symbols_read();
+  for (Plugin_list::iterator p = this->plugins_.begin();
+       p != this->plugins_.end();
+       ++p)
+    (*p)->all_symbols_read();
 
   if (this->any_added_)
     {
@@ -1028,10 +1028,10 @@ Plugin_manager::cleanup()
       close_all_descriptors();
     }
 
-  for (this->current_ = this->plugins_.begin();
-       this->current_ != this->plugins_.end();
-       ++this->current_)
-    (*this->current_)->cleanup();
+  for (Plugin_list::iterator p = this->plugins_.begin();
+       p != this->plugins_.end();
+       ++p)
+    (*p)->cleanup();
 }
 
 // Make a new Pluginobj object.  This is called when the plugin calls
Comment 5 Martin Liska 2020-07-09 08:11:44 UTC
Any idea how to fix it H.J.
What about some locking mechanism?
Comment 6 H.J. Lu 2020-07-09 11:20:22 UTC
(In reply to Martin Liska from comment #5)
> Any idea how to fix it H.J.
> What about some locking mechanism?

I am not familiar with gold.
Comment 7 Nick Gasson 2020-11-05 09:52:46 UTC
(In reply to H.J. Lu from comment #4)
> This helps.  But it doesn't fix the problem.
> 

When you say it doesn't fix the problem, perhaps you were running into the --threads segfault in bug 26827? I tried latest master with your patch above and the attached test case doesn't crash any more.
Comment 8 Martin Liska 2020-11-05 10:11:59 UTC
(In reply to Nick Gasson from comment #7)
> (In reply to H.J. Lu from comment #4)
> > This helps.  But it doesn't fix the problem.
> > 
> 
> When you say it doesn't fix the problem, perhaps you were running into the
> --threads segfault in bug 26827? I tried latest master with your patch above
> and the attached test case doesn't crash any more.

It's a different problem related to a LTO plug-in. Please see H.J. brief analysis.
Comment 9 Nick Gasson 2020-11-05 12:38:28 UTC
(In reply to Martin Liska from comment #8)
> > 
> > When you say it doesn't fix the problem, perhaps you were running into the
> > --threads segfault in bug 26827? I tried latest master with your patch above
> > and the attached test case doesn't crash any more.
> 
> It's a different problem related to a LTO plug-in. Please see H.J. brief
> analysis.

I mean you need to apply both patches - the one from comment 4 and the one from bug 26827 - in order not to get a segfault. I tried your test case after applying both and it no longer crashes.
Comment 10 H.J. Lu 2020-11-05 14:08:58 UTC
A patch is posted at

https://sourceware.org/pipermail/binutils/2020-November/114021.html
Comment 11 Martin Liska 2020-11-05 14:16:21 UTC
Great, thank you both for the fix.
Comment 12 Sourceware Commits 2020-11-08 12:14:21 UTC
The master branch has been updated by H.J. Lu <hjl@sourceware.org>:

https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;h=d4820dac5e7608e24fba6d08cde9248b4c4b2928

commit d4820dac5e7608e24fba6d08cde9248b4c4b2928
Author: H.J. Lu <hjl.tools@gmail.com>
Date:   Sun Nov 8 04:10:01 2020 -0800

    gold: Avoid sharing Plugin_list::iterator
    
    class Plugin_manager has
    
      // A pointer to the current plugin.  Used while loading plugins.
      Plugin_list::iterator current_;
    
    The same iterator is shared by all threads. It is OK to use it to load
    plugins since only one thread loads plugins.  Avoid sharing Plugin_list
    iterator in all other cases.
    
            PR gold/26200
            * plugin.cc (Plugin_manager::claim_file): Don't share Plugin_list
            iterator.
            (Plugin_manager::all_symbols_read): Likewise.
            (Plugin_manager::cleanup): Likewise.
Comment 13 H.J. Lu 2020-11-08 12:19:00 UTC
Fixed for 2.36.
Comment 14 H.J. Lu 2020-11-25 00:05:50 UTC
*** Bug 23607 has been marked as a duplicate of this bug. ***