Bug 29942 - Read some DWARF purely in background
Summary: Read some DWARF purely in background
Status: RESOLVED FIXED
Alias: None
Product: gdb
Classification: Unclassified
Component: symtab (show other bugs)
Version: HEAD
: P2 normal
Target Milestone: 15.1
Assignee: Tom Tromey
URL:
Keywords:
Depends on:
Blocks: 29366
  Show dependency treegraph
 
Reported: 2022-12-25 21:21 UTC by Tom Tromey
Modified: 2024-01-09 01:48 UTC (History)
1 user (show)

See Also:
Host:
Target:
Build:
Last reconfirmed:


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Tom Tromey 2022-12-25 21:21:41 UTC
Currently, dwarf2_build_psymtabs_hard works synchronously: it sends
some work to worker threads, but it waits for all work to be
completed before proceeding.

However, it doesn't have to work this way.  Instead, it could start
reading immediately; and then after starting the background tasks,
it could store a future that would then be waited on by
cooked_index_functions::read_partial_symbols.

I think the main benefit to doing this would be that "attach"
might appear to be faster.  In particular work like downloading
from debuginfod could be done while other threads are scanning
the DWARF.
Comment 1 Tom Tromey 2022-12-25 21:22:00 UTC
FWIW I have a WIP patch for this.
Comment 2 Tom Tromey 2022-12-26 05:53:36 UTC
It occurs to me that maybe everything in dwarf2_build_psymtabs_hard
after the sections are read could be done in a worker thread.
(This kind of thing would maybe be easier with C++20 coroutines.)

> I think the main benefit to doing this would be that "attach"
> might appear to be faster.

The theory behind this is that with "attach", breakpoint re-sets
are deferred until all the shared libraries are read, providing
a little extra time for the background threads to work.

It might be interesting to see if minsym reading can also be
done in a worker.
Comment 3 Tom Tromey 2022-12-27 16:30:31 UTC
(In reply to Tom Tromey from comment #2)
> It occurs to me that maybe everything in dwarf2_build_psymtabs_hard
> after the sections are read could be done in a worker thread.

I haven't tested 'attach' yet, but this doesn't improve things
in the ordinary 'file' case, mainly I think due to the handling
of 'main'.
Comment 4 Tom Tromey 2023-08-13 15:37:13 UTC
I had a patch to add some thread-safety to BFD on this branch,
but I belatedly realized it can't really work, due to the way
the BFD fd cache works -- we'd have to introduce some way to
lock an fd while a function is using it.

Instead I'm going to try external locking and avoid modifying BFD at all.
Comment 5 Tom Tromey 2023-08-18 17:49:18 UTC
I also looked at wrapping the cache iovec in a locked variant.
However, this code in bfd_cache_close prevents that approach:

  /* Don't remove this test.  bfd_reinit depends on it.  */
  if (abfd->iovec != &cache_iovec)
    return true;

I think I'll just end up wrapping all "suspect" BFD calls in
a global lock.  This may not be too bad, especially if we switch
back to eager mapping of DWARF sections.
Comment 6 Tom Tromey 2023-08-23 20:02:17 UTC
(In reply to Tom Tromey from comment #5)
> I also looked at wrapping the cache iovec in a locked variant.
> However, this code in bfd_cache_close prevents that approach:
> 
>   /* Don't remove this test.  bfd_reinit depends on it.  */
>   if (abfd->iovec != &cache_iovec)
>     return true;
> 
> I think I'll just end up wrapping all "suspect" BFD calls in
> a global lock.  This may not be too bad, especially if we switch
> back to eager mapping of DWARF sections.

There are 135 separate bfd_* functions called by gdb.
So, auditing this is a pain and probably error-prone.
And then some subset have to be wrapped in a lock, including
any that set the BFD error.
Coming back around to maybe trying to make the BFD error be thread-local.
Comment 7 Tom Tromey 2023-10-04 12:47:09 UTC
I've been working on this.
Comment 9 Sourceware Commits 2024-01-09 01:46:54 UTC
The master branch has been updated by Tom Tromey <tromey@sourceware.org>:

https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;h=33c6eaaefcedd45e86d564d014f14cce2620f933

commit 33c6eaaefcedd45e86d564d014f14cce2620f933
Author: Tom Tromey <tom@tromey.com>
Date:   Fri Mar 24 23:35:02 2023 -0600

    Do more DWARF reading in the background
    
    This patch rearranges the DWARF reader so that more work is done in
    the background.  This is PR symtab/29942.
    
    The idea here is that there is only a small amount of work that must
    be done on the main thread when scanning DWARF -- before the main
    scan, the only part is mapping the section data.
    
    Currently, the DWARF reader uses the quick_symbol_functions "lazy"
    functionality to defer even starting to read.  This patch instead
    changes the reader to start reading immediately, but doing more in
    worker tasks.
    
    Before this patch, "file" on my machine:
    
        (gdb) file /tmp/gdb
        2023-10-23 12:29:56.885 - command started
        Reading symbols from /tmp/gdb...
        2023-10-23 12:29:58.047 - command finished
        Command execution time: 5.867228 (cpu), 1.162444 (wall)
    
    After the patch, more work is done in the background and so this takes
    a bit less time:
    
        (gdb) file /tmp/gdb
        2023-10-23 13:25:51.391 - command started
        Reading symbols from /tmp/gdb...
        2023-10-23 13:25:51.712 - command finished
        Command execution time: 1.894500 (cpu), 0.320306 (wall)
    
    I think this could be further sped up by using the shared library load
    map to avoid objfile loops like the one in expand_symtab_containing_pc
    -- it seems like the correct objfile could be chosen more directly.
    
    Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=29942
    Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=30174
Comment 10 Tom Tromey 2024-01-09 01:48:08 UTC
Fixed.