19828 – 7.11 regression: non-stop gdb -p <process from a container>: internal error

Summary: 7.11 regression: non-stop gdb -p <process from a container>: internal error

Status:	RESOLVED FIXED

Alias:	None

Product:	gdb
Classification:	Unclassified
Component:	gdb
Version:	7.11

Importance:	P2 normal
Target Milestone:	7.11.1
Assignee:	Pedro Alves

Reported:	2016-03-15 20:31 UTC by Jan Kratochvil
Modified:	2021-11-09 08:34 UTC
CC List:	4 users

Attachments
threadit.c (194 bytes, text/plain) 2016-03-15 20:31 UTC, Jan Kratochvil

Description Jan Kratochvil 2016-03-15 20:31:31 UTC

Created attachment 9097
threadit.c

398e081380a204e3b9fb4eb4da069ccf471f930e is the first bad commit
commit 398e081380a204e3b9fb4eb4da069ccf471f930e
Author: Pedro Alves <palves@redhat.com>
Date:   Wed Sep 30 19:23:39 2015 +0100
    x86/Linux: reenable all-stop on top of non-stop

# docker run -ti -v /root:/root docker.io/centos bash
Inside docker:
# /root/threadit

Outside docker:
# gdb -p `pidof threadit`
warning: Expected absolute pathname for libpthread in the inferior, but got target:/lib64/libpthread.so.0.
warning: Unable to find libthread_db matching inferior's thread library, thread debugging will not be available.
Reading symbols from target:/lib64/libc.so.6...(no debugging symbols found)...done.
Reading symbols from target:/lib64/ld-linux-x86-64.so.2...(no debugging symbols found)...done.
warning: Target and debugger are in different PID namespaces; thread lists and other data are likely unreliable
warning: Expected absolute pathname for libpthread in the inferior, but got target:/lib64/libpthread.so.0.
warning: Unable to find libthread_db matching inferior's thread library, thread debugging will not be available.
0x00007fdc1026cef7 in pthread_join () from target:/lib64/libpthread.so.0
(gdb) q
A debugging session is active.
	Inferior 1 [process 7483] will be detached.
Quit anyway? (y or n) y
thread.c:980: internal-error: is_executing: Assertion `tp' failed.
A problem internal to GDB has been detected,
further debugging may prove unreliable.
Quit this debugging session? (y or n) _

Bug #1: The internal error should not happen.
Bug #2: Neither 'y' nor 'n' can be entered at the prompt; GDB does not respond.

I am aware this is not the correct way to attach to containerized multi-threaded processes, but it can happen by mistake.

It is not reproducible with a non-threaded process.

I am attaching the sample multi-threaded program I used to reproduce this.
I guess any multi-threaded process would suffice.
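
The "different PID namespaces" warning in the transcript above can be checked by hand before attaching. The following is a minimal sketch, assuming a Linux /proc in which /proc/PID/ns/pid is a symlink whose target names the namespace. The helper names and the readlink parameter are hypothetical and are not part of GDB.

```python
import os


def pid_namespace_id(pid, readlink=os.readlink):
    """Return the target of /proc/PID/ns/pid, e.g. 'pid:[4026531836]'."""
    return readlink("/proc/%d/ns/pid" % pid)


def same_pid_namespace(pid_a, pid_b, readlink=os.readlink):
    """True if both processes report the same PID namespace identifier."""
    return pid_namespace_id(pid_a, readlink) == pid_namespace_id(pid_b, readlink)
```

If same_pid_namespace(os.getpid(), target_pid) returns False, the debugger and the target are probably in different PID namespaces. Reading another process's namespace link may require sufficient privileges.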

Comment 1 Sourceware Commits 2016-04-12 16:12:36 UTC

The master branch has been updated by Pedro Alves <palves@sourceware.org>:

https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;h=651ce16aa7c2bd5e9f634e91e73790dc3e01a5c0

commit 651ce16aa7c2bd5e9f634e91e73790dc3e01a5c0
Author: Pedro Alves <palves@redhat.com>
Date:   Tue Apr 12 16:49:32 2016 +0100

    Do target_terminal_ours in query & friends instead of in all callers
    
    Any time a caller calls query & friends / prompt_for_continue without
    ensuring that gdb owns the terminal for input is a bug.  So do that in
    defaulted_query / prompt_for_continue directly instead.
    
    An example of a case where we currently miss calling
    target_terminal_ours is internal_error.  Ever since defaulted_query
    was made to use gdb_readline_callback, there's no way to answer the
    internal error query if the internal error happens while the target
    has the terminal:
    
      (gdb) c
      Continuing.
      .../src/gdb/linux-nat.c:1676: internal-error: linux_nat_resume: Assertion `dummy_counter < 10' failed.
      A problem internal to GDB has been detected,
      further debugging may prove unreliable.
      Quit this debugging session? (y or n) _
    
    Entering 'y' or 'n' does not work, GDB does not respond.
    
    gdb/ChangeLog:
    2016-04-12  Pedro Alves  <palves@redhat.com>
    
    	PR gdb/19828
    	* gnu-nat.c (inf_validate_task_sc): Don't call
    	target_terminal_ours / target_terminal_inferior around query.
    	* i386-tdep.c (i386_record_lea_modrm, i386_process_record): Don't
    	call target_terminal_ours / target_terminal_inferior around
    	yquery.
    	* linux-record.c (record_linux_system_call): Don't call
    	target_terminal_ours / target_terminal_inferior around yquery.
    	* nto-procfs.c (interrupt_query): Don't call target_terminal_ours
    	/ target_terminal_inferior around query.
    	* record-full.c (record_full_check_insn_num): Remove
    	'set_terminal' parameter.  Don't call target_terminal_ours /
    	target_terminal_inferior around query.
    	(record_full_message, record_full_registers_change)
    	(record_full_xfer_partial): Adjust.
    	* remote.c (interrupt_query): Don't call target_terminal_ours /
    	target_terminal_inferior around query.
    	* utils.c (defaulted_query): Install cleanup to restore target
    	terminal.  Put target_terminal_ours_for_output in effect while
    	producing output, and target_terminal_ours in effect while
    	handling input.
    	(prompt_for_continue): Install cleanup to restore target terminal.
    	Put target_terminal_ours in effect while handling input.
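
The idea in the commit above (take the terminal inside the query function and restore the previous owner afterwards, so no caller can forget) can be modelled in a few lines. This is a toy sketch, not GDB's C code: the Terminal class, terminal_ours and the read_answer parameter are hypothetical stand-ins, and a context manager plays the role of the cleanup the ChangeLog mentions.

```python
from contextlib import contextmanager


class Terminal:
    """Toy stand-in for the terminal shared by the debugger and the inferior."""

    def __init__(self):
        self.owner = "inferior"


@contextmanager
def terminal_ours(term):
    """Give the terminal to the debugger, then restore the previous owner."""
    previous = term.owner
    term.owner = "gdb"
    try:
        yield term
    finally:
        term.owner = previous


def query(term, prompt, read_answer):
    """Ask a yes/no question; take the terminal here so callers need not."""
    with terminal_ours(term):
        # Input can only be read while the debugger owns the terminal.
        assert term.owner == "gdb"
        return read_answer(prompt).strip().lower().startswith("y")
```

Because ownership is handled in one place, a path such as an internal error raised while the inferior owns the terminal still gets a prompt that can be answered.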

Comment 2 Pedro Alves 2016-05-19 15:53:59 UTC

For master:

 [PATCH 0/6] Fix PR gdb/19828 (attach -> internal error) and attach optimizations
 https://sourceware.org/ml/gdb-patches/2016-05/msg00335.html

For 7.11.1, maybe:

 [PATCH/7.11.1?] Simpler fix PR gdb/19828: gdb -p <process from a container>: internal error
 https://sourceware.org/ml/gdb-patches/2016-05/msg00335.html

Comment 3 Sourceware Commits 2016-05-24 13:55:25 UTC

The master branch has been updated by Pedro Alves <palves@sourceware.org>:

https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;h=aa01bd3689d204ce3d657cf7eb17b8343d79a080

commit aa01bd3689d204ce3d657cf7eb17b8343d79a080
Author: Pedro Alves <palves@redhat.com>
Date:   Tue May 24 14:47:56 2016 +0100

    Linux native thread create/exit events support
    
    A following patch (fix for gdb/19828) makes linux-nat.c add threads to
    GDB's thread list earlier in the "attach" sequence, and that causes a
    surprising regression on
    gdb.threads/attach-many-short-lived-threads.exp on my machine.  The
    extra "thread x exited" handling and traffic slows down that test
    enough that GDB core has trouble keeping up with new threads that are
    spawned while trying to stop existing ones.
    
    I saw the exact same issue with remote/gdbserver a while ago and fixed
    it in 65706a29bac5 (Remote thread create/exit events) so part of the
    fix here is the exact same -- add support for thread created events to
    gdb/linux-nat.c.  infrun.c:stop_all_threads enables those events when
    it tries to stop threads, which ensures that new threads never get a
    chance to themselves start new threads, thus fixing the race.
    
    gdb/
    2016-05-24  Pedro Alves  <palves@redhat.com>
    
    	PR gdb/19828
    	* linux-nat.c (report_thread_events): New global.
    	(linux_handle_extended_wait): Report
    	TARGET_WAITKIND_THREAD_CREATED if thread event reporting is
    	enabled.
    	(wait_lwp, linux_nat_filter_event): Report all thread exits if
    	thread event reporting is enabled.  Remove comment.
    	(filter_exit_event): New function.
    	(linux_nat_wait_1): Use it.
    	(linux_nat_thread_events): New function.
    	(linux_nat_add_target): Install it as target_thread_events method.

Comment 4 Sourceware Commits 2016-05-24 13:55:31 UTC

The master branch has been updated by Pedro Alves <palves@sourceware.org>:

https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;h=95e94c3f18aaf34fadcd9a2a882ffe6147b9acc3

commit 95e94c3f18aaf34fadcd9a2a882ffe6147b9acc3
Author: Pedro Alves <palves@redhat.com>
Date:   Tue May 24 14:47:56 2016 +0100

    [Linux] Read vDSO range from /proc/PID/task/PID/maps instead of /proc/PID/maps
    
    ... as it's _much_ faster.
    
    Hacking the gdb.threads/attach-many-short-lived-threads.exp test to
    spawn thousands of threads instead of dozens to stress and debug
    timeout problems with gdb.threads/attach-many-short-lived-threads.exp,
    I saw that GDB would spend several seconds just reading the
    /proc/PID/smaps file, to determine the vDSO mapping range.  GDB opens
    and reads the whole file just once, and caches the result, but even
    that is too slow.  For example, with almost 8000 threads:
    
     $ ls /proc/3518/task/ | wc -l
     7906
    
    reading the /proc/PID/smaps file grepping for "vdso" takes over 15
    seconds:
    
     $ time cat /proc/3518/smaps | grep vdso
     7ffdbafee000-7ffdbaff0000 r-xp 00000000 00:00 0                          [vdso]
    
     real    0m15.371s
     user    0m0.008s
     sys     0m15.017s
    
    Looking around the web for hints, I found a nice description of the
    issue here:
    
     http://backtrace.io/blog/blog/2014/11/12/large-thread-counts-and-slow-process-maps/
    
    The problem is that /proc/PID/smaps wants to show the mappings as
    being thread stack, and that has the kernel iterating over all threads
    in the thread group, for each mapping.
    
    The fix is to use the "smaps" file under /proc/PID/task/PID/ instead of
    the /proc/PID/ one, as the former doesn't mark thread stacks for all
    threads.
    
    That alone drops the timing to the millisecond range on my machine:
    
     $ time cat /proc/3518/task/3518/smaps | grep vdso
     7ffdbafee000-7ffdbaff0000 r-xp 00000000 00:00 0                          [vdso]
    
     real    0m0.150s
     user    0m0.009s
     sys     0m0.084s
    
    And since we only need the vdso mapping's address range, we can use
    "maps" file instead of "smaps", and it's even cheaper:
    
    /proc/PID/task/PID/maps :
    
     $ time cat /proc/3518/task/3518/maps | grep vdso
     7ffdbafee000-7ffdbaff0000 r-xp 00000000 00:00 0                          [vdso]
    
     real    0m0.027s
     user    0m0.000s
     sys     0m0.017s
    
    gdb/ChangeLog:
    2016-05-24  Pedro Alves  <palves@redhat.com>
    
    	PR gdb/19828
    	* linux-tdep.c (find_mapping_size): Delete.
    	(linux_vsyscall_range_raw): Rewrite reading from
    	/proc/PID/task/PID/maps directly instead of using
    	gdbarch_find_memory_regions.
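
The lookup described above amounts to scanning the per-task maps file for the "[vdso]" line and parsing its address range. A minimal sketch follows, assuming the usual maps line layout (range, permissions, offset, device, inode, optional pathname). The function names are hypothetical and this is not the code in linux-tdep.c.

```python
def find_vdso_range(maps_text):
    """Return (start, end) of the [vdso] mapping, or None if absent."""
    for line in maps_text.splitlines():
        fields = line.split()
        # A maps line is: range perms offset dev inode [pathname]
        if len(fields) >= 6 and fields[5] == "[vdso]":
            start, end = fields[0].split("-")
            return int(start, 16), int(end, 16)
    return None


def read_vdso_range(pid):
    """Read the per-task maps file of the thread-group leader."""
    with open("/proc/%d/task/%d/maps" % (pid, pid)) as f:
        return find_vdso_range(f.read())
```

On a Linux host, read_vdso_range(pid) applies the parser to a live process; find_vdso_range can be exercised on the sample line shown in the commit message.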

Comment 5 Sourceware Commits 2016-05-24 13:55:36 UTC

The master branch has been updated by Pedro Alves <palves@sourceware.org>:

https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;h=1ad3de988d2f41c72de66613c68ed78507a3abbd

commit 1ad3de988d2f41c72de66613c68ed78507a3abbd
Author: Pedro Alves <palves@redhat.com>
Date:   Tue May 24 14:47:57 2016 +0100

    [Linux] Avoid refetching core-of-thread if thread hasn't run
    
    Hacking the gdb.threads/attach-many-short-lived-threads.exp test to
    spawn thousands of threads instead of dozens, I saw GDB having trouble
    keeping up with threads being spawned too fast, when it tried to stop
    them all.  This was because while gdb is doing that, it updates the
    thread list to make sure no new thread has sneaked in that might need
    to be paused.  It does this a few times until it sees no-new-threads
    twice in a row.  The thread listing update itself is not that
    expensive, however, in the Linux backend, updating the threads list
    calls linux_common_core_of_thread for each LWP to record on which core
    each LWP was last seen running, which opens/reads/closes a /proc file
    for each LWP which becomes expensive when you need to do it for
    thousands of LWPs.
    
    perf shows gdb in linux_common_core_of_thread 44% of the time, in the
    stop_all_threads -> update_thread_list path in this use case.
    
    This patch simply makes linux_common_core_of_thread avoid updating the
    core the thread is bound to if the thread hasn't run since the last
    time we updated that info.  This makes linux_common_core_of_thread
    disappear into the noise in the perf report.
    
    gdb/ChangeLog:
    2016-05-24  Pedro Alves  <palves@redhat.com>
    
    	PR gdb/19828
    	* linux-nat.c (linux_resume_one_lwp_throw): Clear the LWP's core
    	field.
    	(linux_nat_update_thread_list): Don't fetch the core if already
    	known.
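
The caching rule above can be illustrated with a toy model: the core is fetched only when it is unknown, and resuming an LWP marks it unknown again. The Lwp class, the -1 sentinel and the fetch_core callback are assumptions of this sketch, not GDB's actual data structures.

```python
class Lwp:
    """Toy LWP record; core is -1 while the last-run core is unknown."""

    def __init__(self, lwpid):
        self.lwpid = lwpid
        self.core = -1


def resume_lwp(lwp):
    """Resuming invalidates the cached core: the thread may migrate."""
    lwp.core = -1


def update_thread_list(lwps, fetch_core):
    """Refresh cores, calling the expensive fetch only where it is unknown."""
    for lwp in lwps:
        if lwp.core == -1:
            lwp.core = fetch_core(lwp.lwpid)
```

With thousands of stopped LWPs, repeated thread-list updates then cost one /proc read per LWP in total rather than one per LWP per update.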

Comment 6 Sourceware Commits 2016-05-24 13:55:42 UTC

The master branch has been updated by Pedro Alves <palves@sourceware.org>:

https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;h=774113b02f41ded4d9ba4d18571ee5024312ad1b

commit 774113b02f41ded4d9ba4d18571ee5024312ad1b
Author: Pedro Alves <palves@redhat.com>
Date:   Tue May 24 14:47:57 2016 +0100

    [Linux] Optimize PID -> struct lwp_info lookup
    
    Hacking the gdb.threads/attach-many-short-lived-threads.exp test to
    spawn thousands of threads instead of dozens, and running gdb under
    perf, I saw that GDB was spending most of the time in find_lwp_pid:
    
       - captured_main
          - 93.61% catch_command_errors
             - 87.41% attach_command
                - 87.40% linux_nat_attach
                   - 87.40% linux_proc_attach_tgid_threads
                      - 82.38% attach_proc_task_lwp_callback
                         - 81.01% find_lwp_pid
                              5.30% ptid_get_lwp
                            + 0.10% ptid_lwp_p
                         + 0.64% add_thread
                         + 0.26% set_running
                         + 0.24% set_executing
                           0.12% ptid_get_lwp
                         + 0.01% ptrace
                         + 0.01% add_lwp
    
    attach_proc_task_lwp_callback is called once for each LWP that we
    attach to, found by listing the /proc/PID/task/ directory.  In turn,
    attach_proc_task_lwp_callback calls find_lwp_pid to check whether the
    LWP we're about to try to attach to is already known.  Since
    find_lwp_pid does a linear walk over the whole LWP list, this becomes
    quadratic.  We do the /proc/PID/task/ listing until we get two
    iterations in a row where we found no new threads.  So the second and
    following times we walk the /proc/PID/task/ dir, we're going to take
    an even worse find_lwp_pid hit.
    
    Fix this by adding a hash table keyed by LWP PID, for fast lookup.
    
    The linked list embedded in the LWP structure itself is kept, and made
    a double-linked list, so that removals from that list are O(1).  An
    earlier version of this patch got rid of this list altogether, but
    that revealed hidden dependencies / assumptions on how the list is
    sorted.  For example, killing a process and then waiting for all the
    LWPs status using iterate_over_lwps only works as is because the
    leader LWP is always last in the list.  So I thought it better to take
    an incremental approach and make this patch concern itself _only_ with
    the PID lookup optimization.
    
    gdb/ChangeLog:
    2016-05-24  Pedro Alves  <palves@redhat.com>
    
    	PR gdb/19828
    	* linux-nat.c (lwp_lwpid_htab): New htab.
    	(lwp_info_hash, lwp_lwpid_htab_eq, lwp_lwpid_htab_create)
    	(lwp_lwpid_htab_add_lwp): New functions.
    	(lwp_list): Tweak comment.
    	(lwp_list_add, lwp_list_remove, lwp_lwpid_htab_remove_pid): New
    	functions.
    	(purge_lwp_list): Rewrite, using htab_traverse_noresize.
    	(add_initial_lwp): Add lwp to htab too.  Use lwp_list_add.
    	(delete_lwp): Use lwp_list_remove.  Remove htab too.
    	(find_lwp_pid): Search in htab.
    	(_initialize_linux_nat): Call lwp_lwpid_htab_create.
    	* linux-nat.h (struct lwp_info) <prev>: New field.
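
The data-structure change above pairs an ordered doubly-linked list with a hash table keyed by LWP id. The sketch below models that shape with a Python dict. The class and method names are hypothetical, and inserting at the head (so that the first LWP added ends up last) is an assumption made to match the "leader LWP is always last" remark.

```python
class LwpInfo:
    def __init__(self, lwpid):
        self.lwpid = lwpid
        self.prev = None
        self.next = None


class LwpTable:
    """LWPs kept in an ordered doubly-linked list plus a dict keyed by LWP id."""

    def __init__(self):
        self.head = None
        self.by_lwpid = {}

    def add(self, lwpid):
        """Insert at the head of the list, so the first LWP added ends up last."""
        lwp = LwpInfo(lwpid)
        lwp.next = self.head
        if self.head is not None:
            self.head.prev = lwp
        self.head = lwp
        self.by_lwpid[lwpid] = lwp
        return lwp

    def find(self, lwpid):
        """Average O(1) lookup instead of a linear walk over the list."""
        return self.by_lwpid.get(lwpid)

    def remove(self, lwpid):
        """O(1) unlink thanks to the prev pointer; returns True if removed."""
        lwp = self.by_lwpid.pop(lwpid, None)
        if lwp is None:
            return False
        if lwp.prev is not None:
            lwp.prev.next = lwp.next
        else:
            self.head = lwp.next
        if lwp.next is not None:
            lwp.next.prev = lwp.prev
        return True

    def __iter__(self):
        node = self.head
        while node is not None:
            yield node
            node = node.next
```

Keeping the list alongside the table preserves iteration order for code that depends on it, while lookups during attach no longer scale with the number of LWPs.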

Comment 7 Sourceware Commits 2016-05-24 13:55:47 UTC

The master branch has been updated by Pedro Alves <palves@sourceware.org>:

https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;h=72b049d38ce85c51fc9f97ee64b00a47be5ebe94

commit 72b049d38ce85c51fc9f97ee64b00a47be5ebe94
Author: Pedro Alves <palves@redhat.com>
Date:   Tue May 24 14:47:57 2016 +0100

    Make gdb/linux-nat.c consider a waitstatus pending on the infrun side
    
    Working on the fix for gdb/19828, I saw
    gdb.threads/attach-many-short-lived-threads.exp fail once in an
    unusual way.  Unfortunately I didn't keep debug logs, but it's an
    issue similar to what's been fixed in remote.c a while ago --
    linux-nat.c was not fetching the pending status from the right place.
    
    gdb/ChangeLog:
    2016-05-24  Pedro Alves  <palves@redhat.com>
    
    	PR gdb/19828
    	* linux-nat.c (get_pending_status): If the thread reported the
    	event to the core and it's pending, use the pending status signal
    	number.

Comment 8 Sourceware Commits 2016-05-24 13:55:53 UTC

The master branch has been updated by Pedro Alves <palves@sourceware.org>:

https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;h=026a91747567565bf2956fae98fed6a958151aab

commit 026a91747567565bf2956fae98fed6a958151aab
Author: Pedro Alves <palves@redhat.com>
Date:   Tue May 24 14:47:57 2016 +0100

    Fix PR gdb/19828: gdb -p <process from a container>: internal error
    
    When GDB attaches to a process, it looks at the /proc/PID/task/ dir
    for all clone threads of that process, and attaches to each of them.
    
    Usually, if there is more than one clone thread, it means the program
    is multi threaded and linked with pthreads.  Thus when GDB soon after
    attaching finds and loads a libthread_db matching the process, it'll
    add a thread to the thread list for each of the initially found
    lower-level LWPs.
    
    If, however, GDB fails to find/load a matching libthread_db, nothing
    is adding the LWPs to the thread list.  And because of that, "detach"
    hits an internal error:
    
      (gdb) PASS: gdb.threads/clone-attach-detach.exp: fg attach 1: attach
      info threads
        Id   Target Id         Frame
      * 1    LWP 6891 "clone-attach-de" 0x00007f87e5fd0790 in __nanosleep_nocancel () at ../sysdeps/unix/syscall-template.S:84
      (gdb) FAIL: gdb.threads/clone-attach-detach.exp: fg attach 1: info threads shows two LWPs
      detach
      .../src/gdb/thread.c:1010: internal-error: is_executing: Assertion `tp' failed.
      A problem internal to GDB has been detected,
      further debugging may prove unreliable.
      Quit this debugging session? (y or n)
      FAIL: gdb.threads/clone-attach-detach.exp: fg attach 1: detach (GDB internal error)
    
    From here:
    
      ...
      #8  0x00000000007ba7cc in internal_error (file=0x98ea68 ".../src/gdb/thread.c", line=1010, fmt=0x98ea30 "%s: Assertion `%s' failed.")
          at .../src/gdb/common/errors.c:55
      #9  0x000000000064bb83 in is_executing (ptid=...) at .../src/gdb/thread.c:1010
      #10 0x00000000004c23bb in get_pending_status (lp=0x12c5cc0, status=0x7fffffffdc0c) at .../src/gdb/linux-nat.c:1235
      #11 0x00000000004c2738 in detach_callback (lp=0x12c5cc0, data=0x0) at .../src/gdb/linux-nat.c:1317
      #12 0x00000000004c1a2a in iterate_over_lwps (filter=..., callback=0x4c2599 <detach_callback>, data=0x0) at .../src/gdb/linux-nat.c:899
      #13 0x00000000004c295c in linux_nat_detach (ops=0xe7bd30, args=0x0, from_tty=1) at .../src/gdb/linux-nat.c:1358
      #14 0x000000000068284d in delegate_detach (self=0xe7bd30, arg1=0x0, arg2=1) at .../src/gdb/target-delegates.c:34
      #15 0x0000000000694141 in target_detach (args=0x0, from_tty=1) at .../src/gdb/target.c:2241
      #16 0x0000000000630582 in detach_command (args=0x0, from_tty=1) at .../src/gdb/infcmd.c:2975
      ...
    
    Tested on x86-64 Fedora 23.  Also confirmed the test passes against
    gdbserver with "maint set target-non-stop".
    
    gdb/ChangeLog:
    2016-05-24  Pedro Alves  <palves@redhat.com>
    
    	PR gdb/19828
    	* linux-nat.c (attach_proc_task_lwp_callback): Mark the lwp
    	resumed, and add the thread to GDB's thread list.
    
    testsuite/ChangeLog:
    2016-05-24  Pedro Alves  <palves@redhat.com>
    
    	PR gdb/19828
    	* gdb.threads/clone-attach-detach.c: New file.
    	* gdb.threads/clone-attach-detach.exp: New file.
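
The attach sequence described above, combined with the "two iterations in a row" rule mentioned in the PID lookup commit, can be sketched as a loop that records every attached LWP in the thread list immediately rather than waiting for libthread_db. This is a simplified model: attach_all_lwps and its two callbacks are hypothetical names, not functions in linux-nat.c.

```python
def attach_all_lwps(list_task_dir, attach_lwp):
    """Rescan the task directory until two passes in a row find nothing new.

    list_task_dir() returns the current LWP ids; attach_lwp(lwpid) attaches
    to one LWP.  Every attached LWP is also recorded in the thread list.
    """
    thread_list = []
    known = set()
    quiet_passes = 0
    while quiet_passes < 2:
        found_new = False
        for lwpid in list_task_dir():
            if lwpid not in known:
                attach_lwp(lwpid)
                known.add(lwpid)
                thread_list.append(lwpid)
                found_new = True
        quiet_passes = 0 if found_new else quiet_passes + 1
    return thread_list
```

In a real setting list_task_dir would list the entries of /proc/PID/task/ and attach_lwp would issue the ptrace attach. Since the thread list is filled during the scan, a later detach can find a thread entry for every LWP even when no libthread_db was loaded.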

Comment 9 Sourceware Commits 2016-05-25 17:36:39 UTC

The gdb-7.11-branch branch has been updated by Pedro Alves <palves@sourceware.org>:

https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;h=a0de87e7be6a58dfeb9bfb00172dbd975dabb72e

commit a0de87e7be6a58dfeb9bfb00172dbd975dabb72e
Author: Pedro Alves <palves@redhat.com>
Date:   Wed May 25 18:35:09 2016 +0100

    Make gdb/linux-nat.c consider a waitstatus pending on the infrun side
    
    Working on the fix for gdb/19828, I saw
    gdb.threads/attach-many-short-lived-threads.exp fail once in an
    unusual way.  Unfortunately I didn't keep debug logs, but it's an
    issue similar to what's been fixed in remote.c a while ago --
    linux-nat.c was not fetching the pending status from the right place.
    
    gdb/ChangeLog:
    2016-05-25  Pedro Alves  <palves@redhat.com>
    
    	PR gdb/19828
    	* linux-nat.c (get_pending_status): If the thread reported the
    	event to the core and it's pending, use the pending status signal
    	number.

Comment 10 Sourceware Commits 2016-05-25 17:36:44 UTC

The gdb-7.11-branch branch has been updated by Pedro Alves <palves@sourceware.org>:

https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;h=136613ef0c6850427317e57be1b644080ff6decb

commit 136613ef0c6850427317e57be1b644080ff6decb
Author: Pedro Alves <palves@redhat.com>
Date:   Wed May 25 18:35:09 2016 +0100

    Fix PR gdb/19828: gdb -p <process from a container>: internal error
    
    When GDB attaches to a process, it looks at the /proc/PID/task/ dir
    for all clone threads of that process, and attaches to each of them.
    
    Usually, if there is more than one clone thread, it means the program
    is multi threaded and linked with pthreads.  Thus when GDB soon after
    attaching finds and loads a libthread_db matching the process, it'll
    add a thread to the thread list for each of the initially found
    lower-level LWPs.
    
    If, however, GDB fails to find/load a matching libthread_db, nothing
    is adding the LWPs to the thread list.  And because of that, "detach"
    hits an internal error:
    
      (gdb) PASS: gdb.threads/clone-attach-detach.exp: fg attach 1: attach
      info threads
        Id   Target Id         Frame
      * 1    LWP 6891 "clone-attach-de" 0x00007f87e5fd0790 in __nanosleep_nocancel () at ../sysdeps/unix/syscall-template.S:84
      (gdb) FAIL: gdb.threads/clone-attach-detach.exp: fg attach 1: info threads shows two LWPs
      detach
      .../src/gdb/thread.c:1010: internal-error: is_executing: Assertion `tp' failed.
      A problem internal to GDB has been detected,
      further debugging may prove unreliable.
      Quit this debugging session? (y or n)
      FAIL: gdb.threads/clone-attach-detach.exp: fg attach 1: detach (GDB internal error)
    
    From here:
    
      ...
      #8  0x00000000007ba7cc in internal_error (file=0x98ea68 ".../src/gdb/thread.c", line=1010, fmt=0x98ea30 "%s: Assertion `%s' failed.")
          at .../src/gdb/common/errors.c:55
      #9  0x000000000064bb83 in is_executing (ptid=...) at .../src/gdb/thread.c:1010
      #10 0x00000000004c23bb in get_pending_status (lp=0x12c5cc0, status=0x7fffffffdc0c) at .../src/gdb/linux-nat.c:1235
      #11 0x00000000004c2738 in detach_callback (lp=0x12c5cc0, data=0x0) at .../src/gdb/linux-nat.c:1317
      #12 0x00000000004c1a2a in iterate_over_lwps (filter=..., callback=0x4c2599 <detach_callback>, data=0x0) at .../src/gdb/linux-nat.c:899
      #13 0x00000000004c295c in linux_nat_detach (ops=0xe7bd30, args=0x0, from_tty=1) at .../src/gdb/linux-nat.c:1358
      #14 0x000000000068284d in delegate_detach (self=0xe7bd30, arg1=0x0, arg2=1) at .../src/gdb/target-delegates.c:34
      #15 0x0000000000694141 in target_detach (args=0x0, from_tty=1) at .../src/gdb/target.c:2241
      #16 0x0000000000630582 in detach_command (args=0x0, from_tty=1) at .../src/gdb/infcmd.c:2975
      ...
    
    Tested on x86-64 Fedora 23.  Also confirmed the test passes against
    gdbserver with "maint set target-non-stop".
    
    Unfortunately, making GDB add LWPs to the thread list sooner exposes
    inefficiencies that in turn result in
    gdb.threads/attach-many-short-lived-threads.exp timing out frequently.
    Since that testcase is really a contrived use case designed to stress
    some aspects of attach/detach and thread listing, not really
    representative of real programs, this commit disables the test.
    
    gdb/ChangeLog:
    2016-05-25  Pedro Alves  <palves@redhat.com>
    
    	PR gdb/19828
    	* linux-nat.c (attach_proc_task_lwp_callback): Mark the lwp
    	resumed, and add the thread to GDB's thread list.
    
    testsuite/ChangeLog:
    2016-05-25  Pedro Alves  <palves@redhat.com>
    
    	PR gdb/19828
    	* gdb.threads/clone-attach-detach.c: New file.
    	* gdb.threads/clone-attach-detach.exp: New file.
    	* gdb.threads/attach-many-short-lived-threads.exp: Skip.

Comment 11 Pedro Alves 2016-05-25 17:37:19 UTC

Fixed, master and 7.11.1.

Comment 12 Sourceware Commits 2016-09-29 15:41:15 UTC

The master branch has been updated by Jan Kratochvil <jkratoch@sourceware.org>:

https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;h=bb805577d2b212411fb7b0a2d01644567fac4e8d

commit bb805577d2b212411fb7b0a2d01644567fac4e8d
Author: Jan Kratochvil <jan.kratochvil@redhat.com>
Date:   Thu Sep 29 17:38:16 2016 +0200

    PR gdb/20609 - attach of JIT-debug-enabled inf 7.11.1 regression
    
    Regression: gdb --pid $(pidof qemu-system-x86_64) stopped working with gdb 7.11.1
    https://sourceware.org/bugzilla/show_bug.cgi?id=20609
    
    It was reported for qemu-system-x86_64 but it happens for any multithreaded
    inferior with a JIT debugging hook.
    
    136613ef0c6850427317e57be1b644080ff6decb is the first bad commit
    Author: Pedro Alves <palves@redhat.com>
        Fix PR gdb/19828: gdb -p <process from a container>: internal error
    Message-ID: <cbdf2e04-4fa8-872a-2a23-08c9c1b26e00@redhat.com>
    https://sourceware.org/ml/gdb-patches/2016-05/msg00450.html
    
    jit_breakpoint_re_set() is special in that it tries to insert a breakpoint
    into the main executable, not into a shared library.  During attach, GDB
    thinks it needs 'breakpoint always-inserted' behavior from
    breakpoints_should_be_inserted_now(), because a newly attached thread is
    'thread_info->executing' while 'lwp_info->must_set_ptrace_flags' is set and
    the task has not yet stopped.  This did not happen before the 'bad commit'
    above, which added tracking of such threads.
    
    GDB then fails to insert the breakpoints, because their addresses are
    invalid until the PIE executable is relocated during a later phase of the
    attach.  One can see this in the backtraces below:
     -> jit_breakpoint_re_set_internal()
    later:
     -> svr4_exec_displacement()
    
    One can suppress the initial breakpoint_re_set() call as there will be another
    breakpoint_re_set() done from the final post_create_inferior() call in
    setup_inferior().
    
    BTW additionally 'threads_executing' cache bool is somehow stale (somewhere is
    missing update_threads_executing()).  I was trying to deal with that in my
    first/second attempt below but in my final third attempt (attached) I have
    left it as it is.
    
    First attempt trying not to falsely require 'breakpoint always-inserted':
      https://people.redhat.com/jkratoch/rhbz1375553-fix1.patch
    Reduced first attempt:
      https://people.redhat.com/jkratoch/rhbz1375553-fix2.patch
    
    The third attempt suppresses breakpoint insertion until PIE executable gets
    relocated by svr4_exec_displacement().  Applied.
    
    gdb/ChangeLog
    2016-09-29  Jan Kratochvil  <jan.kratochvil@redhat.com>
    
    	PR gdb/20609 - attach of JIT-debug-enabled inf 7.11.1 regression
    	* exec.c (exec_file_locate_attach): Add parameter defer_bp_reset.
    	Use it.
    	* gdbcore.h (exec_file_locate_attach): Add parameter defer_bp_reset.
    	* infcmd.c (setup_inferior): Update caller.
    	* remote.c (remote_add_inferior): Likewise.
    
    gdb/testsuite/ChangeLog
    2016-09-29  Jan Kratochvil  <jan.kratochvil@redhat.com>
    
    	PR gdb/20609 - attach of JIT-debug-enabled inf 7.11.1 regression
    	* gdb.base/jit-attach-pie.c: New file.
    	* gdb.base/jit-attach-pie.exp: New file.

Comment 13 Sourceware Commits 2016-09-29 15:44:17 UTC

The gdb-7.12-branch branch has been updated by Jan Kratochvil <jkratoch@sourceware.org>:

https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;h=b31567caa5d7ed8e9ad69f59a562c0494c0b3cbe

commit b31567caa5d7ed8e9ad69f59a562c0494c0b3cbe
Author: Jan Kratochvil <jan.kratochvil@redhat.com>
Date:   Thu Sep 29 17:38:16 2016 +0200

    PR gdb/20609 - attach of JIT-debug-enabled inf 7.11.1 regression
    
    Regression: gdb --pid $(pidof qemu-system-x86_64) stopped working with gdb 7.11.1
    https://sourceware.org/bugzilla/show_bug.cgi?id=20609
    
    It was reported for qemu-system-x86_64 but it happens for any multithreaded
    inferior with a JIT debugging hook.
    
    136613ef0c6850427317e57be1b644080ff6decb is the first bad commit
    Author: Pedro Alves <palves@redhat.com>
        Fix PR gdb/19828: gdb -p <process from a container>: internal error
    Message-ID: <cbdf2e04-4fa8-872a-2a23-08c9c1b26e00@redhat.com>
    https://sourceware.org/ml/gdb-patches/2016-05/msg00450.html
    
    jit_breakpoint_re_set() is special in that it tries to insert a breakpoint
    into the main executable, not into a shared library.  During attach, GDB
    thinks it needs 'breakpoint always-inserted' behavior from
    breakpoints_should_be_inserted_now(), because a newly attached thread is
    'thread_info->executing' while 'lwp_info->must_set_ptrace_flags' is set and
    the task has not yet stopped.  This did not happen before the 'bad commit'
    above, which added tracking of such threads.
    
    GDB then fails to insert the breakpoints, because their addresses are
    invalid until the PIE executable is relocated during a later phase of the
    attach.  One can see this in the backtraces below:
     -> jit_breakpoint_re_set_internal()
    later:
     -> svr4_exec_displacement()
    
    One can suppress the initial breakpoint_re_set() call as there will be another
    breakpoint_re_set() done from the final post_create_inferior() call in
    setup_inferior().
    
    BTW additionally 'threads_executing' cache bool is somehow stale (somewhere is
    missing update_threads_executing()).  I was trying to deal with that in my
    first/second attempt below but in my final third attempt (attached) I have
    left it as it is.
    
    First attempt trying not to falsely require 'breakpoint always-inserted':
      https://people.redhat.com/jkratoch/rhbz1375553-fix1.patch
    Reduced first attempt:
      https://people.redhat.com/jkratoch/rhbz1375553-fix2.patch
    
    The third attempt suppresses breakpoint insertion until PIE executable gets
    relocated by svr4_exec_displacement().  Applied.
    
    gdb/ChangeLog
    2016-09-29  Jan Kratochvil  <jan.kratochvil@redhat.com>
    
    	PR gdb/20609 - attach of JIT-debug-enabled inf 7.11.1 regression
    	* exec.c (exec_file_locate_attach): Add parameter defer_bp_reset.
    	Use it.
    	* gdbcore.h (exec_file_locate_attach): Add parameter defer_bp_reset.
    	* infcmd.c (setup_inferior): Update caller.
    	* remote.c (remote_add_inferior): Likewise.
    
    gdb/testsuite/ChangeLog
    2016-09-29  Jan Kratochvil  <jan.kratochvil@redhat.com>
    
    	PR gdb/20609 - attach of JIT-debug-enabled inf 7.11.1 regression
    	* gdb.base/jit-attach-pie.c: New file.
    	* gdb.base/jit-attach-pie.exp: New file.
