Created attachment 7788 [details] Relevant GDB log.

Hello,

I am developing a homebrew gdb stub for ReactOS. The code can be found here: http://git.reactos.org/?p=reactos.git;a=tree;f=reactos/drivers/base/kdgdb

As this is a kernel debugger, it is per se a multiprocess environment, and it is advertised as such to GDB at connection time.

"info thread" triggers the following assert:

    thread.c:1002: internal-error: switch_to_thread: Assertion `inf != NULL' failed.

Attached are the relevant log with "debug remote" set to 1 and the core dump taken when the failed assertion is raised. To reduce noise, I didn't load any symbol files when producing the log (results are similar with them).

GDB 7.8 was compiled from source with:

    ./configure --target=i686-w64-mingw32 --with-expat

I am ready and willing to give further details on this issue, as it hinders further use of GDB in this environment.

Best regards,
Jérôme
Created attachment 7789 [details] Core dump at the time of the crash.
Created attachment 8149 [details] GDB backtrace

Here is a relevant backtrace from the crashed GDB session.

    i686-w64-mingw32-gdb --version
    GNU gdb (GDB) 7.8
    Copyright (C) 2014 Free Software Foundation, Inc.
    License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
    This is free software: you are free to change and redistribute it.
    There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
    and "show warranty" for details.
    This GDB was configured as "--host=x86_64-unknown-linux-gnu --target=i686-w64-mingw32".
    Type "show configuration" for configuration details.
    For bug reporting instructions, please see:
    <http://www.gnu.org/software/gdb/bugs/>.
    Find the GDB manual and other documentation resources online at:
    <http://www.gnu.org/software/gdb/documentation/>.
    For help, type "help".
    Type "apropos word" to search for commands related to "word".
Created attachment 8150 [details] GDB log triggering the crash
I suspect this is fixed in more recent GDB. Could you try mainline?

Also, can you debug GDB a little? What was the ptid of the thread that GDB was expecting to find in the thread list, but isn't there? And why isn't it there anymore?

The ptid is unfortunately not visible here (frame #6):

    ...
    #4  0x00000000006336b9 in internal_verror (file=<optimized out>, line=<optimized out>, fmt=<optimized out>, ap=ap@entry=0x7fff7cd84ac8) at utils.c:803
    #5  0x0000000000633762 in internal_error (file=file@entry=0x771a62 "thread.c", line=line@entry=1002, string=<optimized out>) at utils.c:813
    #6  0x000000000056a6df in switch_to_thread (ptid=...) at thread.c:1002
    #7  0x000000000056b607 in print_thread_info (uiout=0x2233590, requested_threads=0x226f8e0 "1", pid=-1) at thread.c:928
    #8  0x00000000004aa601 in mi_cmd_execute (parse=0x2228820) at ./mi/mi-main.c:2253
    #9  captured_mi_execute_command (context=0x2228820, uiout=0x2233590) at ./mi/mi-main.c:1988
    #10 mi_execute_command (cmd=0x22701c0 "23-thread-info 1", from_tty=<optimized out>) at ./mi/mi-main.c:2116
    ...

Please get a backtrace with "set print frame-arguments all".
Created attachment 8151 [details] GDB 7.9 crash log

I tried with GDB 7.9, built with debugging (-g3) enabled and optimizations disabled. An assert still fails, but it is quite different this time. I don't know whether this is a different symptom of the same bug or another bug.

Here is the log triggering the crash.

I will try to compile mainline as soon as I can.
Created attachment 8152 [details] GDB 7.9 crash dump GDB 7.9 core analysis. Please let me know if you need some more info from a particular frame.
Created attachment 8153 [details] GDB 7.9 log triggering the crash Sorry, attached wrong file.
I've tried today's git (commit ID 96c20bc18d71ca5ae3335d48ff2b459d495032d3), the same problem occurs.
Thanks. I can reproduce this with GNU/Linux gdbserver.

On one shell:

    $ gdbserver --multi :9999 a.out

On another, connect with extended-remote, so we can start two processes under the same gdbserver:

    $ gdb a.out -ex "tar extended-remote :9999"
    ...
    GNU gdb (GDB) 7.9.50.20150318-cvs
    ...
    Reading symbols from ./a.out...done.
    Remote debugging using :9999
    ...
    0x0000003615a011f0 in _start () from /lib64/ld-linux-x86-64.so.2
    (gdb) info inferiors
      Num  Description       Executable
    * 1    process 24970     a.out
    (gdb) add-inferior
    Added inferior 2
    (gdb) inferior 2
    [Switching to inferior 2 [<null>] (<noexec>)]
    (gdb) info inferiors
      Num  Description       Executable
    * 2    <null>
      1    process 24970     a.out
    (gdb) file a.out
    Reading symbols from a.out...done.
    (gdb) info inferiors
      Num  Description       Executable
    * 2    <null>            a.out
      1    process 24970     a.out
    (gdb) start
    Temporary breakpoint 1 at 0x411b67: main. (2 locations)
    Starting program: /home/pedro/a.out

    Temporary breakpoint 1, main (argc=1, argv=0x7fffffffd908) at /home/pedro/foo.c:10
    10        return 0;
    (gdb) info inferiors
      Num  Description       Executable
    * 2    process 24977     a.out
      1    process 24970     a.out

Now, disconnect, and reconnect, to emulate the OP's use case.

    (gdb) disconnect
    Ending remote debugging.
    (gdb) info inferiors
      Num  Description       Executable
    * 2    <null>            a.out
      1    <null>            a.out
    (gdb) tar remote :9999
    Remote debugging using :9999
    Reading symbols from /lib64/libdl.so.2...Reading symbols from /usr/lib/debug (...)
    main (argc=1, argv=0x7fffffffd908) at /home/pedro/foo.c:10
    10        return 0;
    (gdb) info inferiors
      Num  Description       Executable
    * 2    process 24977     a.out
      1    <null>            a.out
    (gdb) info threads
      Id   Target Id         Frame
    * 2    Thread 24977 main (argc=1, argv=0x7fffffffd908) at /home/pedro/foo.c:10
      1    Thread 24970 /home/pedro/gdb/mygit/src/gdb/thread.c:1182: internal-error: switch_to_thread: Assertion `inf != NULL' failed.
    A problem internal to GDB has been detected,
    further debugging may prove unreliable.
    Quit this debugging session? (y or n)
Thanks for looking into this.

Quoting you: "Now, disconnect, and reconnect, to emulate the OPs use case."

I guess this is the way you "emulate" the fact that the processes were created before the GDB connection. Or is there something wrong in the way the stub talks to GDB?
> Quoting you: "Now, disconnect, and reconnect, to emulate the OPs use case." I
> guess this is the way you "emulate" the fact that processes were created before
> the GDB connection,

Correct.
Still happening upstream.

Version: GNU gdb (GDB) Fedora 7.11-66.fc24

inferior_list is simply not being populated. Any hints on where to start looking?
It works the first time because thread events are used. On reconnect, though, GDB has to use qfThreadInfo. When parsing the thread info replies it uses remote_add_inferior, which opts to call inferior_appeared and change the current inferior's pid rather than adding a new inferior, causing the bug.

Notice how, if I switch to inferior 1 before reconnecting, after the reconnect it has inferior 2's pid:

    (gdb) target extended-remote :9999
    Remote debugging using :9999
    Reading /usr/bin/true from remote target...
    warning: File transfers from remote targets can be slow. Use "set sysroot" to access files locally instead.
    Reading /usr/bin/true from remote target...
    Reading symbols from target:/usr/bin/true...Reading /usr/bin/true.debug from remote target...
    Reading /usr/bin/.debug/true.debug from remote target...
    Missing separate debuginfo for target:/usr/bin/true
    Try: dnf --enablerepo='*debug*' install /usr/lib/debug/.build-id/8f/4432d7828c91a0b6650a74e10a4678c8e5fc48.debug
    Reading symbols from target:/usr/bin/true...(no debugging symbols found)...done.
    (no debugging symbols found)...done.
    Reading /lib64/ld-linux-x86-64.so.2 from remote target...
    Reading /lib64/ld-linux-x86-64.so.2 from remote target...
    Reading symbols from target:/lib64/ld-linux-x86-64.so.2...Reading symbols from /usr/lib/debug/usr/lib64/ld-2.23.so.debug...done.
    done.
    0x00007ffff7dd9c80 in _start () from target:/lib64/ld-linux-x86-64.so.2
    (gdb) add-inferior
    Added inferior 2
    (gdb) inferior 2
    [Switching to inferior 2 [<null>] (<noexec>)]
    (gdb) file /bin/true
    Reading symbols from /bin/true...Reading symbols from /bin/true...(no debugging symbols found)...done.
    (no debugging symbols found)...done.
    Missing separate debuginfos, use: dnf debuginfo-install coreutils-8.25-5.fc24.x86_64
    (gdb) b main
    Breakpoint 1 at 0x14d0 (2 locations)
    (gdb) r
    Starting program: /usr/bin/true
    Reading /lib64/ld-linux-x86-64.so.2 from remote target...
    Reading /lib64/ld-linux-x86-64.so.2 from remote target...
    Reading /lib64/libc.so.6 from remote target...

    Thread 2.1 "true" hit Breakpoint 1, 0x00005555555554d0 in main ()
    (gdb) info inferiors
      Num  Description       Executable
      1    process 6433      target:/usr/bin/true
    * 2    process 6502      /usr/bin/true
    (gdb) inferior 1
    [Switching to inferior 1 [process 6433] (target:/usr/bin/true)]
    [Switching to thread 1.1 (Thread 6433.6433)]
    #0  0x00007ffff7dd9c80 in _start () from target:/lib64/ld-linux-x86-64.so.2
    (gdb) disconnect
    Ending remote debugging.
    (gdb) target extended-remote :9999
    `target:/usr/bin/true' has disappeared; keeping its symbols.
    Remote debugging using :9999
    Reading /lib64/libc.so.6 from remote target...
    Reading /lib64/ld-linux-x86-64.so.2 from remote target...
    Reading symbols from target:/lib64/libc.so.6...Reading symbols from /usr/lib/debug/usr/lib64/libc-2.23.so.debug...done.
    done.
    Reading symbols from target:/lib64/ld-linux-x86-64.so.2...Reading symbols from /usr/lib/debug/usr/lib64/ld-2.23.so.debug...done.
    done.
    Reading /lib64/ld-linux-x86-64.so.2 from remote target...
    0x00005555555554d0 in main ()
    (gdb) info inferiors
      Num  Description       Executable
    * 1    process 6502      /usr/bin/true
      2    <null>            /usr/bin/true
    (gdb) info threads
      Id   Target Id         Frame
      1.1  Thread 6433.6433 "true" ../../gdb/thread.c:1447: internal-error: switch_to_thread: Assertion `inf != NULL' failed.
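To make the mechanism easier to follow, here is a small toy model in Python. It is only a sketch of the behaviour described in this comment, not GDB's actual code: the `ToyGdb` class, its method names, and the example reply string are all hypothetical. The only parts taken from the remote protocol are the `m` reply prefix for qfThreadInfo and the multiprocess thread-id syntax `pPID.TID` with hexadecimal fields. The order in which the two processes are reported is an assumption.

```python
def parse_qfthreadinfo(reply):
    """Parse an 'm...' qfThreadInfo reply that uses multiprocess
    thread ids of the form pPID.TID (both fields in hex)."""
    if not reply.startswith("m"):
        raise ValueError("expected an 'm' reply")
    threads = []
    for item in reply[1:].split(","):
        pid_hex, tid_hex = item[1:].split(".")  # drop the leading 'p'
        threads.append((int(pid_hex, 16), int(tid_hex, 16)))
    return threads


class ToyGdb:
    """Hypothetical model: inferior number -> pid (0 means no process)."""

    def __init__(self):
        self.inferiors = {1: 0}
        self.current = 1
        self.threads = []

    def find_inferior_by_pid(self, pid):
        for num, inf_pid in self.inferiors.items():
            if inf_pid == pid:
                return num
        return None

    def notice_thread(self, pid, tid):
        if self.find_inferior_by_pid(pid) is None:
            # Behaviour described above: rebind the current inferior
            # to the new pid instead of adding a new inferior.
            self.inferiors[self.current] = pid
        self.threads.append((pid, tid))


gdb = ToyGdb()
# 0x1921 == 6433 and 0x1966 == 6502, the pids from the log above.
for pid, tid in parse_qfthreadinfo("mp1921.1921,p1966.1966"):
    gdb.notice_thread(pid, tid)

# Threads whose process no longer maps to any inferior.
orphans = [t for t in gdb.threads if gdb.find_inferior_by_pid(t[0]) is None]
print(gdb.inferiors)  # {1: 6502}
print(orphans)        # [(6433, 6433)]
```

In this model the second process overwrites the pid of inferior 1, so the thread of process 6433 is left without an inferior. That is the situation in which a lookup by pid returns nothing and the `inf != NULL` assertion would fire.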
It also doesn't clear the inferior list on disconnect?

Connecting with a new gdb instance shows the same, but without the <null> inferior 2:

    Remote debugging using :9999
    Reading /lib64/libc.so.6 from remote target...
    warning: File transfers from remote targets can be slow. Use "set sysroot" to access files locally instead.
    Reading /lib64/ld-linux-x86-64.so.2 from remote target...
    Reading symbols from target:/lib64/libc.so.6...Reading symbols from /usr/lib/debug/usr/lib64/libc-2.23.so.debug...done.
    done.
    Reading symbols from target:/lib64/ld-linux-x86-64.so.2...Reading symbols from /usr/lib/debug/usr/lib64/ld-2.23.so.debug...done.
    done.
    Reading /lib64/ld-linux-x86-64.so.2 from remote target...
    0x00005555555554d0 in main ()
    Missing separate debuginfos, use: dnf debuginfo-install coreutils-8.25-5.fc24.x86_64
    (gdb) info inferiors
      Num  Description       Executable
    * 1    process 6502      /usr/bin/true
    (gdb) info threads
      Id   Target Id         Frame
      1    Thread 6433.6433 "true" ../../gdb/thread.c:1447: internal-error: switch_to_thread: Assertion `inf != NULL' failed.
Created attachment 9213 [details] Workaround patch Don't know if this is a correct fix but it's sufficient for me to continue working on kdgdb
Created attachment 9217 [details] Possible fix This one might be correct
Someone else tripped on this on IRC too. Repeating and expanding my thoughts here.

There's a desire to merge "target remote" and "target extended-remote", and it's fine to use multiprocess extensions even if you're only remote-debugging a single process, so "multiprocess extensions" is probably not the right predicate.

Traditionally, you first load the binary in gdb, and then connect. gdb finds a process already running on the target side, and gdb assumes that that process is running the program you had loaded in gdb. In that case, you want the remote process to be bound to the pre-existing inferior 1. That's what that code is assuming, I think.

A fix should probably be based on:

#1 - if there's no process yet, reuse inferior 1.
#2 - add new inferiors for any other new process.

#1 could probably also check whether the program the remote process is running is the same program that is loaded in inferior 1, e.g., by checking build ids. There's always tension between gdb trying to be smart and becoming so smart that it ends up getting in the way, but in this case it's probably desirable. Though TBC, I wouldn't do it or require it in the scope of this bug.

(A full fix for this will need a testsuite addition. Probably a testcase based on comment #9.)
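The two rules above can be sketched with a toy Python model. This is one possible reading of the proposal, not a patch and not GDB's code: the function `bind_process` and the dictionary representation (inferior number mapped to pid, with 0 meaning "no process") are hypothetical, and "no process yet" is interpreted here as "no inferior currently has a process bound". The optional build-id check is left out.

```python
def bind_process(inferiors, pid):
    """Return the inferior number that `pid` ends up bound to.

    `inferiors` maps inferior number -> pid, where 0 means that no
    process is bound. Inferior 1 is assumed to always exist.
    """
    # A process that is already known keeps its inferior.
    for num, inf_pid in inferiors.items():
        if inf_pid == pid:
            return num

    # Rule #1: if there is no process yet, reuse inferior 1.
    if all(inf_pid == 0 for inf_pid in inferiors.values()):
        inferiors[1] = pid
        return 1

    # Rule #2: any other new process gets a new inferior.
    num = max(inferiors) + 1
    inferiors[num] = pid
    return num


# State after "disconnect" in the session above: two empty inferiors.
inferiors = {1: 0, 2: 0}
first = bind_process(inferiors, 6433)
second = bind_process(inferiors, 6502)
again = bind_process(inferiors, 6433)
print(first, second, again)  # 1 3 1
print(inferiors)             # {1: 6433, 2: 0, 3: 6502}
```

Under this reading, an empty inferior 2 left over from an earlier add-inferior is not reused, so the second process lands in a new inferior 3. Whether leftover empty inferiors should be reused instead is a design choice that the comment above does not settle.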
I don't know if it makes sense to allow add-inferior and then connect to a remote. Surely the remote dictates the inferior list?

Actually, I found the second version of the patch to be rather unstable, as some other part of gdb is making assumptions about the current inferior.
Just a quick note: Andrew's patch seems to fix crashes I've had with my own GDB stub, too. The stub is used in a similar environment within an emulator project, in which many guest processes run on top of a high-level emulated OS.

I haven't tested it extensively enough to judge the stability of those patches, but currently it seems to be working fine without any further issues.