Simon noticed that gdb.threads/threads-after-exec.exp was racy. You can consistently reproduce it (at git hash 319b460545dc79280e2904dcc280057cf71fb753), with:

  $ taskset -c 0 make check TESTS="gdb.threads/threads-after-exec.exp"

This is yet another case of zombie leader detection making things a bit fuzzy.

In the passing case, we have:

  continue
  Continuing.
  [New Thread 0x7ffff7bff640 (LWP 603183)]
  [Thread 0x7ffff7bff640 (LWP 603183) exited]
  process 603180 is executing new program: .../gdb.threads/threads-after-exec/threads-after-exec

While in the failing case, we have (note remarks on the rhs):

  continue
  Continuing.
  [New Thread 0x7ffff7bff640 (LWP 600205)]
  [Thread 0x7ffff7f95740 (LWP 600202) exited]   <<< gdb deletes leader thread, thread 1.
  [New LWP 600202]                              <<< gdb adds it back -- this is now thread 3.
  [Thread 0x7ffff7bff640 (LWP 600205) exited]
  process 600202 is executing new program: .../threads-after-exec/threads-after-exec
  [Switching to process 600202]

  Thread 3 "threads-after-e" hit Catchpoint 2 (exec'd .../gdb.threads/threads-after-exec/threads-after-exec), 0x00007ffff7fe3290 in _start () from /lib64/ld-linux-x86-64.so.2

The testcase only has two threads, yet GDB presented the exec for thread 3. This is GDB deleting the leader (the backend detected it was zombie, due to the exec), and then adding it back when it saw the exec event.

The testcase expects the thread that remains after the exec to be thread 1.

I'm not sure there's anything we can easily do on the gdb side. Recreating the leader thread is one option, but I'm not fully sure of the consequences; e.g., the previous thread 1 will probably still exist in the thread list as THREAD_EXITED, if it was the selected thread.

Maybe we can make use of PTRACE_O_TRACEEXIT / PTRACE_EVENT_EXIT, and model a "zombie" state in the core, so if the leader exits, we keep listing it, but GDB wouldn't try to stop that thread or read its registers. After an exec, the zombie thread would go back to being a normal thread. The next question would be how to model this in the remote protocol.
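To make the numbering issue concrete, here is a toy model in Python. It is only an illustration and not GDB code: the ThreadList class, its methods, and the keep_zombie_leader switch are all hypothetical names invented for this sketch. It assumes only what the logs above show, namely that thread numbers are handed out once and not reused, and it compares the current delete-and-re-add behaviour with the "zombie state" idea.

```python
from dataclasses import dataclass, field
from enum import Enum, auto


class State(Enum):
    LIVE = auto()
    ZOMBIE = auto()


@dataclass
class ThreadList:
    """Toy thread list (hypothetical); numbers are never reused."""

    keep_zombie_leader: bool
    next_num: int = 1
    by_lwp: dict = field(default_factory=dict)  # lwp -> (number, State)

    def add(self, lwp):
        self.by_lwp[lwp] = (self.next_num, State.LIVE)
        self.next_num += 1

    def leader_seen_zombie(self, lwp):
        if self.keep_zombie_leader:
            # Proposed behaviour: keep listing the leader, marked zombie.
            num, _ = self.by_lwp[lwp]
            self.by_lwp[lwp] = (num, State.ZOMBIE)
        else:
            # Behaviour seen in the failing log: the leader is deleted.
            del self.by_lwp[lwp]

    def exec_event(self, lwp):
        # After the exec, only the leader LWP remains.
        if lwp in self.by_lwp:
            num, _ = self.by_lwp[lwp]
            self.by_lwp = {lwp: (num, State.LIVE)}
        else:
            self.by_lwp = {}
            self.add(lwp)  # re-added, so it gets a fresh number
        return self.by_lwp[lwp][0]


def exec_thread_number(keep_zombie_leader):
    threads = ThreadList(keep_zombie_leader)
    threads.add(600202)  # leader, thread 1
    threads.add(600205)  # second thread, thread 2
    threads.leader_seen_zombie(600202)
    return threads.exec_event(600202)
```

With keep_zombie_leader=False the exec is reported for thread 3, as in the failing log; with keep_zombie_leader=True the leader keeps number 1 across the exec. The model says nothing about how the backend would stop or resume a zombie thread, nor about the remote protocol question.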
The master branch has been updated by Pedro Alves <palves@sourceware.org>:

https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;h=d2eca84d73a66cf93acbf14522efc835e4446f57

commit d2eca84d73a66cf93acbf14522efc835e4446f57
Author: Pedro Alves <pedro@palves.net>
Date:   Tue Nov 14 11:47:15 2023 +0000

    Fix gdb.threads/threads-after-exec.exp race

    Simon noticed that gdb.threads/threads-after-exec.exp was racy. You
    can consistently reproduce it (at git hash
    319b460545dc79280e2904dcc280057cf71fb753), with:

      $ taskset -c 0 make check TESTS="gdb.threads/threads-after-exec.exp"

    gdb.log shows:

      (...)
      Thread 3 "threads-after-e" hit Catchpoint 2 (exec'd .../gdb.threads/threads-after-exec/threads-after-exec), 0x00007ffff7fe3290 in _start () from /lib64/ld-linux-x86-64.so.2
      (gdb) PASS: gdb.threads/threads-after-exec.exp: continue until exec
      info threads
        Id   Target Id                          Frame
      * 3    process 1443269 "threads-after-e" 0x00007ffff7fe3290 in _start () from /lib64/ld-linux-x86-64.so.2
      (gdb) FAIL: gdb.threads/threads-after-exec.exp: info threads
      (...)
      maint info linux-lwps
      LWP Ptid          Thread ID
      1443269.1443269.0 1.3
      (gdb) FAIL: gdb.threads/threads-after-exec.exp: maint info linux-lwps

    The FAILs happen because the .exp file expects that after the exec,
    the only thread has GDB thread number 1, but it is 3 instead.

    This is yet another case of zombie leader detection making things a
    bit fuzzy.

    In the passing case, we have:

      continue
      Continuing.
      [New Thread 0x7ffff7bff640 (LWP 603183)]
      [Thread 0x7ffff7bff640 (LWP 603183) exited]
      process 603180 is executing new program: .../gdb.threads/threads-after-exec/threads-after-exec

    While in the failing case, we have (note remarks on the rhs):

      continue
      Continuing.
      [New Thread 0x7ffff7bff640 (LWP 600205)]
      [Thread 0x7ffff7f95740 (LWP 600202) exited]   <<< gdb deletes leader thread, thread 1.
      [New LWP 600202]                              <<< gdb adds it back -- this is now thread 3.
      [Thread 0x7ffff7bff640 (LWP 600205) exited]
      process 600202 is executing new program: .../threads-after-exec/threads-after-exec

    The testcase only has two threads, yet GDB presented the exec for
    thread 3. This is GDB deleting the leader (the backend detected it
    was zombie, due to the exec), and then adding the leader back when it
    saw the exec event.

    I've recorded some thoughts about this in PR gdb/31069.

    For now, this commit just makes the testcase cope with the non-one
    thread number, as the number is not important for what this test is
    exercising.

    Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=31069
    Change-Id: Id80b5c73f09c9e0005efeb494cca5d066ac3bbae
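As an illustration of what "cope with the non-one thread number" can look like, here is a small Python sketch that checks the "info threads" output for exactly one remaining thread while accepting any thread number. The real test is a DejaGnu .exp file and its actual patterns are not reproduced here; the sole_thread helper and the ROW regex are hypothetical, and the sketch assumes the post-exec row uses the "process PID" target id form shown in the gdb.log excerpt.

```python
import re

# One table row: optional "*" marker, thread number, then "process PID".
# The header line ("Id   Target Id ...") has no number, so it never matches.
ROW = re.compile(r"^[* ]\s*(\d+)\s+process\s+(\d+)\b", re.MULTILINE)


def sole_thread(info_threads_output):
    """Return (thread_number, pid) if exactly one thread is listed, else None."""
    rows = ROW.findall(info_threads_output)
    if len(rows) != 1:
        return None
    num, pid = rows[0]
    return int(num), int(pid)


sample = (
    "  Id   Target Id                          Frame\n"
    '* 3    process 1443269 "threads-after-e" 0x00007ffff7fe3290 in _start ()'
    " from /lib64/ld-linux-x86-64.so.2\n"
)
```

Calling sole_thread(sample) accepts the row whether the remaining thread is numbered 1 or 3, which is the property the test cares about: one thread is left after the exec, whatever its number.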