Bug 31259 - [gdb] ThreadSanitizer: heap-use-after-free linux-nat.c:2809 in select_event_lwp
Status: RESOLVED FIXED
Alias: None
Product: gdb
Classification: Unclassified
Component: gdb
Version: 14.1
Importance: P2 normal
Target Milestone: 15.1
Assignee: Not yet assigned to anyone
Reported: 2024-01-18 09:12 UTC by Tom de Vries
Modified: 2024-02-27 08:20 UTC
Description Tom de Vries 2024-01-18 09:12:03 UTC
When building gdb with -O0 -fsanitize=thread and running test-case gdb.base/vfork-follow-parent.exp, I get:
...
(gdb) PASS: gdb.base/vfork-follow-parent.exp: exec_file=vfork-follow-parent-exec: target-non-stop=off: non-stop=off: resolution_method=schedule-multiple: set schedule-multiple on
continue
Continuing.
[New inferior 2 (process 600810)]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
process 600810 is executing new program: /home/vries/gdb/build/gdb/testsuite/outputs/gdb.base/vfork-follow-parent/vforked-prog
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
[Thread 0xfffff7fb4020 (LWP 600810) exited]
==================
WARNING: ThreadSanitizer: heap-use-after-free (pid=600786)
  Write of size 4 at 0xffffeea1acfc by main thread:
    #0 select_event_lwp /home/vries/gdb/src/gdb/linux-nat.c:2809 (gdb+0xb07b14) (BuildId: 6dc308d9bc2da51d7adf979315fabd66fb46e8a3)
    #1 linux_nat_wait_1 /home/vries/gdb/src/gdb/linux-nat.c:3389 (gdb+0xb09928) (BuildId: 6dc308d9bc2da51d7adf979315fabd66fb46e8a3)
    #2 linux_nat_target::wait(ptid_t, target_waitstatus*, enum_flags<target_wait_flag>) /home/vries/gdb/src/gdb/linux-nat.c:3560 (gdb+0xb0a480) (BuildId: 6dc308d9bc2da51d7adf979315fabd66fb46e8a3)
    #3 thread_db_target::wait(ptid_t, target_waitstatus*, enum_flags<target_wait_flag>) /home/vries/gdb/src/gdb/linux-thread-db.c:1402 (gdb+0xb32e10) (BuildId: 6dc308d9bc2da51d7adf979315fabd66fb46e8a3)
    #4 target_wait(ptid_t, target_waitstatus*, enum_flags<target_wait_flag>) /home/vries/gdb/src/gdb/target.c:2571 (gdb+0xfb3d38) (BuildId: 6dc308d9bc2da51d7adf979315fabd66fb46e8a3)
    #5 do_target_wait_1 /home/vries/gdb/src/gdb/infrun.c:4120 (gdb+0xa99430) (BuildId: 6dc308d9bc2da51d7adf979315fabd66fb46e8a3)
    #6 operator() /home/vries/gdb/src/gdb/infrun.c:4179 (gdb+0xa995dc) (BuildId: 6dc308d9bc2da51d7adf979315fabd66fb46e8a3)
    #7 do_target_wait /home/vries/gdb/src/gdb/infrun.c:4198 (gdb+0xa99928) (BuildId: 6dc308d9bc2da51d7adf979315fabd66fb46e8a3)
    #8 fetch_inferior_event() /home/vries/gdb/src/gdb/infrun.c:4629 (gdb+0xa9acc4) (BuildId: 6dc308d9bc2da51d7adf979315fabd66fb46e8a3)
    #9 inferior_event_handler(inferior_event_type) /home/vries/gdb/src/gdb/inf-loop.c:42 (gdb+0xa6a734) (BuildId: 6dc308d9bc2da51d7adf979315fabd66fb46e8a3)
    #10 handle_target_event /home/vries/gdb/src/gdb/linux-nat.c:4357 (gdb+0xb0cb4c) (BuildId: 6dc308d9bc2da51d7adf979315fabd66fb46e8a3)
    #11 handle_file_event /home/vries/gdb/src/gdbsupport/event-loop.cc:573 (gdb+0x1cf5678) (BuildId: 6dc308d9bc2da51d7adf979315fabd66fb46e8a3)
    #12 gdb_wait_for_event /home/vries/gdb/src/gdbsupport/event-loop.cc:694 (gdb+0x1cf5d3c) (BuildId: 6dc308d9bc2da51d7adf979315fabd66fb46e8a3)
    #13 gdb_do_one_event(int) /home/vries/gdb/src/gdbsupport/event-loop.cc:217 (gdb+0x1cf3ee8) (BuildId: 6dc308d9bc2da51d7adf979315fabd66fb46e8a3)
    #14 start_event_loop /home/vries/gdb/src/gdb/main.c:408 (gdb+0xb79354) (BuildId: 6dc308d9bc2da51d7adf979315fabd66fb46e8a3)
    #15 captured_command_loop /home/vries/gdb/src/gdb/main.c:472 (gdb+0xb79584) (BuildId: 6dc308d9bc2da51d7adf979315fabd66fb46e8a3)
    #16 captured_main /home/vries/gdb/src/gdb/main.c:1342 (gdb+0xb7b99c) (BuildId: 6dc308d9bc2da51d7adf979315fabd66fb46e8a3)
    #17 gdb_main(captured_main_args*) /home/vries/gdb/src/gdb/main.c:1361 (gdb+0xb7ba4c) (BuildId: 6dc308d9bc2da51d7adf979315fabd66fb46e8a3)
    #18 main /home/vries/gdb/src/gdb/gdb.c:39 (gdb+0x423ce8) (BuildId: 6dc308d9bc2da51d7adf979315fabd66fb46e8a3)

  Previous write of size 8 at 0xffffeea1acf8 by main thread:
    #0 operator delete(void*, unsigned long) <null> (libtsan.so.2+0x8fb14) (BuildId: fe872cc4563474b7ad67d63a019aa94e1e0df888)
    #1 delete_lwp /home/vries/gdb/src/gdb/linux-nat.c:849 (gdb+0xb00d04) (BuildId: 6dc308d9bc2da51d7adf979315fabd66fb46e8a3)
    #2 exit_lwp /home/vries/gdb/src/gdb/linux-nat.c:924 (gdb+0xb01104) (BuildId: 6dc308d9bc2da51d7adf979315fabd66fb46e8a3)
    #3 wait_lwp /home/vries/gdb/src/gdb/linux-nat.c:2224 (gdb+0xb058bc) (BuildId: 6dc308d9bc2da51d7adf979315fabd66fb46e8a3)
    #4 stop_wait_callback /home/vries/gdb/src/gdb/linux-nat.c:2458 (gdb+0xb06760) (BuildId: 6dc308d9bc2da51d7adf979315fabd66fb46e8a3)
    #5 gdb::function_view<int (lwp_info*)>::bind<int, lwp_info*>(int (*)(lwp_info*))::{lambda(gdb::fv_detail::erased_callable, lwp_info*)#1}::operator()(gdb::fv_detail::erased_callable, lwp_info*) const /home/vries/gdb/src/gdb/../gdbsupport/function-view.h:326 (gdb+0xb12f68) (BuildId: 6dc308d9bc2da51d7adf979315fabd66fb46e8a3)
    #6 gdb::function_view<int (lwp_info*)>::bind<int, lwp_info*>(int (*)(lwp_info*))::{lambda(gdb::fv_detail::erased_callable, lwp_info*)#1}::_FUN(gdb::fv_detail::erased_callable, lwp_info*) /home/vries/gdb/src/gdb/../gdbsupport/function-view.h:320 (gdb+0xb12fd0) (BuildId: 6dc308d9bc2da51d7adf979315fabd66fb46e8a3)
    #7 gdb::function_view<int (lwp_info*)>::operator()(lwp_info*) const /home/vries/gdb/src/gdb/../gdbsupport/function-view.h:289 (gdb+0xb11348) (BuildId: 6dc308d9bc2da51d7adf979315fabd66fb46e8a3)
    #8 iterate_over_lwps(ptid_t, gdb::function_view<int (lwp_info*)>) /home/vries/gdb/src/gdb/linux-nat.c:879 (gdb+0xb00ed0) (BuildId: 6dc308d9bc2da51d7adf979315fabd66fb46e8a3)
    #9 linux_nat_wait_1 /home/vries/gdb/src/gdb/linux-nat.c:3382 (gdb+0xb098b0) (BuildId: 6dc308d9bc2da51d7adf979315fabd66fb46e8a3)
    #10 linux_nat_target::wait(ptid_t, target_waitstatus*, enum_flags<target_wait_flag>) /home/vries/gdb/src/gdb/linux-nat.c:3560 (gdb+0xb0a480) (BuildId: 6dc308d9bc2da51d7adf979315fabd66fb46e8a3)
    #11 thread_db_target::wait(ptid_t, target_waitstatus*, enum_flags<target_wait_flag>) /home/vries/gdb/src/gdb/linux-thread-db.c:1402 (gdb+0xb32e10) (BuildId: 6dc308d9bc2da51d7adf979315fabd66fb46e8a3)
    #12 target_wait(ptid_t, target_waitstatus*, enum_flags<target_wait_flag>) /home/vries/gdb/src/gdb/target.c:2571 (gdb+0xfb3d38) (BuildId: 6dc308d9bc2da51d7adf979315fabd66fb46e8a3)
    #13 do_target_wait_1 /home/vries/gdb/src/gdb/infrun.c:4120 (gdb+0xa99430) (BuildId: 6dc308d9bc2da51d7adf979315fabd66fb46e8a3)
    #14 operator() /home/vries/gdb/src/gdb/infrun.c:4179 (gdb+0xa995dc) (BuildId: 6dc308d9bc2da51d7adf979315fabd66fb46e8a3)
    #15 do_target_wait /home/vries/gdb/src/gdb/infrun.c:4198 (gdb+0xa99928) (BuildId: 6dc308d9bc2da51d7adf979315fabd66fb46e8a3)
    #16 fetch_inferior_event() /home/vries/gdb/src/gdb/infrun.c:4629 (gdb+0xa9acc4) (BuildId: 6dc308d9bc2da51d7adf979315fabd66fb46e8a3)
    #17 inferior_event_handler(inferior_event_type) /home/vries/gdb/src/gdb/inf-loop.c:42 (gdb+0xa6a734) (BuildId: 6dc308d9bc2da51d7adf979315fabd66fb46e8a3)
    #18 handle_target_event /home/vries/gdb/src/gdb/linux-nat.c:4357 (gdb+0xb0cb4c) (BuildId: 6dc308d9bc2da51d7adf979315fabd66fb46e8a3)
    #19 handle_file_event /home/vries/gdb/src/gdbsupport/event-loop.cc:573 (gdb+0x1cf5678) (BuildId: 6dc308d9bc2da51d7adf979315fabd66fb46e8a3)
    #20 gdb_wait_for_event /home/vries/gdb/src/gdbsupport/event-loop.cc:694 (gdb+0x1cf5d3c) (BuildId: 6dc308d9bc2da51d7adf979315fabd66fb46e8a3)
    #21 gdb_do_one_event(int) /home/vries/gdb/src/gdbsupport/event-loop.cc:217 (gdb+0x1cf3ee8) (BuildId: 6dc308d9bc2da51d7adf979315fabd66fb46e8a3)
    #22 start_event_loop /home/vries/gdb/src/gdb/main.c:408 (gdb+0xb79354) (BuildId: 6dc308d9bc2da51d7adf979315fabd66fb46e8a3)
    #23 captured_command_loop /home/vries/gdb/src/gdb/main.c:472 (gdb+0xb79584) (BuildId: 6dc308d9bc2da51d7adf979315fabd66fb46e8a3)
    #24 captured_main /home/vries/gdb/src/gdb/main.c:1342 (gdb+0xb7b99c) (BuildId: 6dc308d9bc2da51d7adf979315fabd66fb46e8a3)
    #25 gdb_main(captured_main_args*) /home/vries/gdb/src/gdb/main.c:1361 (gdb+0xb7ba4c) (BuildId: 6dc308d9bc2da51d7adf979315fabd66fb46e8a3)
    #26 main /home/vries/gdb/src/gdb/gdb.c:39 (gdb+0x423ce8) (BuildId: 6dc308d9bc2da51d7adf979315fabd66fb46e8a3)

SUMMARY: ThreadSanitizer: heap-use-after-free /home/vries/gdb/src/gdb/linux-nat.c:2809 in select_event_lwp
==================
FAIL: gdb.base/vfork-follow-parent.exp: exec_file=vfork-follow-parent-exec: target-non-stop=off: non-stop=off: resolution_method=schedule-multiple: continue to end of inferior 2 (timeout)
...
Comment 2 Sourceware Commits 2024-02-26 15:28:53 UTC
The master branch has been updated by Pedro Alves <palves@sourceware.org>:

https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;h=3d2d21728b6db4430ff168ee27e12fc6e2627fad

commit 3d2d21728b6db4430ff168ee27e12fc6e2627fad
Author: Pedro Alves <pedro@palves.net>
Date:   Wed Feb 21 16:23:55 2024 +0000

    [gdb] Fix heap-use-after-free in select_event_lwp
    
    PR gdb/31259 reveals one scenario where we run into a
    heap-use-after-free reported by thread sanitizer, while running
    gdb.base/vfork-follow-parent.exp.
    
    The heap-use-after-free happens during the following scenario:
    
     - linux_nat_wait_1 is about to return an event for T2.  It stops all
       other threads, and while doing so, stop_wait_callback -> wait_lwp
       sees T1 exit, and decides to leave the exit event pending.  It
       should have set the lp->stopped flag too, but does not -- this is
       the bug.
    
     - The event for T2 is reported, is processed by infrun, and we're
       back at linux_nat_wait_1.
    
     - linux_nat_wait_1 selects LWP T1 with the pending exit status to
       report.
    
     - it sets variable lp to point to the corresponding lwp_info.
    
     - it calls stop_callback and stop_wait_callback for all threads
       (because !target_is_non_stop_p ()).
    
     - it calls select_event_lwp to maybe pick another thread than T1, to
       prevent starvation.
    
    The problem is the following:
    
     - while calling stop_wait_callback for all threads, it also does this
       for T1.  While doing so, the corresponding lwp_info is deleted
       (callstack stop_wait_callback -> wait_lwp -> exit_lwp ->
       delete_lwp), leaving variable lp as a dangling pointer.
    
     - variable lp is passed to select_event_lwp, which dereferences it,
       which causes the heap-use-after-free.
    
    Note that the comment here mentions "all other LWP's":
    ...
          /* Now stop all other LWP's ...  */
          iterate_over_lwps (minus_one_ptid, stop_callback);
          /* ... and wait until all of them have reported back that
            they're no longer running.  */
          iterate_over_lwps (minus_one_ptid, stop_wait_callback);
    ...
    
    The reason the comment says "all other LWP's", and doesn't bother
    filtering out LP is that lp->stopped should be true at this point, and
    the callbacks (both stop_callback and stop_wait_callback) check that
    flag, and do nothing if set.  I.e., they skip already-stopped threads,
    so they should skip LP.
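
The skip-if-stopped behaviour described above can be illustrated with a small, self-contained model. This is only a sketch and not gdb code: lwp_model, stop_wait_deletes and event_lwp_survives are hypothetical names, and deleting a record is modelled as removing it from a std::list rather than as the real stop_wait_callback -> wait_lwp -> exit_lwp -> delete_lwp chain.

```cpp
#include <cassert>
#include <list>

// Hypothetical, heavily simplified stand-in for gdb's lwp_info.
struct lwp_model
{
  int id;
  bool stopped;       // LWP is known not to be running
  bool exit_pending;  // an exit event was seen but not yet reported
};

// Model of the stop-and-wait pass for one LWP.  Returns true if the pass
// would end up deleting the record.  Already-stopped LWPs are skipped.
bool
stop_wait_deletes (const lwp_model &lp)
{
  if (lp.stopped)
    return false;          // skipped: the record is left alone
  return lp.exit_pending;  // waiting observes the exit and deletes the record
}

// Run the pass over all LWPs and report whether the record of the LWP
// selected for reporting (EVENT_ID) still exists afterwards.  If it does
// not, a pointer to it held by the caller would be dangling.
bool
event_lwp_survives (std::list<lwp_model> lwps, int event_id)
{
  assert (!lwps.empty ());
  lwps.remove_if (stop_wait_deletes);
  for (const lwp_model &lp : lwps)
    if (lp.id == event_id)
      return true;
  return false;
}
```

In this model, an event LWP with exit_pending set but stopped clear does not survive the pass, which corresponds to the dangling lp in the report; with stopped set it is skipped and survives.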
    
    In this particular scenario, though, we missed setting the stopped
    flag right in the first step described above, so LP was iterated over
    incorrectly.
    
    The fix is to make wait_lwp set the lp->stopped flag when it decides
    to leave the exit event pending.  However, going a bit further,
    gdbserver has a mark_lwp_dead function to centralize setting up
    various lwp flags such that the rest of the code doesn't mishandle
    them, and it seems like a good idea to do a similar thing in gdb as
    well.  That is what this patch does.
    
    PR gdb/31259
    Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=31259
    Co-Authored-By: Tom de Vries <tdevries@suse.de>
    Change-Id: I4a6169976f89bf714c478cbb2b7d4c32365e62a9
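
As a rough illustration of the idea in the commit message (one helper that sets every flag a dead LWP needs, so no caller can forget one), here is a minimal sketch. It is not the actual patch: lwp_model, mark_dead_model and needs_stop_and_wait are hypothetical names, and the fields are assumptions chosen to mirror only the flags mentioned in the text.

```cpp
#include <cassert>

// Hypothetical, heavily simplified stand-in for gdb's lwp_info.
struct lwp_model
{
  int id = 0;
  bool stopped = false;       // LWP is known not to be running
  bool exit_pending = false;  // an exit event is recorded but unreported
  int exit_status = 0;        // only meaningful when exit_pending is set
};

// Hypothetical helper in the spirit of a mark_lwp_dead function: record
// the pending exit and mark the LWP stopped in a single place.
void
mark_dead_model (lwp_model &lp, int exit_status)
{
  // Assumption for this sketch: no other exit is already pending.
  assert (!lp.exit_pending);
  lp.exit_pending = true;
  lp.exit_status = exit_status;
  lp.stopped = true;  // later stop/wait passes will now skip this LWP
}

// Model of the check made by the stop callbacks: only LWPs that are not
// already marked stopped are processed.
bool
needs_stop_and_wait (const lwp_model &lp)
{
  return !lp.stopped;
}
```

With such a helper, the code path that leaves an exit event pending would call it once instead of updating individual fields, and a later stop-all pass would skip the dead LWP.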
Comment 3 Tom Tromey 2024-02-27 00:00:47 UTC
I believe the patch fixed this.
Comment 4 Tom de Vries 2024-02-27 08:20:09 UTC
I reproduced this with gdb-14-branch (and assuming that the commit 9c02b52532 introduced the problem, this is a regression since 7.9.0).

Should we backport this fix for 14.2?