[PATCH v2] [gdb] Fix heap-use-after-free in select_event_lwp
Tom de Vries
tdevries@suse.de
Mon Feb 19 15:04:59 GMT 2024
On 2/9/24 16:46, Pedro Alves wrote:
> On 2024-01-23 11:48, Tom de Vries wrote:
>
>> Since heap-use-after-free is essentially an address sanitizer complaint, I
>> also tried building gdb with -O0 -fsanitize=address, but with this setup it
>> doesn't seem to trigger (0 times out of 10).
>>
>> The heap-use-after-free happens during the following scenario:
>> - linux_nat_wait_1 selects an LWP thread T1 with a status to report.
>> - it sets variable lp to point to the corresponding lwp_info.
>> - it calls stop_callback and stop_wait_callback for all threads
>> (because !target_is_non_stop_p ()).
>> - it calls select_event_lwp to maybe pick another thread than T1, to prevent
>> starvation.
>>
>> The problem seems to be the following:
>> - while calling stop_wait_callback for all threads, it also does this for T1.
>> While doing so, the corresponding lwp_info is deleted (callstack
>> stop_wait_callback -> wait_lwp -> exit_lwp -> delete_lwp), leaving variable
>> lp as a dangling pointer.
>> - variable lp is passed to select_event_lwp, which dereferences it, which causes
>> the heap-use-after-free.
>>
>> Note that the comment here mentions "all other LWP's":
>> ...
>> /* Now stop all other LWP's ... */
>> iterate_over_lwps (minus_one_ptid, stop_callback);
>> /* ... and wait until all of them have reported back that
>> they're no longer running. */
>> iterate_over_lwps (minus_one_ptid, stop_wait_callback);
>> ...
>> which presumably means other than the one in lp, but the iterators
>> don't skip lp.
>
> I think I'm missing something here.
>
> The reason the comments say "all other LWP's", and don't bother filtering out LP is that
> lp->stopped should be true at this point, and the callbacks (both stop_callback and stop_wait_callback)
> check that flag, and do nothing if set. I.e., they skip already-stopped threads, so they should
> skip LP.
>
> It sounds like we were about to report a stop for a thread that isn't marked as stopped?
> Now it looks to me that _that_ would be the bug to fix.
Hi Pedro,
thanks for the review.
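This v2 adds an assert in linux_nat_wait_1 to catch the bug you mention
(reporting a stop for an lwp that isn't marked as stopped), and fixes it
in wait_lwp by setting lp->stopped when the exit status is stored in
lp->waitstatus: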
...
diff --git a/gdb/linux-nat.c b/gdb/linux-nat.c
index e91c57ba239..5022da9abd2 100644
--- a/gdb/linux-nat.c
+++ b/gdb/linux-nat.c
@@ -2210,6 +2210,7 @@ wait_lwp (struct lwp_info *lp)
core. Store it in lp->waitstatus, because lp->status
would be ambiguous (W_EXITCODE(0,0) == 0). */
lp->waitstatus = host_status_to_waitstatus (status);
+ lp->stopped = 1;
return 0;
}
@@ -3368,6 +3369,7 @@ linux_nat_wait_1 (ptid_t ptid, struct target_waitstatus *ourstatus,
}
gdb_assert (lp);
+ gdb_assert (lp->stopped);
status = lp->status;
lp->status = 0;
...
This fixes the problem observed in the PR, and passes testing on
x86_64-linux and aarch64-linux.
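
To illustrate why marking the lwp as stopped is enough, here is a minimal
standalone sketch. It is not gdb code: toy_lwp, toy_stop_wait and
toy_event_lwp_survives are hypothetical names, and the "deletion" is modelled
with an in_use flag rather than a real free, so the sketch itself never
touches freed memory. It only assumes what is described above: the callback
skips lwps that are already stopped, and waiting on an exited lwp deletes it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, heavily simplified stand-in for gdb's lwp_info.  */
struct toy_lwp
{
  int id;
  bool stopped;  /* Plays the role of lp->stopped.  */
  bool exited;   /* The thread has exited; waiting on it reaps it.  */
  bool in_use;   /* False once the entry has been "deleted".  */
};

/* Same shape as the behaviour described for stop_wait_callback:
   do nothing for an lwp that is already marked stopped, otherwise
   wait for it, deleting the entry if the thread turns out to have
   exited (standing in for wait_lwp -> exit_lwp -> delete_lwp).  */
static void
toy_stop_wait (struct toy_lwp *lp)
{
  if (lp->stopped)
    return;

  if (lp->exited)
    lp->in_use = false;
  else
    lp->stopped = true;
}

/* Select an event lwp with a pending exit status, then stop and wait
   for all lwps without skipping the event lwp.  Return true if the
   event lwp is still valid afterwards.  */
bool
toy_event_lwp_survives (bool mark_event_lwp_stopped)
{
  struct toy_lwp lwps[2] = {
    { 1, false, true, true },   /* Event lwp, exit status pending.  */
    { 2, false, false, true },  /* Ordinary running lwp.  */
  };
  struct toy_lwp *event = &lwps[0];

  /* The step the patch adds in wait_lwp, made optional here.  */
  event->stopped = mark_event_lwp_stopped;

  for (size_t i = 0; i < 2; i++)
    if (lwps[i].in_use)
      toy_stop_wait (&lwps[i]);

  /* The other lwp is stopped in both variants.  */
  assert (lwps[1].stopped);

  return event->in_use;
}
```

In this model, toy_event_lwp_survives (true) returns true because the
iteration skips the event lwp, while toy_event_lwp_survives (false) returns
false, which corresponds to lp being left dangling before select_event_lwp.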
WDYT?
Thanks,
- Tom
>
> Pedro Alves
>