[PATCH v2] [gdb] Fix heap-use-after-free in select_event_lwp

Tom de Vries tdevries@suse.de
Mon Feb 19 15:04:59 GMT 2024


On 2/9/24 16:46, Pedro Alves wrote:
> On 2024-01-23 11:48, Tom de Vries wrote:
> 
>> Since heap-use-after-free is essentially an address sanitizer complaint, I
>> also tried building gdb with -O0 -fsanitize=address, but with this setup it
>> doesn't seem to trigger (0 times out of 10).
>>
>> The heap-use-after-free happens during the following scenario:
>> - linux_nat_wait_1 selects an LWP thread T1 with a status to report.
>> - it sets variable lp to point to the corresponding lwp_info.
>> - it calls stop_callback and stop_wait_callback for all threads
>>    (because !target_is_non_stop_p ()).
>> - it calls select_event_lwp to maybe pick another thread than T1, to prevent
>>    starvation.
>>
>> The problem seems to be the following:
>> - while calling stop_wait_callback for all threads, it also does this for T1.
>>    While doing so, the corresponding lwp_info is deleted (callstack
>>    stop_wait_callback -> wait_lwp -> exit_lwp -> delete_lwp), leaving variable
>>    lp as a dangling pointer.
>> - variable lp is passed to select_event_lwp, which dereferences it, which causes
>>    the heap-use-after-free.
>>
>> Note that the comment here mentions "all other LWP's":
>> ...
>>        /* Now stop all other LWP's ...  */
>>        iterate_over_lwps (minus_one_ptid, stop_callback);
>>        /* ... and wait until all of them have reported back that
>>          they're no longer running.  */
>>        iterate_over_lwps (minus_one_ptid, stop_wait_callback);
>> ...
>> which presumably means other than the one in lp, but the iterators
>> don't skip lp.
> 
> I think I'm missing something here.
> 
> The reason the comments say "all other LWP's", and don't bother filtering out LP is that
> lp->stopped should be true at this point, and the callbacks (both stop_callback and stop_wait_callback)
> check that flag, and do nothing if set.  I.e., they skip already-stopped threads, so they should
> skip LP.
> 
> It sounds like we were about to report a stop for a thread that isn't marked as stopped?
> Now it looks to me that _that_ would be the bug to fix.

Hi Pedro,

thanks for the review.

This patch adds an assert to catch the bug you mention, and a fix in 
wait_lwp:
...
diff --git a/gdb/linux-nat.c b/gdb/linux-nat.c
index e91c57ba239..5022da9abd2 100644
--- a/gdb/linux-nat.c
+++ b/gdb/linux-nat.c
@@ -2210,6 +2210,7 @@ wait_lwp (struct lwp_info *lp)
  		 core.  Store it in lp->waitstatus, because lp->status
  		 would be ambiguous (W_EXITCODE(0,0) == 0).  */
  	      lp->waitstatus = host_status_to_waitstatus (status);
+	      lp->stopped = 1;
  	      return 0;
  	    }

@@ -3368,6 +3369,7 @@ linux_nat_wait_1 (ptid_t ptid, struct target_waitstatus *ourstatus,
      }

    gdb_assert (lp);
+  gdb_assert (lp->stopped);

    status = lp->status;
    lp->status = 0;
...

This fixes the problem observed in the PR, and passes testing on 
x86_64-linux and aarch64-linux.
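
To illustrate the invariant that the new assert checks, here is a minimal
standalone sketch.  It is a toy model, not gdb code: toy_lwp, toy_stop_wait,
toy_stop_wait_all and toy_event_lwp_survives are hypothetical names, and the
exit handling is heavily simplified.  It only shows that a callback which
skips stopped LWPs leaves the event LWP alone when its stopped flag is set,
and may delete it when the flag is not set.

```cpp
#include <algorithm>
#include <cassert>
#include <list>
#include <memory>

// Toy stand-in for gdb's lwp_info; only the fields this sketch needs.
struct toy_lwp
{
  int id;
  bool stopped;  // Mirrors the role of lp->stopped.
  bool exited;   // Assumption: the next wait on this LWP would report an exit.
};

using toy_lwp_list = std::list<std::unique_ptr<toy_lwp>>;

// Models the skip-if-stopped behaviour of stop_wait_callback.
// Returns true if the LWP turned out to be dead and must be deleted.
static bool
toy_stop_wait (toy_lwp &lp)
{
  if (lp.stopped)
    return false;       // Already stopped: nothing to do, LWP is kept.
  if (lp.exited)
    return true;        // Waiting found an exit: caller deletes the LWP.
  lp.stopped = true;    // Otherwise the LWP is now stopped.
  return false;
}

// Models iterating the callback over all LWPs, deleting dead ones.
static void
toy_stop_wait_all (toy_lwp_list &lwps)
{
  for (auto it = lwps.begin (); it != lwps.end ();)
    {
      if (toy_stop_wait (**it))
        it = lwps.erase (it);
      else
        ++it;
    }
}

// Returns true if the event LWP (id 1) is still in the list after all
// LWPs have been waited for.  MARK_STOPPED says whether the event LWP
// was marked stopped when its exit status was recorded.
static bool
toy_event_lwp_survives (bool mark_stopped)
{
  toy_lwp_list lwps;
  lwps.push_back (std::make_unique<toy_lwp> (toy_lwp {1, mark_stopped, true}));
  lwps.push_back (std::make_unique<toy_lwp> (toy_lwp {2, false, false}));

  toy_stop_wait_all (lwps);

  return std::any_of (lwps.begin (), lwps.end (),
                      [] (const std::unique_ptr<toy_lwp> &lp)
                      { return lp->id == 1; });
}
```

In this model, toy_event_lwp_survives (true) keeps the event LWP, while
toy_event_lwp_survives (false) deletes it, which is the situation that would
leave a pointer like lp dangling.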

WDYT?

Thanks,
- Tom

> 
> Pedro Alves
> 


