[PATCH 3/4] gdb/dwarf: don't enqueue CU in maybe_queue_comp_unit if already expanded

Simon Marchi simon.marchi@polymtl.ca
Thu Jan 21 02:15:15 GMT 2021


Hi Tom,

Do you have an opinion on this?

Simon

On 2020-12-10 12:40 p.m., Simon Marchi via Gdb-patches wrote:
> On 2020-12-10 9:52 a.m., Simon Marchi via Gdb-patches wrote:
>>
>>
>> On 2020-12-09 4:24 p.m., Tom Tromey wrote:
>>> Re-reading this code, I realized again that the return value of this
>>> function does not really make sense to me.  The intro says:
>>>
>>>    The result is non-zero if PER_CU was queued, otherwise the result is zero
>>>    meaning either PER_CU is already queued or it is already loaded.
>>>
>>> ... but it seems unlikely for callees to want to detect that just this
>>> call caused the enqueueing.
>>
>> Ok, re-reading that comment, I think I understand things a bit differently
>> than I did previously:
>>
>> /* If PER_CU is not yet queued, add it to the queue.
>>    If DEPENDENT_CU is non-NULL, it has a reference to PER_CU so add a
>>    dependency.
>>    The result is non-zero if PER_CU was queued, otherwise the result is zero
>>    meaning either PER_CU is already queued or it is already loaded.
>>
>>    N.B. There is an invariant here that if a CU is queued then it is loaded.
>>    The caller is required to load PER_CU if we return non-zero.  */
>>
>> The premise is: there is the invariant that if a CU is queued for expansion,
>> its DIEs are loaded.  If maybe_queue_comp_unit enqueues a CU for expansion
>> whose DIEs are not loaded, it returns 1 to its caller to ask "please load the
>> DIEs for that CU because I just enqueued it and if you don't the invariant
>> will get violated and we'll get in trouble".  So if a CU is already expanded
>> and maybe_queue_comp_unit doesn't enqueue it, it makes sense that it returns
>> 0, because it doesn't *require* the caller to load the DIEs.
>>
>> However, that means that the caller shouldn't rely on maybe_queue_comp_unit's
>> return value to determine whether the CU's DIEs are currently loaded, because:
>>
>> 1. whether maybe_queue_comp_unit requires it to load the CU's DIEs
>> 2. whether the CU's DIEs are currently loaded
>>
>> are two different things.
>>
>> If the caller wants to know #2, because it itself needs to ensure the DIEs are
>> loaded, it should not rely on maybe_queue_comp_unit's return value, but
>> instead check itself with dwarf2_per_objfile::get_cu.
>>
>> I'll go over my patch and think about it a little more.
>>
>> Simon
>>
> 
> Hi Tom,
> 
> Following my reconsideration of what the return value of maybe_queue_comp_unit
> means and the original intent of the function, here's an updated patch.
> 
> The differences are:
> 
> - New comment on maybe_queue_comp_unit, hopefully making it clearer.
> - Update logic in maybe_queue_comp_unit: if the CU does not get enqueued because
>   it is already expanded and its DIEs are not loaded, return false.  Because in
>   this case, maybe_queue_comp_unit doesn't _need_ the caller to load the DIEs.
> - Update callers follow_die_sig_1 and follow_die_ref to not rely on
>   maybe_queue_comp_unit to tell them whether DIEs are loaded right now.
> 
> 
> From 70ed5a2468e2776e887c3481b89945347a856089 Mon Sep 17 00:00:00 2001
> From: Simon Marchi <simon.marchi@polymtl.ca>
> Date: Wed, 11 Nov 2020 21:30:22 -0500
> Subject: [PATCH] gdb/dwarf: don't enqueue CU in maybe_queue_comp_unit if
>  already expanded
> 
> The previous commit log described how items could be left lingering in
> the dwarf2_per_bfd::queue and how that could cause trouble.
> 
> This patch fixes the issue by changing maybe_queue_comp_unit so that it
> doesn't put a CU in the to-expand queue if that CU is already expanded.
> This will make it so that when dwarf2_fetch_die_type_sect_off calls
> follow_die_offset and maybe_queue_comp_unit, it won't enqueue the target
> CU, because it will see the CU is already expanded.
> 
> This assumes that if a CU is dwarf2_fetch_die_type_sect_off's target CU,
> it will have previously been expanded.  I think it is the case, but I
> can't be 100% sure.  If that's not true, the assertions added in the
> following patch will catch it, and it means we'll have to re-think a bit
> more how things work (it wouldn't be well handled at all today anyway).
> 
> This fixes something else in maybe_queue_comp_unit that looks wrong.
> Imagine the DIEs of a CU are loaded in memory, but that CU is not
> expanded.  In that case, maybe_queue_comp_unit will use this early
> return:
> 
>   /* If the compilation unit is already loaded, just mark it as
>      used.  */
>   dwarf2_cu *cu = per_objfile->get_cu (per_cu);
>   if (cu != nullptr)
>     {
>       cu->last_used = 0;
>       return 0;
>     }
> 
> ... so the CU won't be queued for expansion.  Whether the DIEs of a CU
> are loaded in memory and whether that CU is expanded are two orthogonal
> things, but that function appears to mix them.  So, move the queuing
> above that check / early return, so that if the CU's DIEs are loaded in
> memory but the CU is not expanded yet, it gets enqueued.
> 
> I tried to improve maybe_queue_comp_unit's documentation to clarify what
> the return value means.  By clarifying this, I noticed that two callers
> (follow_die_offset and follow_die_sig_1) access the CU's DIEs after
> calling maybe_queue_comp_unit, only relying on maybe_queue_comp_unit's
> return value to tell whether DIEs need to be loaded first or not.  As
> explained in the new comment, this is problematic:
> maybe_queue_comp_unit's return value doesn't tell whether DIEs are
> currently loaded, it means whether maybe_queue_comp_unit requires the
> caller to load them.  If the CU is already expanded but the DIEs to have
> been freed, maybe_queue_comp_unit returns 0, meaning "I don't need you
> to load the DIEs".  So if these two functions (follow_die_offset and
> follow_die_sig_1) need to access the DIEs in any case, for their own
> usage, they should make sure to load them if they are not loaded
> already.  I therefore added an extra check to the condition they use,
> making it so they will always load the DIEs if they aren't already.
> 
> From what I found, other callers don't care for the CU's DIEs, they call
> maybe_queue_comp_unit to ensure the CU gets expanded eventually, but
> don't care for it after that.
> 
> gdb/ChangeLog:
> 
> 	PR gdb/26828
> 	* dwarf2/read.c (maybe_queue_comp_unit): Check if CU is expanded
> 	to decide whether or not to enqueue it for expansion.
> 	(follow_die_offset, follow_die_sig_1): Ensure we load the DIEs
> 	after calling maybe_queue_comp_unit.
> 
> Change-Id: Id98c6b60669f4b4b21b9be16d0518fc62bdf686a
> ---
>  gdb/dwarf2/read.c | 70 +++++++++++++++++++++++++++++++++++------------
>  1 file changed, 53 insertions(+), 17 deletions(-)
> 
> diff --git a/gdb/dwarf2/read.c b/gdb/dwarf2/read.c
> index f199229985e9..62676cd91492 100644
> --- a/gdb/dwarf2/read.c
> +++ b/gdb/dwarf2/read.c
> @@ -9153,14 +9153,30 @@ queue_comp_unit (dwarf2_per_cu_data *per_cu,
>    per_cu->per_bfd->queue.emplace (per_cu, per_objfile, pretend_language);
>  }
>  
> -/* If PER_CU is not yet queued, add it to the queue.
> +/* If PER_CU is not yet expanded of queued for expansion, add it to the queue.
> +
>     If DEPENDENT_CU is non-NULL, it has a reference to PER_CU so add a
>     dependency.
> -   The result is non-zero if PER_CU was queued, otherwise the result is zero
> -   meaning either PER_CU is already queued or it is already loaded.
>  
> -   N.B. There is an invariant here that if a CU is queued then it is loaded.
> -   The caller is required to load PER_CU if we return non-zero.  */
> +   Return true if maybe_queue_comp_unit requires the caller to load the CU's
> +   DIEs, false otherwise.
> +
> +   Explanation: there is an invariant that if a CU is queued for expansion
> +   (present in `dwarf2_per_bfd::queue`), then its DIEs are loaded
> +   (a dwarf2_cu object exists for this CU, and `dwarf2_per_objfile::get_cu`
> +   returns non-nullptr).  If the CU gets enqueued by this function but its DIEs
> +   are not yet loaded, the the caller must load the CU's DIEs to ensure the
> +   invariant is respected.
> +
> +   The caller is therefore not required to load the CU's DIEs (we return false)
> +   if:
> +
> +     - the CU is already expanded, and therefore does not get enqueued
> +     - the CU gets enqueued for expansion, but its DIEs are already loaded
> +
> +   Note that the caller should not use this function's return value as an
> +   indicator of whether the CU's DIEs are loaded right now, it should check
> +   that by calling `dwarf2_per_objfile::get_cu` instead.  */
>  
>  static int
>  maybe_queue_comp_unit (struct dwarf2_cu *dependent_cu,
> @@ -9191,22 +9207,32 @@ maybe_queue_comp_unit (struct dwarf2_cu *dependent_cu,
>        /* Verify the invariant that if a CU is queued for expansion, its DIEs are
>  	 loaded.  */
>        gdb_assert (per_objfile->get_cu (per_cu) != nullptr);
> +
> +      /* If the CU is queued for expansion, it should not already be
> +	 expanded.  */
> +      gdb_assert (!per_objfile->symtab_set_p (per_cu));
> +
> +      /* The DIEs are already loaded, the caller doesn't need to do it.  */
>        return 0;
>      }
>  
> +  bool queued = false;
> +  if (!per_objfile->symtab_set_p (per_cu))
> +    {
> +      /* Add it to the queue.  */
> +      queue_comp_unit (per_cu, per_objfile,  pretend_language);
> +      queued = true;
> +    }
> +
>    /* If the compilation unit is already loaded, just mark it as
>       used.  */
>    dwarf2_cu *cu = per_objfile->get_cu (per_cu);
>    if (cu != nullptr)
> -    {
> -      cu->last_used = 0;
> -      return 0;
> -    }
> +    cu->last_used = 0;
>  
> -  /* Add it to the queue.  */
> -  queue_comp_unit (per_cu, per_objfile,  pretend_language);
> -
> -  return 1;
> +  /* Ask the caller to load the CU's DIEs if the CU got enqueued for expansion
> +     and the DIEs are not already loaded.  */
> +  return queued && cu == nullptr;
>  }
>  
>  /* Process the queue.  */
> @@ -23568,12 +23594,18 @@ follow_die_offset (sect_offset sect_off, int offset_in_dwz,
>  				 sect_offset_str (per_cu->sect_off),
>  				 per_objfile->get_cu (per_cu) != nullptr);
>  
> -      /* If necessary, add it to the queue and load its DIEs.  */
> -      if (maybe_queue_comp_unit (cu, per_cu, per_objfile, cu->language))
> +      /* If necessary, add it to the queue and load its DIEs.
> +
> +	 Even if maybe_queue_comp_unit doesn't require us to load the CU's DIEs,
> +	 it doesn't mean they are currently loaded.  Since we require them
> +	 to be loaded, we must check for ourselves.  */
> +      if (maybe_queue_comp_unit (cu, per_cu, per_objfile, cu->language)
> +	  || per_objfile->get_cu (per_cu) == nullptr)
>  	load_full_comp_unit (per_cu, per_objfile, per_objfile->get_cu (per_cu),
>  			     false, cu->language);
>  
>        target_cu = per_objfile->get_cu (per_cu);
> +      gdb_assert (target_cu != nullptr);
>      }
>    else if (cu->dies == NULL)
>      {
> @@ -23947,10 +23979,14 @@ follow_die_sig_1 (struct die_info *src_die, struct signatured_type *sig_type,
>       we can get here for DW_AT_imported_declaration where we need
>       the DIE not the type.  */
>  
> -  /* If necessary, add it to the queue and load its DIEs.  */
> +  /* If necessary, add it to the queue and load its DIEs.
>  
> +     Even if maybe_queue_comp_unit doesn't require us to load the CU's DIEs,
> +     it doesn't mean they are currently loaded.  Since we require them
> +     to be loaded, we must check for ourselves.  */
>    if (maybe_queue_comp_unit (*ref_cu, &sig_type->per_cu, per_objfile,
> -			     language_minimal))
> +			     language_minimal)
> +      || per_objfile->get_cu (&sig_type->per_cu) == nullptr)
>      read_signatured_type (sig_type, per_objfile);
>  
>    sig_cu = per_objfile->get_cu (&sig_type->per_cu);
> 


More information about the Gdb-patches mailing list