[PATCH v2 3/8] Add support for non-contiguous blocks to find_pc_partial_function

Simon Marchi simon.marchi@ericsson.com
Tue Aug 21 15:57:00 GMT 2018


On 2018-08-13 08:04 PM, Kevin Buettner wrote:
> This change adds an optional output parameter BLOCK to
> find_pc_partial_function.  If BLOCK is non-null, then *BLOCK will be
> set to the address of the block corresponding to the function symbol
> if such a symbol was found during lookup.  Otherwise it's set to a
> NULL value.  Callers may wish to use the block information to
> determine whether the block contains any non-contiguous ranges.  The
> caller may also iterate over or examine those ranges.
> 
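For anyone following along, here is a rough sketch of how a caller might
use the new BLOCK output (the range accessors are the ones added earlier
in this series; the surrounding code is illustrative only, not taken from
the patch):

  const char *name;
  CORE_ADDR addr, endaddr;
  const struct block *block;

  /* Ask for the block along with the usual outputs.  */
  if (find_pc_partial_function (pc, &name, &addr, &endaddr, &block)
      && block != NULL
      && !BLOCK_CONTIGUOUS_P (block))
    {
      /* The function occupies more than one address range; walk them.  */
      for (int i = 0; i < BLOCK_NRANGES (block); i++)
        {
          CORE_ADDR start = BLOCK_RANGE_START (block, i);
          CORE_ADDR end = BLOCK_RANGE_END (block, i);
          /* ... examine [start, end) ...  */
        }
    }
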
> When I first started looking at the broken stepping behavior associated
> with functions w/ non-contiguous ranges, I found that I could "fix"
> the problem by disabling the find_pc_partial_function cache.  It would
> sometimes happen that the PC passed in would be between the low and
> high cache values, but would be in some other function that happens to
> be placed in between the ranges for the cached function.  This caused
> incorrect values to be returned.
> 
> So dealing with this cache turns out to be very important for fixing
> this problem.  I explored three different ways of dealing with the cache.
> 
> My first approach was to clear the cache when a block was encountered
> with more than one range.  This would cause the non-cache pathway to
> be executed on the next call to find_pc_partial_function.
> 
> Another approach, which I suspect is slightly faster, checks to see
> whether the PC is within one of the ranges associated with the cached
> block.  If so, then the cached values can be used.  It falls back to
> the original behavior if there is no cached block.
> 
> The current approach, suggested by Simon Marchi, is to restrict the
> low/high pc values recorded for the cache to the beginning and end of
> the range containing the PC value under consideration.  This allows us
> to retain the simple (and fast) test for determining whether the
> memoized (cached) values apply to the PC passed to
> find_pc_partial_function.
> 
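In other words, the cheap memoization test itself doesn't need to change;
roughly (a sketch, not the patch text):

  /* The cached low/high now bound only the range that contained the
     previous PC, so this containment check stays correct even when the
     function's block covers several disjoint ranges.  */
  if (mapped_pc >= cache_pc_function_low
      && mapped_pc < cache_pc_function_high
      && section == cache_pc_function_section)
    {
      /* ... return the memoized name/low/high as before ...  */
    }
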
> Another choice that had to be made regards setting *ADDRESS and
> *ENDADDR.  There are three possibilities which might make sense:
> 
> 1) *ADDRESS and *ENDADDR represent the lowest and highest address
>    of the function.
> 2) *ADDRESS and *ENDADDR are set to the start and end address of
>    the range containing the entry pc.
> 3) *ADDRESS and *ENDADDR are set to the start and end address of
>    the range in which PC is found.
> 
> An earlier version of this patch implemented option #1.  I found out
> that it's not very useful though and, in fact, returns results that
> are incorrect when used in the context of determining the start and
> end of the function for doing prologue analysis.  While debugging a
> function in which the entry pc was in the second range (of a function
> containing two non-contiguous ranges), I noticed that
> amd64_skip_prologue called find_pc_partial_function - the returned
> start address was set to the beginning of the first range.  This is
> incorrect for this function.  What was also interesting was that this
> first invocation of find_pc_partial_function correctly set the cache
> for the PC on which it had been invoked, but a slightly later call
> from skip_prologue_using_sal could not use this cached value because
> it was now being used to lookup the very lowest address of the
> function - which is in a range not containing the entry pc.
> 
> Option #2 is attractive as it would provide a desirable result
> when used in the context of prologue analysis.  However, many callers,
> including some which do prologue analysis, want the condition
> *ADDRESS <= PC < *ENDADDR to hold.  This will not be the case when
> find_pc_partial_function is called on a PC that's in a non-entry-pc
> range.  A later patch in this series adds find_pc_partial_entry_range
> as a wrapper of find_pc_partial_function.
> 
> Option #3 causes the *ADDRESS <= PC < *ENDADDR property to hold.  If
> find_pc_partial_function is called with a PC that's within entry pc's
> range, then it will correctly return the limits of that range.  So, if
> the result of a minsym search is passed to find_pc_partial_function
> to find the limits, then correct results will be achieved.  Returned
> limits (for prologue analysis) won't be correct when PC is within some
> other (non-entry-pc) range.  I don't yet know how big of a problem
> this might be; I'm guessing that it won't be a serious problem - if a
> compiler generates functions which have non-contiguous ranges, then it
> also probably generates DWARF2 CFI which makes a lot of the old
> prologue analysis moot.
> 
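To make the trade-off concrete, here is a made-up example (addresses
invented purely for illustration):

  foo occupies [0x1000, 0x1040) and [0x2000, 0x2080);
  its entry pc is 0x2000, i.e. in the second range.

  find_pc_partial_function (pc = 0x1010, ...) would yield:
    option #1: *ADDRESS = 0x1000, *ENDADDR = 0x2080  (full extent)
    option #2: *ADDRESS = 0x2000, *ENDADDR = 0x2080  (entry-pc range;
                                                      pc lies outside it)
    option #3: *ADDRESS = 0x1000, *ENDADDR = 0x1040  (range containing pc)
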
> I've implemented option #3 for this version of the patch.  I don't see
> any regressions for x86-64.  Moreover, I don't expect to see
> regressions for other targets either simply because
> find_pc_partial_function behaves the same as it did before for the
> contiguous address range case.  That said, there may be some
> adjustments needed if GDB encounters a function which requires prologue
> analysis and also occupies non-contiguous ranges.

LGTM, just some nits.

As I mentioned previously, cache_pc_function_{low,high} could be renamed
to reflect that they now represent the low/high addresses of the range.
If you think it's not necessary, that's fine too, but I just want to make
sure the comment didn't fall through the cracks.

> @@ -228,17 +219,64 @@ find_pc_partial_function (CORE_ADDR pc, const char **name, CORE_ADDR *address,
>    if (compunit_symtab != NULL)
>      {
>        /* Checking whether the msymbol has a larger value is for the
> -	 "pathological" case mentioned in print_frame_info.  */
> +	 "pathological" case mentioned in stack.c:find_frame_funname.
> +
> +	 We use BLOCK_ENTRY_PC instead of BLOCK_START_PC for this
> +	 comparison because the minimal symbol should refer to the
> +	 function's entry pc which is not necessarily the lowest
> +	 address of the function.  This will happen when the function
> +	 has more than one range and the entry pc is not within the
> +	 lowest range of addresses.  */
>        f = find_pc_sect_function (mapped_pc, section);
>        if (f != NULL
>  	  && (msymbol.minsym == NULL
> -	      || (BLOCK_START (SYMBOL_BLOCK_VALUE (f))
> +	      || (BLOCK_ENTRY_PC (SYMBOL_BLOCK_VALUE (f))
>  		  >= BMSYMBOL_VALUE_ADDRESS (msymbol))))
>  	{
> -	  cache_pc_function_low = BLOCK_START (SYMBOL_BLOCK_VALUE (f));
> -	  cache_pc_function_high = BLOCK_END (SYMBOL_BLOCK_VALUE (f));
> +	  const struct block *b = SYMBOL_BLOCK_VALUE (f);
> +
> +	  cache_pc_function_low = BLOCK_START (b);
> +	  cache_pc_function_high = BLOCK_END (b);

I think these last two lines are unnecessary, because we are guaranteed to set them
further down in any case.
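
That is, I'd expect the non-contiguous handling to end up looking something
like the following, which sets the cache in both branches (a rough sketch
using the range accessors from the earlier patches, not the exact patch
text):

	  if (BLOCK_CONTIGUOUS_P (b))
	    {
	      cache_pc_function_low = BLOCK_START (b);
	      cache_pc_function_high = BLOCK_END (b);
	    }
	  else
	    {
	      /* Cache only the bounds of the range containing MAPPED_PC.  */
	      for (int i = 0; i < BLOCK_NRANGES (b); i++)
		if (BLOCK_RANGE_START (b, i) <= mapped_pc
		    && mapped_pc < BLOCK_RANGE_END (b, i))
		  {
		    cache_pc_function_low = BLOCK_RANGE_START (b, i);
		    cache_pc_function_high = BLOCK_RANGE_END (b, i);
		    break;
		  }
	    }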

> @@ -292,12 +331,15 @@ find_pc_partial_function (CORE_ADDR pc, const char **name, CORE_ADDR *address,
>  	     then add one to that.  */
>  
>  	  *endaddr = 1 + overlay_unmapped_address (cache_pc_function_high - 1,
> -						   section);
> +	                                           section);

Spurious change here.

Simon


