[PATCH v3 2/8] Remove static buffer from ada_decode

Thu May 30 00:08:00 GMT 2019

On 5/29/19 10:29 PM, Tom Tromey wrote:
> ada_decode has a static buffer, which means it is not thread-safe.
> This patch removes the buffer and adds a parameter so that the storage
> can be managed by the caller.

Is there a reason the buffer is a unique_ptr instead of a std::string?
Or a reason that a std::string could be better?  Wondering
out loud --  the small string optimization crossed my mind, though
probably Ada symbol names rarely fit in it.

I wonder whether an interface like the below be preferred?

 class ada_decoder_storage
 {
 public:
   ada_decoder_storage () = default;

   /* May return a pointer to m_storage, therefore the result
      is only guaranteed valid for as long as this
      ada_decoder_storage object is live, or until the next
      decode call.  */
   const char *decode (const char *encoded);

 private:
   gdb::unique_xmalloc_ptr<char> m_storage;
 };

That would allow hiding the storage type from clients, like:

  ada_sniff_from_mangled_name (const char *mangled, char **out)
  {
 -  const char *demangled = ada_decode (mangled);
 +  ada_decoder_storage decoder;
 +  const char *demangled = decoder.decode (mangled);

With that, sometimes you wouldn't need to name a temp, like in:

      error (_("Could not find renamed variable: %s"),
	     ada_decode_storage ().decode (name));

Note: while I was working on the C++ wildmatching stuff a couple years
back, I ran into pretty simple C/C++ use case that consistently showed
ada_decode on top of perf profiles, I think coming from sniffing the
minsym's language.  I wrote some patches optimizing it, but in the end since
they weren't required for the wildmatching stuff, I didn't include them
in the upstream submission back then and never found the time to submit
them since:

 https://github.com/palves/gdb/commits/palves/ada-decode-speedups

Unfortunately apparently I didn't write down anywhere that the usecase
was...  And worse, that branch is causing regressions now, but I'm quite
sure that it didn't back then...  Oh well.  Just a FYI anyway.

The reason I'm bringing this up, is that your patch will introduce
heap allocations for the storage, and that reminded me of the desire to
optimize ada_decode.  I would hope that the buffer/storage
can be reused when we need to decode many symbols.

Anyway, again I don't have a testcase handy, so take it as a FYI.
Might be the recent-ish glibc malloc improvements manage to reuse
the same heap block in such case, making the issue moot in practice.

Thanks,
Pedro Alves

> 
> gdb/ChangeLog
> 2019-05-29  Tom Tromey  <tom@tromey.com>
> 
> 	* ada-varobj.c (ada_varobj_describe_simple_array_child): Update.
> 	* ada-lang.h (ada_decode): Add "storage" parameter.
> 	* ada-lang.c (ada_decode): Add "storage" parameter.
> 	(ada_decode_symbol, ada_la_decode, ada_sniff_from_mangled_name)
> 	(is_valid_name_for_wild_match, name_matches_regex)
> 	(ada_add_global_exceptions): Update.
> 	* ada-exp.y (write_object_renaming): Update.
> ---
>  gdb/ChangeLog    | 10 +++++++++
>  gdb/ada-exp.y    |  6 +++++-
>  gdb/ada-lang.c   | 55 ++++++++++++++++++++++++++++++------------------
>  gdb/ada-lang.h   |  3 ++-
>  gdb/ada-varobj.c |  3 ++-
>  5 files changed, 54 insertions(+), 23 deletions(-)
> 
> diff --git a/gdb/ada-exp.y b/gdb/ada-exp.y
> index f7ce27aca35..4771ec0ed56 100644
> --- a/gdb/ada-exp.y
> +++ b/gdb/ada-exp.y
> @@ -816,7 +816,11 @@ write_object_renaming (struct parser_state *par_state,
>  				 renamed_entity_len);
>    ada_lookup_encoded_symbol (name, orig_left_context, VAR_DOMAIN, &sym_info);
>    if (sym_info.symbol == NULL)
> -    error (_("Could not find renamed variable: %s"), ada_decode (name));
> +    {
> +      gdb::unique_xmalloc_ptr<char> storage;
> +      error (_("Could not find renamed variable: %s"),
> +	     ada_decode (name, &storage));
> +    }
>    else if (SYMBOL_CLASS (sym_info.symbol) == LOC_TYPEDEF)
>      /* We have a renaming of an old-style renaming symbol.  Don't
>         trust the block information.  */
> diff --git a/gdb/ada-lang.c b/gdb/ada-lang.c
> index 99c099aa07d..75325100dc9 100644
> --- a/gdb/ada-lang.c
> +++ b/gdb/ada-lang.c
> @@ -1106,22 +1106,18 @@ ada_remove_po_subprogram_suffix (const char *encoded, int *len)
>  
>  /* If ENCODED follows the GNAT entity encoding conventions, then return
>     the decoded form of ENCODED.  Otherwise, return "<%s>" where "%s" is
> -   replaced by ENCODED.
> -
> -   The resulting string is valid until the next call of ada_decode.
> -   If the string is unchanged by decoding, the original string pointer
> -   is returned.  */
> +   replaced by ENCODED.  */
>  
>  const char *
> -ada_decode (const char *encoded)
> +ada_decode (const char *encoded, gdb::unique_xmalloc_ptr<char> *storage)
>  {
>    int i, j;
>    int len0;
>    const char *p;
>    char *decoded;
>    int at_start_name;
> -  static char *decoding_buffer = NULL;
> -  static size_t decoding_buffer_size = 0;
> +  char *decoding_buffer = NULL;
> +  size_t decoding_buffer_size = 0;
>  
>    /* With function descriptors on PPC64, the value of a symbol named
>       ".FN", if it exists, is the entry point of the function "FN".  */
> @@ -1349,9 +1345,15 @@ ada_decode (const char *encoded)
>        goto Suppress;
>  
>    if (strcmp (decoded, encoded) == 0)
> -    return encoded;
> +    {
> +      xfree (decoding_buffer);
> +      return encoded;
> +    }
>    else
> -    return decoded;
> +    {
> +      storage->reset (decoded);
> +      return decoded;
> +    }
>  
>  Suppress:
>    GROW_VECT (decoding_buffer, decoding_buffer_size, strlen (encoded) + 3);
> @@ -1360,8 +1362,8 @@ Suppress:
>      strcpy (decoded, encoded);
>    else
>      xsnprintf (decoded, decoding_buffer_size, "<%s>", encoded);
> +  storage->reset (decoded);
>    return decoded;
> -
>  }
>  
>  /* Table for keeping permanent unique copies of decoded names.  Once
> @@ -1390,7 +1392,8 @@ ada_decode_symbol (const struct general_symbol_info *arg)
>  
>    if (!gsymbol->ada_mangled)
>      {
> -      const char *decoded = ada_decode (gsymbol->name);
> +      gdb::unique_xmalloc_ptr<char> storage;
> +      const char *decoded = ada_decode (gsymbol->name, &storage);
>        struct obstack *obstack = gsymbol->language_specific.obstack;
>  
>        gsymbol->ada_mangled = 1;
> @@ -1420,7 +1423,8 @@ ada_decode_symbol (const struct general_symbol_info *arg)
>  static char *
>  ada_la_decode (const char *encoded, int options)
>  {
> -  return xstrdup (ada_decode (encoded));
> +  gdb::unique_xmalloc_ptr<char> storage;
> +  return xstrdup (ada_decode (encoded, &storage));
>  }
>  
>  /* Implement la_sniff_from_mangled_name for Ada.  */
> @@ -1428,7 +1432,8 @@ ada_la_decode (const char *encoded, int options)
>  static int
>  ada_sniff_from_mangled_name (const char *mangled, char **out)
>  {
> -  const char *demangled = ada_decode (mangled);
> +  gdb::unique_xmalloc_ptr<char> storage;
> +  const char *demangled = ada_decode (mangled, &storage);
>  
>    *out = NULL;
>  
> @@ -6001,7 +6006,8 @@ is_name_suffix (const char *str)
>  static int
>  is_valid_name_for_wild_match (const char *name0)
>  {
> -  const char *decoded_name = ada_decode (name0);
> +  gdb::unique_xmalloc_ptr<char> storage;
> +  const char *decoded_name = ada_decode (name0, &storage);
>    int i;
>  
>    /* If the decoded name starts with an angle bracket, it means that
> @@ -6249,8 +6255,9 @@ ada_lookup_name_info::matches
>           is not a suitable completion.  */
>        const char *sym_name_copy = sym_name;
>        bool has_angle_bracket;
> +      gdb::unique_xmalloc_ptr<char> storage;
>  
> -      sym_name = ada_decode (sym_name);
> +      sym_name = ada_decode (sym_name, &storage);
>        has_angle_bracket = (sym_name[0] == '<');
>        match = (has_angle_bracket == m_verbatim_p);
>        sym_name = sym_name_copy;
> @@ -6272,12 +6279,14 @@ ada_lookup_name_info::matches
>  
>    /* Second: Try wild matching...  */
>  
> +  gdb::unique_xmalloc_ptr<char> sym_name_storage;
>    if (!match && m_wild_match_p)
>      {
>        /* Since we are doing wild matching, this means that TEXT
>           may represent an unqualified symbol name.  We therefore must
>           also compare TEXT against the unqualified name of the symbol.  */
> -      sym_name = ada_unqualified_name (ada_decode (sym_name));
> +      sym_name = ada_unqualified_name (ada_decode (sym_name,
> +						   &sym_name_storage));
>  
>        if (strncmp (sym_name, text, text_len) == 0)
>  	match = true;
> @@ -6293,7 +6302,10 @@ ada_lookup_name_info::matches
>        std::string &match_str = comp_match_res->match.storage ();
>  
>        if (!m_encoded_p)
> -	match_str = ada_decode (sym_name);
> +	{
> +	  gdb::unique_xmalloc_ptr<char> storage;
> +	  match_str = ada_decode (sym_name, &storage);
> +	}
>        else
>  	{
>  	  if (m_verbatim_p)
> @@ -13416,8 +13428,9 @@ ada_add_exceptions_from_frame (compiled_regex *preg,
>  static bool
>  name_matches_regex (const char *name, compiled_regex *preg)
>  {
> +  gdb::unique_xmalloc_ptr<char> storage;
>    return (preg == NULL
> -	  || preg->exec (ada_decode (name), 0, NULL, 0) == 0);
> +	  || preg->exec (ada_decode (name, &storage), 0, NULL, 0) == 0);
>  }
>  
>  /* Add all exceptions defined globally whose name name match
> @@ -13450,7 +13463,9 @@ ada_add_global_exceptions (compiled_regex *preg,
>  			   lookup_name_info::match_any (),
>  			   [&] (const char *search_name)
>  			   {
> -			     const char *decoded = ada_decode (search_name);
> +			     gdb::unique_xmalloc_ptr<char> storage;
> +			     const char *decoded = ada_decode (search_name,
> +							       &storage);
>  			     return name_matches_regex (decoded, preg);
>  			   },
>  			   NULL,
> diff --git a/gdb/ada-lang.h b/gdb/ada-lang.h
> index ff6c3399eaf..d0fc87c23a9 100644
> --- a/gdb/ada-lang.h
> +++ b/gdb/ada-lang.h
> @@ -227,7 +227,8 @@ extern struct type *ada_get_decoded_type (struct type *type);
>  
>  extern const char *ada_decode_symbol (const struct general_symbol_info *);
>  
> -extern const char *ada_decode (const char*);
> +extern const char *ada_decode (const char *,
> +			       gdb::unique_xmalloc_ptr<char> *storage);
>  
>  extern enum language ada_update_initial_language (enum language);
>  
> diff --git a/gdb/ada-varobj.c b/gdb/ada-varobj.c
> index a4d553d3786..29a4c86cc5f 100644
> --- a/gdb/ada-varobj.c
> +++ b/gdb/ada-varobj.c
> @@ -624,6 +624,7 @@ ada_varobj_describe_simple_array_child (struct value *parent_value,
>  	 of the array index type when such type qualification is
>  	 needed.  */
>        const char *index_type_name = NULL;
> +      gdb::unique_xmalloc_ptr<char> storage;
>  
>        /* If the index type is a range type, find the base type.  */
>        while (TYPE_CODE (index_type) == TYPE_CODE_RANGE)
> @@ -634,7 +635,7 @@ ada_varobj_describe_simple_array_child (struct value *parent_value,
>  	{
>  	  index_type_name = ada_type_name (index_type);
>  	  if (index_type_name)
> -	    index_type_name = ada_decode (index_type_name);
> +	    index_type_name = ada_decode (index_type_name, &storage);
>  	}
>  
>        if (index_type_name != NULL)
>