[PATCH RFC 5/5] Add DWARF index cache

Thu May 10 22:24:00 GMT 2018

>>>>> "Simon" == Simon Marchi <simon.marchi@ericsson.com> writes:

Simon> To help make using the DWARF index more transparent, this patch
Simon> introduces a DWARF index cache.  When enabled, loading an index-less
Simon> binary in GDB will automatically save an index file in ~/.cache/gdb.
Simon> When loading that same object file again, the index file will be looked
Simon> up and used to load the DWARF index.

Thanks for doing this.  I like this feature.

I read through the patch, but I didn't read the tests.  Some notes
below.

I'm sure you know, since it is an RFC, but the patch needs some more
comments here and there.

I have a wacky idea, too -- I'm testing a patch to move psymtabs out of
the objfile obstack and into a "self managed" object.  With this in
place, and because psymtabs are effectively immutable, it seems to me
that the index writing could be done in a background thread.  (I haven't
looked to see how the DWARF 5 index stuff works, I only really know
about .gdb_index, so this may not apply there.)

Simon> - The cache is just a directory with files named after the object
Simon>   file's build-id.  It is not possible to save/load the index for an
Simon>   object file without build-id in the cache.  Note that Clang does not
Simon>   add build-ids by default.

One thing that would be nice is also having some way to associate file
names with the build-ids.  Then gdb could notice when a build-id is
stale and unlink that cache entry.

For example you could have a symlink whose name is the some mangled form
of the objfile name, and whose target is the cache file.  Then if gdb
sees an objfile with that name but with a different link target, it can
remove the link target.  This might be a pain to implement in a non-racy
way, though.

I wonder if gdb should write out a short README into that directory when
it is first created.

Simon> - The size taken up by ~/.cache/gdb is not limited.  I was thinking we
Simon>   could add configurable limit (like ccache does), but that would come
Simon>   after.  Also, maybe a command to flush the cache.

Seems reasonable.

Simon> +/* A cheap (as in low-quality) recursive mkdir.  Try to create all the parents
Simon> +   directories up to DIR and DIR itself.  Stop if we hit an error along the way.
Simon> +   There is no attempt to remove created directories in case of failure.  */
Simon> +
Simon> +static void
Simon> +mkdir_recursive (const char *dir)

gnulib has a 'mkdir-p' module which could perhaps be used instead.

Simon> +index_cache_resource::~index_cache_resource () = default;

Simon> +struct index_cache_resource
Simon> +{
Simon> +  virtual ~index_cache_resource () = 0;
Simon> +};

I don't understand how to reconcile these two things - probably just
another bit of C++ knowledge I am missing.  Would you mind explaining
it?

Simon> +std::string
Simon> +index_cache::make_index_filename (const bfd_build_id *build_id,
Simon> +				  const char *suffix) const
Simon> +{
Simon> +  std::string build_id_str = build_id_to_string (build_id);
Simon> +
Simon> +  return string_printf ("%s%s%s%s", m_dir.c_str (), SLASH_STRING,
Simon> +			build_id_str.c_str (), suffix);

I think this particular one would be clearer just using "+".

Simon> +      /* Err...  I guess /tmp is better than nothing.  */
Simon> +      index_cache_directory = xstrdup ("/tmp");
Simon> +      global_index_cache.set_directory ("/tmp");

I tend to think nothing would be better.  Using /tmp means that gdb will
be reading files with (sometimes) predictable names that could be
created by anybody.  Better, IMO, to issue some sort of warning and
disable the feature.

Tom