[PATCH] Fix gdb.multi/base.exp, gdb memory corruption

Doug Evans dje@google.com
Mon May 19 21:52:00 GMT 2014


On Mon, May 19, 2014 at 1:39 PM, Doug Evans <dje@google.com> wrote:
> Hi.
>
> My prune_inferiors patch
>
> 2014-05-17  Doug Evans  <xdje42@gmail.com>
>
>         * inferior.c (prune_inferiors): Fix comment.
>         (remove_inferior_command): Call prune_program_spaces.
>
> is causing gdb.multi/base.exp to fail:
>
> UNRESOLVED: gdb.multi/base.exp: remove-inferiors 2-3
> UNRESOLVED: gdb.multi/base.exp: check remove-inferiors
>
> gdb is crashing because it's accessing/freeing already freed memory.
> I was running the testsuite all weekend, sorry I only saw this now.
>
> E.g.
> ==16368== Invalid read of size 4
> ==16368==    at 0x660A9D: find_pc_section (/usr/local/lib/debug/debug-src/obj64/gdb/../../binutils-gdb/gdb/objfiles.c:1349)
> ==16368==    by 0x663ECB: lookup_minimal_symbol_by_pc_section (/usr/local/lib/debug/debug-src/obj64/gdb/../../binutils-gdb/gdb/minsyms.c:734)
> ==16368==    by 0x5D987A: find_pc_sect_symtab (/usr/local/lib/debug/debug-src/obj64/gdb/../../binutils-gdb/gdb/symtab.c:2153)
> ==16368==    by 0x5D4D77: blockvector_for_pc_sect (/usr/local/lib/debug/debug-src/obj64/gdb/../../binutils-gdb/gdb/block.c:168)
> ==16368==    by 0x5D4F59: block_for_pc_sect (/usr/local/lib/debug/debug-src/obj64/gdb/../../binutils-gdb/gdb/block.c:246)
> ==16368==    by 0x5D4F9B: block_for_pc (/usr/local/lib/debug/debug-src/obj64/gdb/../../binutils-gdb/gdb/block.c:258)
> ==16368==    by 0x734C5D: inline_frame_sniffer (/usr/local/lib/debug/debug-src/obj64/gdb/../../binutils-gdb/gdb/inline-frame.c:218)
> ==16368==    by 0x732104: frame_unwind_try_unwinder (/usr/local/lib/debug/debug-src/obj64/gdb/../../binutils-gdb/gdb/frame-unwind.c:108)
> ==16368==    by 0x73223F: frame_unwind_find_by_frame (/usr/local/lib/debug/debug-src/obj64/gdb/../../binutils-gdb/gdb/frame-unwind.c:159)
> ==16368==    by 0x72D5AA: compute_frame_id (/usr/local/lib/debug/debug-src/obj64/gdb/../../binutils-gdb/gdb/frame.c:453)
> ==16368==    by 0x7300EC: get_prev_frame_if_no_cycle (/usr/local/lib/debug/debug-src/obj64/gdb/../../binutils-gdb/gdb/frame.c:1758)
> ==16368==    by 0x73079A: get_prev_frame_always (/usr/local/lib/debug/debug-src/obj64/gdb/../../binutils-gdb/gdb/frame.c:1931)
> ==16368==  Address 0x5b13500 is 16 bytes inside a block of size 24 free'd
> ==16368==    at 0x403072E: free (valgrind/coregrind/m_replacemalloc/vg_replace_malloc.c:445)
> ==16368==    by 0x762134: xfree (/usr/local/lib/debug/debug-src/obj64/gdb/../../binutils-gdb/gdb/common/common-utils.c:108)
> ==16368==    by 0x65DACF: objfiles_pspace_data_cleanup (/usr/local/lib/debug/debug-src/obj64/gdb/../../binutils-gdb/gdb/objfiles.c:91)
> ==16368==    by 0x75E546: program_spaceregistry_callback_adaptor (/usr/local/lib/debug/debug-src/obj64/gdb/../../binutils-gdb/gdb/progspace.c:45)
> ==16368==    by 0x7644F6: registry_clear_data (/usr/local/lib/debug/debug-src/obj64/gdb/../../binutils-gdb/gdb/registry.c:82)
> ==16368==    by 0x7645AB: registry_container_free_data (/usr/local/lib/debug/debug-src/obj64/gdb/../../binutils-gdb/gdb/registry.c:95)
> ==16368==    by 0x75E5B4: program_space_free_data (/usr/local/lib/debug/debug-src/obj64/gdb/../../binutils-gdb/gdb/progspace.c:45)
> ==16368==    by 0x75E9BA: release_program_space (/usr/local/lib/debug/debug-src/obj64/gdb/../../binutils-gdb/gdb/progspace.c:167)
> ==16368==    by 0x75EB9B: prune_program_spaces (/usr/local/lib/debug/debug-src/obj64/gdb/../../binutils-gdb/gdb/progspace.c:269)
> ==16368==    by 0x75303D: remove_inferior_command (/usr/local/lib/debug/debug-src/obj64/gdb/../../binutils-gdb/gdb/inferior.c:792)
> ==16368==    by 0x50B5FD: do_cfunc (/usr/local/lib/debug/debug-src/obj64/gdb/../../binutils-gdb/gdb/cli/cli-decode.c:107)
> ==16368==    by 0x50E6F2: cmd_func (/usr/local/lib/debug/debug-src/obj64/gdb/../../binutils-gdb/gdb/cli/cli-decode.c:1886)
>
> The problem originates from the get_current_arch call in
> py-progspace.c:py_free_pspace.
> I think the comment in this patch explains things pretty well.
> Basically, we're in the pspace destructor, and thus there's not much we can
> rely on.  The Python machinery needs an arch, so we give it one,
> albeit a fiction.  I think(!) it doesn't matter.
> I'm going with this fix because it's not clear to me what The Right fix
> is, short of removing global state in gdb which is non-trivial.
> Since "there is always an inferior" calling target_gdbarch seems
> pretty safe here.
>
> 2014-05-19  Doug Evans  <dje@google.com>
>
>         * python/py-progspace.c (py_free_pspace): Use target_gdbarch instead
>         of get_current_arch.
>
> diff --git a/gdb/python/py-progspace.c b/gdb/python/py-progspace.c
> index cda5a86..c787c69 100644
> --- a/gdb/python/py-progspace.c
> +++ b/gdb/python/py-progspace.c
> @@ -241,7 +241,16 @@ py_free_pspace (struct program_space *pspace, void *datum)
>  {
>    struct cleanup *cleanup;
>    pspace_object *object = datum;
> -  struct gdbarch *arch = get_current_arch ();
> +  /* This is a fiction, but we're in a nasty spot: The pspace is in the
> +     process of being deleted, we can't rely on anything in it.  Plus
> +     this is one time when the current program space and current inferior
> +     are not in sync.  The architecture comes from the inferior, which cannot
> +     be the current one because we wouldn't be deleting its pspace.
> +     We don't need to do much here so this fiction suffices.
> +     Note: We cannot call get_current_arch because it may try to access
> +     the target, which may involve accessing data in the pspace currently
> +     being deleted.  */
> +  struct gdbarch *arch = target_gdbarch ();
>
>    cleanup = ensure_python_env (arch, current_language);
>    object->pspace = NULL;

Hmmm...

I looked at py_free_inferior to see what it does:

---snip---
static void
py_free_inferior (struct inferior *inf, void *datum)
{

  struct cleanup *cleanup;
  inferior_object *inf_obj = datum;
  struct threadlist_entry *th_entry, *th_tmp;

  if (!gdb_python_initialized)
    return;

  cleanup = ensure_python_env (python_gdbarch, python_language);
---snip---

Just passing python_gdbarch back into ensure_python_env doesn't feel right.
One might try to get the arch from INF but it is being destructed and
we it would be preferable to not have to rely on it being available.
One solution would be to have a special "dummy" or some such arch for
use during these kinds of situations.

Plus I see python_on_normal_stop calling get_current_arch whereas
python_on_resume calling target_gdbarch.
There are no comments in the code explaining why things are the way
they are, thus it's hard to reason whether these choices have been
deliberately made or are essentially random.  It could be the case
that either choice is fine because it will never be used, but IWBN to
have that documented (or better yet just be consistent and document it
some place).

It feels like there's a good cleanup project here.
Am I missing something?



More information about the Gdb-patches mailing list