This is the mail archive of the binutils@sourceware.org mailing list for the binutils project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH] Mach-O: Follow Apple's dSYM files


On Dec 26, 2011, at 6:17 PM, shinichiro hamaji wrote:

> Hi,
> 
> Here is a patch which reads debug information from .dSYM files in the
> same directory as the executable, dylib, or bundle binary:
> http://shinh.skr.jp/t/dsym.patch
> 
> To let find_line use debug BFD (in dSYM bundle) instead of the
> original executable's BFD, I used slightly different approach from
> gnu_debuglink. I factor out the part which reads debug info from
> find_line and use it from mach-o.c.
> 
> If we take an approach like gnu_debuglink, we may need to add a
> function pointer to bfd_target, say _bfd_follow_debug_bfd. I think
> this is kinda overkill considering this function just returns NULL for
> all other formats.
> 
> When I factor out _bfd_dwarf2_slurp_debug_info, I modified the order
> of operations in find_line. I hope this change is OK.
> 
> It seems Apple's GDB checks all files under
> "foo.dSYM/Contents/Resources/DWARF" directory for an executable "foo".
> This patch checks only "foo.dSYM/Contents/Resources/DWARF/foo" and
> leave this as a TODO. This is because 1) Apple's dsymutil (a linker
> for debug information) uses this filename by default and 2) dirent.h
> is not the standard header so I need a few autoconf changes. Even if
> we need it, I'd like to do this change in different patch as this
> patch is already getting large.
> 
> I ran "objdump -S" for bunch of mach-o/elf files and "make check". The
> results look OK. I'm not 100% sure if my decisions are the best. Any
> kind of suggestions will be really appreciated.

I think the approach is fine.  I can't approve the change in dwarf2.c, but it is mechanical.

Enclosed are minor comments.

> 
> bfd/
> 2011-12-27  Shinichiro Hamaji  <shinichiro.hamaji@gmail.com>
> 
> 	* dwarf2.c (_bfd_dwarf2_slurp_debug_info): Factor out the part
> 	which reads DWARF2 and stores in stash from find_line.
> 	(find_line) Call _bfd_dwarf2_slurp_debug_info.
> 	* libbfd-in.h (_bfd_dwarf2_slurp_debug_info): Add declaration.
> 	* libbfd.h (_bfd_dwarf2_slurp_debug_info): Regenerate.
> 	* mach-o.c (dsym_subdir): The name of subdir where debug
> 	information may be stored.
> 	(dsym_subdir_len): The length of dsym_subdir.
> 	(bfd_mach_o_lookup_uuid_command): New. Lookup a load command whose
> 	type is UUID.
> 	(bfd_mach_o_dsym_p): New. Check if the specified BFD is
> 	corresponding to the executable.
> 	(bfd_mach_o_find_macosx_dsym_in_fat): New. Find a debug information
> 	BFD in a FAT binary.
> 	(bfd_mach_o_find_macosx_dsym): New. Find a debug information BFD in
> 	the specified binary file.
> 	(bfd_mach_o_follow_macosx_dsym): New. Find a debug information BFD
> 	for the original BFD.
> 	(bfd_mach_o_find_nearest_line): Check dSYM files for Mach-O
> 	executables, dylibs, and bundles.
> 	(bfd_mach_o_close_and_cleanup): Clean up BFDs for the dSYM file.
> 	* mach-o.h (debug_filename): The filename of the dSYM file.
> 	(debug_bfd): The BFD of the dSYM file.
> 	(debug_fat_bfd): The BFD of the fat binary containing debug_bfd.

[…]

> diff --git a/bfd/mach-o.c b/bfd/mach-o.c
> index cc68d89..f46f94d 100644
> --- a/bfd/mach-o.c
> +++ b/bfd/mach-o.c
> @@ -277,6 +277,9 @@ static const mach_o_segment_name_xlat segsec_names_xlat[] =
>     { NULL, NULL }
>   };
> 
> +static const char dsym_subdir[] = ".dSYM/Contents/Resources/DWARF";
> +static const int dsym_subdir_len = sizeof(dsym_subdir);

I am not fan of defining dsym_subdir_len.

> +
> /* For both cases bfd-name => mach-o name and vice versa, the specific target
>    is checked before the generic.  This allows a target (e.g. ppc for cstring)
>    to override the generic definition with a more specific one.  */
> @@ -3738,6 +3741,152 @@ bfd_mach_o_core_file_failing_signal (bfd *abfd
> ATTRIBUTE_UNUSED)
>   return 0;
> }
> 
> +static bfd_mach_o_uuid_command *
> +bfd_mach_o_lookup_uuid_command (bfd *abfd)
> +{
> +  bfd_mach_o_load_command *uuid_cmd;
> +  int ncmd = bfd_mach_o_lookup_command (abfd, BFD_MACH_O_LC_UUID, &uuid_cmd);
> +  if (ncmd != 1)
> +    return FALSE;
> +  return &uuid_cmd->command.uuid;
> +}
> +
> +static bfd_boolean
> +bfd_mach_o_dsym_p (bfd *debug_bfd, bfd_mach_o_uuid_command *uuid_cmd)

Could you rename it to bfd_mach_o_dsym_for_uuid_p ?  Also, a comment before the definition is always welcome!


> +{
> +  bfd_mach_o_uuid_command *debug_uuid_cmd;
> +
> +  BFD_ASSERT (debug_bfd);
> +  BFD_ASSERT (uuid_cmd);
> +
> +  if (!bfd_check_format (debug_bfd, bfd_object))
> +    return FALSE;
> +
> +  if (bfd_get_flavour (debug_bfd) != bfd_target_mach_o_flavour)
> +    return FALSE;

I think you should also check for DSYM filetype.

> +
> +  debug_uuid_cmd = bfd_mach_o_lookup_uuid_command (debug_bfd);
> +  if (debug_uuid_cmd == NULL)
> +    return FALSE;
> +
> +  if (memcmp (uuid_cmd->uuid, debug_uuid_cmd->uuid,
> +              sizeof (uuid_cmd->uuid)) != 0)
> +    return FALSE;
> +
> +  return TRUE;
> +}
> +
> +static bfd *
> +bfd_mach_o_find_macosx_dsym_in_fat (bfd *fat_bfd,
> +                                    bfd_mach_o_uuid_command *uuid_cmd)

I think you should omit the _macosx_ part of the name.
Also, you may find bfd_mac_o_fat_extract useful.

> +{
> +  bfd *debug_bfd = NULL, *last_bfd = NULL;
> +
> +  BFD_ASSERT (fat_bfd);
> +  BFD_ASSERT (uuid_cmd);
> +
> +  for (;;)
> +    {
> +      debug_bfd = bfd_mach_o_openr_next_archived_file (fat_bfd, debug_bfd);
> +      if (debug_bfd == NULL)
> +        break;
> +
> +      if (bfd_mach_o_dsym_p (debug_bfd, uuid_cmd))
> +        break;
> +
> +      if (last_bfd != NULL)
> +        bfd_close (last_bfd);
> +      last_bfd = debug_bfd;
> +    }
> +
> +  if (last_bfd != NULL)
> +    bfd_close (last_bfd);
> +  return debug_bfd;
> +}
> +
> +static bfd *
> +bfd_mach_o_find_macosx_dsym (bfd *abfd,
> +                             bfd_mach_o_uuid_command *uuid_cmd,
> +                             char *debug_filename)

Ditto for _macosx_ in function name.
A comment would be very welcome (with a short description for each argument).

> +{
> +  bfd *debug_bfd;
> +  bfd_mach_o_data_struct *mdata;
> +
> +  BFD_ASSERT (abfd);
> +  BFD_ASSERT (uuid_cmd);
> +
> +  debug_bfd = bfd_openr (debug_filename, NULL);
> +  if (debug_bfd == NULL)
> +    return NULL;
> +
> +  mdata = bfd_mach_o_get_data (abfd);
> +

That's where bfd_mach_o_fat_extract has to be used (IMHO).

> +  if (bfd_check_format (debug_bfd, bfd_archive))
> +    {
> +      bfd *r = bfd_mach_o_find_macosx_dsym_in_fat (debug_bfd, uuid_cmd);
> +      if (r)
> +        {
> +          mdata->debug_filename = debug_filename;
> +          mdata->debug_bfd = r;
> +          mdata->debug_fat_bfd = debug_bfd;
> +        }
> +      return r;
> +    }
> +
> +  if (bfd_mach_o_dsym_p (debug_bfd, uuid_cmd))
> +    {
> +      mdata->debug_filename = debug_filename;
> +      mdata->debug_bfd = debug_bfd;
> +      return debug_bfd;
> +    }

It would be nice not to store debug_filename and debug_bfd here, but in a caller.  That would make it easier to reuse bfd_mach_o_follow_dsym by gdb.

> +
> +  bfd_close (debug_bfd);
> +
> +  return NULL;
> +}
> +
> +static bfd *
> +bfd_mach_o_follow_macosx_dsym (bfd *abfd)
> +{

Ditto for _macosx_ and the comment.

> +  char *debug_filename;
> +  bfd_mach_o_uuid_command *uuid_cmd;
> +  bfd *debug_bfd, *base_bfd = abfd;
> +  const char *base_basename;
> +
> +  if (abfd == NULL || bfd_get_flavour (abfd) != bfd_target_mach_o_flavour)
> +    return NULL;
> +
> +  if (abfd->my_archive)
> +    base_bfd = abfd->my_archive;
> +  /* BFD may have been opened from a stream. */
> +  if (base_bfd->filename == NULL)
> +    {
> +      bfd_set_error (bfd_error_invalid_operation);
> +      return NULL;
> +    }
> +  base_basename = lbasename (base_bfd->filename);
> +
> +  uuid_cmd = bfd_mach_o_lookup_uuid_command (abfd);
> +  if (uuid_cmd == NULL)
> +    return NULL;
> +
> +  /* TODO: We assume the DWARF file has the same as the binary's.
> +     It seems apple's GDB checks all files in the dSYM bundle directory.
> +     http://opensource.apple.com/source/gdb/gdb-1708/src/gdb/macosx/macosx-tdep.c
> +  */
> +  debug_filename = (char *)bfd_malloc (strlen (base_bfd->filename)
> +                                       + dsym_subdir_len + 1
> +                                       + strlen (base_basename) + 1);
> +  sprintf (debug_filename, "%s%s/%s",
> +           base_bfd->filename, dsym_subdir, base_basename);
> +
> +  debug_bfd = bfd_mach_o_find_macosx_dsym (abfd, uuid_cmd, debug_filename);
> +  if (debug_bfd == NULL)
> +    free (debug_filename);
> +
> +  return debug_bfd;
> +}
> +
> bfd_boolean
> bfd_mach_o_find_nearest_line (bfd *abfd,
> 			      asection *section,
> @@ -3748,9 +3897,30 @@ bfd_mach_o_find_nearest_line (bfd *abfd,
> 			      unsigned int *line_ptr)
> {
>   bfd_mach_o_data_struct *mdata = bfd_mach_o_get_data (abfd);
> -  /* TODO: Handle executables and dylibs by using dSYMs. */
> -  if (mdata->header.filetype != BFD_MACH_O_MH_OBJECT)
> +  if (mdata == NULL)
>     return FALSE;
> +  switch (mdata->header.filetype)
> +    {
> +    case BFD_MACH_O_MH_OBJECT:
> +      break;
> +    case BFD_MACH_O_MH_EXECUTE:
> +    case BFD_MACH_O_MH_DYLIB:
> +    case BFD_MACH_O_MH_BUNDLE:
> +    case BFD_MACH_O_MH_KEXT_BUNDLE:
> +      if (mdata->dwarf2_find_line_info == NULL)
> +        {
> +          bfd *debug_bfd = bfd_mach_o_follow_macosx_dsym (abfd);
> +          if (! debug_bfd)
> +            return FALSE;
> +          if (! _bfd_dwarf2_slurp_debug_info (abfd, debug_bfd,
> +                                              dwarf_debug_sections, symbols,
> +                                              &mdata->dwarf2_find_line_info))
> +            return FALSE;
> +        }
> +      break;
> +    default:
> +      return FALSE;
> +    }
>   if (_bfd_dwarf2_find_nearest_line (abfd, dwarf_debug_sections,
> 				     section, symbols, offset,
> 				     filename_ptr, functionname_ptr,
> @@ -3768,6 +3938,21 @@ bfd_mach_o_close_and_cleanup (bfd *abfd)
>     {
>       _bfd_dwarf2_cleanup_debug_info (abfd, &mdata->dwarf2_find_line_info);
>       bfd_mach_o_free_cached_info (abfd);
> +      if (mdata->debug_bfd != NULL)
> +        {
> +          bfd_close (mdata->debug_bfd);
> +          mdata->debug_bfd = NULL;
> +        }
> +      if (mdata->debug_fat_bfd != NULL)
> +        {
> +          bfd_close (mdata->debug_fat_bfd);
> +          mdata->debug_fat_bfd = NULL;
> +        }
> +      if (mdata->debug_filename != NULL)
> +        {
> +          free (mdata->debug_filename);
> +          mdata->debug_filename = NULL;
> +        }
>     }
> 
>   return _bfd_generic_close_and_cleanup (abfd);
> diff --git a/bfd/mach-o.h b/bfd/mach-o.h
> index 89dce1a..c94dd55 100644
> --- a/bfd/mach-o.h
> +++ b/bfd/mach-o.h
> @@ -520,6 +520,13 @@ typedef struct mach_o_data_struct
>   /* A place to stash dwarf2 info for this bfd.  */
>   void *dwarf2_find_line_info;
> 
> +  /* Filename of .dSYM file. */
> +  char *debug_filename;
> +  /* BFD of .dSYM file. */
> +  bfd *debug_bfd;
> +  /* BFD of a fat binary which contains debug_bfd. */
> +  bfd *debug_fat_bfd;

I'd prefer to use the 'dsym_' prefix instead of 'debug_'.
There is no need to save debug_filename, just free it when the BFD is closed.
Same for  debug_fat_bfd.

> +
>   /* Cache of dynamic relocs. */
>   arelent *dyn_reloc_cache;
> }

Thank you for this patch,
Tristan.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]