[PATCH] Mach-O: Follow Apple's dSYM files
Tristan Gingold
gingold@adacore.com
Mon Jan 2 10:07:00 GMT 2012
On Dec 26, 2011, at 6:17 PM, shinichiro hamaji wrote:
> Hi,
>
> Here is a patch which reads debug information from .dSYM files in the
> same directory as the executable, dylib, or bundle binary:
> http://shinh.skr.jp/t/dsym.patch
>
> To let find_line use debug BFD (in dSYM bundle) instead of the
> original executable's BFD, I used slightly different approach from
> gnu_debuglink. I factor out the part which reads debug info from
> find_line and use it from mach-o.c.
>
> If we take an approach like gnu_debuglink, we may need to add a
> function pointer to bfd_target, say _bfd_follow_debug_bfd. I think
> this is kinda overkill considering this function just returns NULL for
> all other formats.
>
> When I factor out _bfd_dwarf2_slurp_debug_info, I modified the order
> of operations in find_line. I hope this change is OK.
>
> It seems Apple's GDB checks all files under
> "foo.dSYM/Contents/Resources/DWARF" directory for an executable "foo".
> This patch checks only "foo.dSYM/Contents/Resources/DWARF/foo" and
> leave this as a TODO. This is because 1) Apple's dsymutil (a linker
> for debug information) uses this filename by default and 2) dirent.h
> is not the standard header so I need a few autoconf changes. Even if
> we need it, I'd like to do this change in different patch as this
> patch is already getting large.
>
> I ran "objdump -S" for bunch of mach-o/elf files and "make check". The
> results look OK. I'm not 100% sure if my decisions are the best. Any
> kind of suggestions will be really appreciated.
I think the approach is fine. I can't approve the change in dwarf2.c, but it is mechanical.
Enclosed are minor comments.
>
> bfd/
> 2011-12-27 Shinichiro Hamaji <shinichiro.hamaji@gmail.com>
>
> * dwarf2.c (_bfd_dwarf2_slurp_debug_info): Factor out the part
> which reads DWARF2 and stores in stash from find_line.
> (find_line) Call _bfd_dwarf2_slurp_debug_info.
> * libbfd-in.h (_bfd_dwarf2_slurp_debug_info): Add declaration.
> * libbfd.h (_bfd_dwarf2_slurp_debug_info): Regenerate.
> * mach-o.c (dsym_subdir): The name of subdir where debug
> information may be stored.
> (dsym_subdir_len): The length of dsym_subdir.
> (bfd_mach_o_lookup_uuid_command): New. Lookup a load command whose
> type is UUID.
> (bfd_mach_o_dsym_p): New. Check if the specified BFD is
> corresponding to the executable.
> (bfd_mach_o_find_macosx_dsym_in_fat): New. Find a debug information
> BFD in a FAT binary.
> (bfd_mach_o_find_macosx_dsym): New. Find a debug information BFD in
> the specified binary file.
> (bfd_mach_o_follow_macosx_dsym): New. Find a debug information BFD
> for the original BFD.
> (bfd_mach_o_find_nearest_line): Check dSYM files for Mach-O
> executables, dylibs, and bundles.
> (bfd_mach_o_close_and_cleanup): Clean up BFDs for the dSYM file.
> * mach-o.h (debug_filename): The filename of the dSYM file.
> (debug_bfd): The BFD of the dSYM file.
> (debug_fat_bfd): The BFD of the fat binary containing debug_bfd.
[…]
> diff --git a/bfd/mach-o.c b/bfd/mach-o.c
> index cc68d89..f46f94d 100644
> --- a/bfd/mach-o.c
> +++ b/bfd/mach-o.c
> @@ -277,6 +277,9 @@ static const mach_o_segment_name_xlat segsec_names_xlat[] =
> { NULL, NULL }
> };
>
> +static const char dsym_subdir[] = ".dSYM/Contents/Resources/DWARF";
> +static const int dsym_subdir_len = sizeof(dsym_subdir);
I am not fan of defining dsym_subdir_len.
> +
> /* For both cases bfd-name => mach-o name and vice versa, the specific target
> is checked before the generic. This allows a target (e.g. ppc for cstring)
> to override the generic definition with a more specific one. */
> @@ -3738,6 +3741,152 @@ bfd_mach_o_core_file_failing_signal (bfd *abfd
> ATTRIBUTE_UNUSED)
> return 0;
> }
>
> +static bfd_mach_o_uuid_command *
> +bfd_mach_o_lookup_uuid_command (bfd *abfd)
> +{
> + bfd_mach_o_load_command *uuid_cmd;
> + int ncmd = bfd_mach_o_lookup_command (abfd, BFD_MACH_O_LC_UUID, &uuid_cmd);
> + if (ncmd != 1)
> + return FALSE;
> + return &uuid_cmd->command.uuid;
> +}
> +
> +static bfd_boolean
> +bfd_mach_o_dsym_p (bfd *debug_bfd, bfd_mach_o_uuid_command *uuid_cmd)
Could you rename it to bfd_mach_o_dsym_for_uuid_p ? Also, a comment before the definition is always welcome!
> +{
> + bfd_mach_o_uuid_command *debug_uuid_cmd;
> +
> + BFD_ASSERT (debug_bfd);
> + BFD_ASSERT (uuid_cmd);
> +
> + if (!bfd_check_format (debug_bfd, bfd_object))
> + return FALSE;
> +
> + if (bfd_get_flavour (debug_bfd) != bfd_target_mach_o_flavour)
> + return FALSE;
I think you should also check for DSYM filetype.
> +
> + debug_uuid_cmd = bfd_mach_o_lookup_uuid_command (debug_bfd);
> + if (debug_uuid_cmd == NULL)
> + return FALSE;
> +
> + if (memcmp (uuid_cmd->uuid, debug_uuid_cmd->uuid,
> + sizeof (uuid_cmd->uuid)) != 0)
> + return FALSE;
> +
> + return TRUE;
> +}
> +
> +static bfd *
> +bfd_mach_o_find_macosx_dsym_in_fat (bfd *fat_bfd,
> + bfd_mach_o_uuid_command *uuid_cmd)
I think you should omit the _macosx_ part of the name.
Also, you may find bfd_mac_o_fat_extract useful.
> +{
> + bfd *debug_bfd = NULL, *last_bfd = NULL;
> +
> + BFD_ASSERT (fat_bfd);
> + BFD_ASSERT (uuid_cmd);
> +
> + for (;;)
> + {
> + debug_bfd = bfd_mach_o_openr_next_archived_file (fat_bfd, debug_bfd);
> + if (debug_bfd == NULL)
> + break;
> +
> + if (bfd_mach_o_dsym_p (debug_bfd, uuid_cmd))
> + break;
> +
> + if (last_bfd != NULL)
> + bfd_close (last_bfd);
> + last_bfd = debug_bfd;
> + }
> +
> + if (last_bfd != NULL)
> + bfd_close (last_bfd);
> + return debug_bfd;
> +}
> +
> +static bfd *
> +bfd_mach_o_find_macosx_dsym (bfd *abfd,
> + bfd_mach_o_uuid_command *uuid_cmd,
> + char *debug_filename)
Ditto for _macosx_ in function name.
A comment would be very welcome (with a short description for each argument).
> +{
> + bfd *debug_bfd;
> + bfd_mach_o_data_struct *mdata;
> +
> + BFD_ASSERT (abfd);
> + BFD_ASSERT (uuid_cmd);
> +
> + debug_bfd = bfd_openr (debug_filename, NULL);
> + if (debug_bfd == NULL)
> + return NULL;
> +
> + mdata = bfd_mach_o_get_data (abfd);
> +
That's where bfd_mach_o_fat_extract has to be used (IMHO).
> + if (bfd_check_format (debug_bfd, bfd_archive))
> + {
> + bfd *r = bfd_mach_o_find_macosx_dsym_in_fat (debug_bfd, uuid_cmd);
> + if (r)
> + {
> + mdata->debug_filename = debug_filename;
> + mdata->debug_bfd = r;
> + mdata->debug_fat_bfd = debug_bfd;
> + }
> + return r;
> + }
> +
> + if (bfd_mach_o_dsym_p (debug_bfd, uuid_cmd))
> + {
> + mdata->debug_filename = debug_filename;
> + mdata->debug_bfd = debug_bfd;
> + return debug_bfd;
> + }
It would be nice not to store debug_filename and debug_bfd here, but in a caller. That would make it easier to reuse bfd_mach_o_follow_dsym by gdb.
> +
> + bfd_close (debug_bfd);
> +
> + return NULL;
> +}
> +
> +static bfd *
> +bfd_mach_o_follow_macosx_dsym (bfd *abfd)
> +{
Ditto for _macosx_ and the comment.
> + char *debug_filename;
> + bfd_mach_o_uuid_command *uuid_cmd;
> + bfd *debug_bfd, *base_bfd = abfd;
> + const char *base_basename;
> +
> + if (abfd == NULL || bfd_get_flavour (abfd) != bfd_target_mach_o_flavour)
> + return NULL;
> +
> + if (abfd->my_archive)
> + base_bfd = abfd->my_archive;
> + /* BFD may have been opened from a stream. */
> + if (base_bfd->filename == NULL)
> + {
> + bfd_set_error (bfd_error_invalid_operation);
> + return NULL;
> + }
> + base_basename = lbasename (base_bfd->filename);
> +
> + uuid_cmd = bfd_mach_o_lookup_uuid_command (abfd);
> + if (uuid_cmd == NULL)
> + return NULL;
> +
> + /* TODO: We assume the DWARF file has the same as the binary's.
> + It seems apple's GDB checks all files in the dSYM bundle directory.
> + http://opensource.apple.com/source/gdb/gdb-1708/src/gdb/macosx/macosx-tdep.c
> + */
> + debug_filename = (char *)bfd_malloc (strlen (base_bfd->filename)
> + + dsym_subdir_len + 1
> + + strlen (base_basename) + 1);
> + sprintf (debug_filename, "%s%s/%s",
> + base_bfd->filename, dsym_subdir, base_basename);
> +
> + debug_bfd = bfd_mach_o_find_macosx_dsym (abfd, uuid_cmd, debug_filename);
> + if (debug_bfd == NULL)
> + free (debug_filename);
> +
> + return debug_bfd;
> +}
> +
> bfd_boolean
> bfd_mach_o_find_nearest_line (bfd *abfd,
> asection *section,
> @@ -3748,9 +3897,30 @@ bfd_mach_o_find_nearest_line (bfd *abfd,
> unsigned int *line_ptr)
> {
> bfd_mach_o_data_struct *mdata = bfd_mach_o_get_data (abfd);
> - /* TODO: Handle executables and dylibs by using dSYMs. */
> - if (mdata->header.filetype != BFD_MACH_O_MH_OBJECT)
> + if (mdata == NULL)
> return FALSE;
> + switch (mdata->header.filetype)
> + {
> + case BFD_MACH_O_MH_OBJECT:
> + break;
> + case BFD_MACH_O_MH_EXECUTE:
> + case BFD_MACH_O_MH_DYLIB:
> + case BFD_MACH_O_MH_BUNDLE:
> + case BFD_MACH_O_MH_KEXT_BUNDLE:
> + if (mdata->dwarf2_find_line_info == NULL)
> + {
> + bfd *debug_bfd = bfd_mach_o_follow_macosx_dsym (abfd);
> + if (! debug_bfd)
> + return FALSE;
> + if (! _bfd_dwarf2_slurp_debug_info (abfd, debug_bfd,
> + dwarf_debug_sections, symbols,
> + &mdata->dwarf2_find_line_info))
> + return FALSE;
> + }
> + break;
> + default:
> + return FALSE;
> + }
> if (_bfd_dwarf2_find_nearest_line (abfd, dwarf_debug_sections,
> section, symbols, offset,
> filename_ptr, functionname_ptr,
> @@ -3768,6 +3938,21 @@ bfd_mach_o_close_and_cleanup (bfd *abfd)
> {
> _bfd_dwarf2_cleanup_debug_info (abfd, &mdata->dwarf2_find_line_info);
> bfd_mach_o_free_cached_info (abfd);
> + if (mdata->debug_bfd != NULL)
> + {
> + bfd_close (mdata->debug_bfd);
> + mdata->debug_bfd = NULL;
> + }
> + if (mdata->debug_fat_bfd != NULL)
> + {
> + bfd_close (mdata->debug_fat_bfd);
> + mdata->debug_fat_bfd = NULL;
> + }
> + if (mdata->debug_filename != NULL)
> + {
> + free (mdata->debug_filename);
> + mdata->debug_filename = NULL;
> + }
> }
>
> return _bfd_generic_close_and_cleanup (abfd);
> diff --git a/bfd/mach-o.h b/bfd/mach-o.h
> index 89dce1a..c94dd55 100644
> --- a/bfd/mach-o.h
> +++ b/bfd/mach-o.h
> @@ -520,6 +520,13 @@ typedef struct mach_o_data_struct
> /* A place to stash dwarf2 info for this bfd. */
> void *dwarf2_find_line_info;
>
> + /* Filename of .dSYM file. */
> + char *debug_filename;
> + /* BFD of .dSYM file. */
> + bfd *debug_bfd;
> + /* BFD of a fat binary which contains debug_bfd. */
> + bfd *debug_fat_bfd;
I'd prefer to use the 'dsym_' prefix instead of 'debug_'.
There is no need to save debug_filename, just free it when the BFD is closed.
Same for debug_fat_bfd.
> +
> /* Cache of dynamic relocs. */
> arelent *dyn_reloc_cache;
> }
Thank you for this patch,
Tristan.
More information about the Binutils
mailing list