Does the LD --wrap feature work for library internal references?

Sebastian Huber sebastian.huber@embedded-brains.de
Fri Jan 25 13:55:00 GMT 2019


On 25/01/2019 09:37, Sebastian Huber wrote:
>
>
> On 23/01/2019 10:11, Sebastian Huber wrote:
>> On 18/01/2019 14:41, Alan Modra wrote:
>>> On Fri, Jan 18, 2019 at 10:04:36AM +0100, Sebastian Huber wrote:
>>>> On 18/01/2019 01:24, Alan Modra wrote:
>>>>> No, -ffunction-sections will make no difference.  Really, --wrap was
>>>>> intended for wrapping system functions, which I guess is why the
>>>>> feature was implemented only for undefined symbols.  I don't see a
>>>>> fundamental reason why --wrap couldn't be made to work with defined
>>>>> symbols, provided the compiler and assembler don't optimise
>>>>> references to local functions.
>>>> In case it is acceptable to extend --wrap to work also for defined
>>>> symbol references, then I would have a look at this and try to
>>>> implement it. If you already have an idea which functions need to be
>>>> touched for this new feature, then this would be helpful for me.
>>> It might just be a matter of calling bfd_wrapped_link_hash_lookup in
>>> more places where we currently call bfd_link_hash_lookup and its
>>> derivatives like elf_link_hash_lookup.  Knowing which places to change
>>> is the difficult part.  Anything involved with relocation processing
>>> for a start.
>>>
>>
>> I think that wrapping defined references requires changes in places
>> other than those which call bfd_link_hash_lookup() or
>> bfd_wrapped_link_hash_lookup(). I built ld with -O0 -g and used the
>> following test program:
>>
>> void f(void)
>> {
>> }
>>
>> void g(void);
>>
>> void _start(void)
>> {
>>     f();
>>     g();
>> }
>>
>> I use this GDB script:
>>
>> b bfd_wrapped_link_hash_lookup if string[0] != '.' && (string[0] == 'f' || string[7] == 'f' || string[0] == 'g' || string[7] == 'g')
>> commands
>> c
>> end
>> b bfd_link_hash_lookup if string[0] != '.' && (string[0] == 'f' || string[7] == 'f' || string[0] == 'g' || string[7] == 'g')
>> commands
>> c
>> end
>> r
>>
>> This yields:
>>
>> gdb --command=main.gdb --args ld-new -wrap=f -wrap=g main.o -o main.exe
>> Breakpoint 2, bfd_link_hash_lookup (table=0x8df4f0, string=0x8f2bd0 
>> "f", create=1, copy=0, follow=0) at ../../bfd/linker.c:511
>> 511       if (table == NULL || string == NULL)
>>
>> Breakpoint 2, bfd_link_hash_lookup (table=0x8df4f0, string=0x8f2bd2 
>> "_start", create=1, copy=0, follow=0) at ../../bfd/linker.c:511
>> 511       if (table == NULL || string == NULL)
>>
>> Breakpoint 1, bfd_wrapped_link_hash_lookup (abfd=0x8eec40, 
>> info=0x8c2de0 <link_info>, string=0x8f2bd9 "g", create=1, copy=0, 
>> follow=0) at ../../bfd/linker.c:541
>> 541       if (info->wrap_hash != NULL)
>>
>> Breakpoint 2, bfd_link_hash_lookup (table=0x8df4f0, string=0x8f35c0 
>> "__wrap_g", create=1, copy=1, follow=0) at ../../bfd/linker.c:511
>> 511       if (table == NULL || string == NULL)
>> /home/EB/sebastian_h/archive/binutils-git/build/ld/ld-new: main.o: in 
>> function `_start':
>> main.c:(.text+0x11): undefined reference to `__wrap_g'
>>
>> The lookup is performed only once for "f". I think this is when ld 
>> finds the definition of f(). For the call in _start() we end up in 
>> other areas of ld.
>>
>
> Wrapping of undefined symbol references is dealt with during
> load_symbols(), which is performed early during the link process.
>
> For defined symbol references we have to look at the relocations.
> During the final link, bfd_elf_final_link() calls elf_link_input_bfd(),
> which calls the architecture-specific elf_x86_64_relocate_section(), and
> we end up in (elf-bfd.h):
>
> /* This macro is to avoid lots of duplicated code in the body
>    of xxx_relocate_section() in the various elfxx-xxxx.c files. */
> #define RELOC_FOR_GLOBAL_SYMBOL(info, input_bfd, input_section, rel,    \
>                 r_symndx, symtab_hdr, sym_hashes,    \
>                 h, sec, relocation,            \
>                 unresolved_reloc, warned, ignored)    \
>   do                                    \
>     {                                    \
>       /* It seems this can happen with erroneous or unsupported     \
>      input (mixing a.out and elf in an archive, for example.)  */ \
>       if (sym_hashes == NULL)                        \
>     return FALSE;                            \
>                                     \
>       h = sym_hashes[r_symndx - symtab_hdr->sh_info]; \
>                                     \
>       if (info->wrap_hash != NULL                    \
>       && (input_section->flags & SEC_DEBUGGING) != 0)        \
>     h = ((struct elf_link_hash_entry *)                \
>          unwrap_hash_lookup (info, input_bfd, &h->root));     \
>
> Why is there a special case for
> (input_section->flags & SEC_DEBUGGING) != 0 here?
>
> If I extend this macro in case info->wrap_hash != NULL to do the 
> symbol wrapping, then there is an issue with the __real_SYMBOL 
> references which are no longer visible here. We only see SYMBOL and 
> __wrap_SYMBOL. The SYMBOL could be a defined reference or a former 
> __real_SYMBOL undefined reference.
>
> Would it be feasible to do the wrapping at the relocation step and
> not during load_symbols()?
>

With the attached quick and dirty hack I can wrap at the relocation
level. It still has a NULL pointer access in case of unresolved references.
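
To make the __real_SYMBOL issue from the quoted text concrete, here is a
minimal model of the --wrap renaming applied to undefined references. It is
only a sketch, not BFD code: wrapped[], is_wrapped() and wrap_name() are
hypothetical names made up for this illustration, and the wrap set is
hard-coded to the two symbols of the test program.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for the set of symbols given via --wrap. */
static const char *const wrapped[] = { "f", "g" };

/* Return 1 if NAME was given via --wrap in this model, else 0. */
static int is_wrapped(const char *name)
{
    for (size_t i = 0; i < sizeof wrapped / sizeof wrapped[0]; ++i)
        if (strcmp(wrapped[i], name) == 0)
            return 1;
    return 0;
}

/* Model of the renaming for an undefined reference to NAME:
     SYMBOL        -> __wrap_SYMBOL
     __real_SYMBOL -> SYMBOL
   Any other name is returned unchanged.  BUF is used only when a
   __wrap_ name has to be built. */
static const char *wrap_name(const char *name, char *buf, size_t len)
{
    static const char real_prefix[] = "__real_";
    const size_t n = sizeof real_prefix - 1;

    if (is_wrapped(name)) {
        snprintf(buf, len, "__wrap_%s", name);
        return buf;
    }
    if (strncmp(name, real_prefix, n) == 0 && is_wrapped(name + n))
        return name + n;
    return name;
}
```

In this model wrap_name("g", ...) yields "__wrap_g" and
wrap_name("__real_g", ...) yields "g". Once the renaming has been applied,
a former __real_g reference carries the same name as a direct reference to
a defined g, which is why the two cannot be told apart at relocation time.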

-- 
Sebastian Huber, embedded brains GmbH

Address : Dornierstr. 4, D-82178 Puchheim, Germany
Phone   : +49 89 189 47 41-16
Fax     : +49 89 189 47 41-09
E-Mail  : sebastian.huber@embedded-brains.de
PGP     : Public key available on request.

This message is not a business communication within the meaning of the EHUG.

-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0001-HACK-Do-LD-wrap-at-relocation-level.patch
Type: text/x-patch
Size: 5833 bytes
Desc: not available
URL: <https://sourceware.org/pipermail/binutils/attachments/20190125/c862e072/attachment.bin>

