This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.



Re: [PATCH] Compile elf/rtld.c with -fno-tree-loop-distribute-patterns.



On 25/11/2019 05:08, Florian Weimer wrote:
> * Sandra Loosemore:
> 
>>> Is it possible to do this via a pragma?  If I understand things
>>> correctly, this is not necessary for PI_STATIC_AND_HIDDEN targets
>>> (where initialization of the dynamic loader is simpler).
>>
>> I see that the definitions of memset and memmove use 
>> "inhibit_loop_to_libcall" (which expands into an optimize attribute) to 
>> prevent recursion, but I didn't think the header where that is defined 
>> (include/libc-symbols.h) is supposed to be included in the dynamic 
>> linker?
> 
> elf/rtld.os is built with -include ../include/libc-symbols.h, so the
> declaration should be in scope.  I don't know why it is not effective.
> It probably only applies to the implementations of memset and memmove
> themselves (if the generic ones written in C are used).
> 
>>  Also, already in elf/Makefile there is another instance where 
>> it adds -fno-tree-loop-distribute-patterns to the CFLAGS, so I just 
>> copied that.  I don't work with glibc internals enough to have a good 
>> feel for what the preferred solution is but I'll test a different 
>> solution if this one isn't good enough.
> 
> I had hoped we could write something like this at the start of
> elf/rtld.c:
> 
> #ifndef PI_STATIC_AND_HIDDEN
> # pragma GCC optimize ("no-tree-loop-distribute-patterns")
> #endif
> 
> Then the optimization would still be applied on the targets where it
> is safe to do so.
> 
> But I don't have a strong opinion about this and would appreciate
> feedback from others.
> 

We already have similar code to handle a similar issue:

  /* Partly clean the `bootstrap_map' structure up.  Don't use
     `memset' since it might not be built in or inlined and we cannot
     make function calls at this point.  Use '__builtin_memset' if we
     know it is available.  We do not have to clear the memory if we
     do not have to use the temporary bootstrap_map.  Global variables
     are initialized to zero by default.  */
#ifndef DONT_USE_BOOTSTRAP_MAP
# ifdef HAVE_BUILTIN_MEMSET
  __builtin_memset (bootstrap_map.l_info, '\0', sizeof (bootstrap_map.l_info));
# else
  for (size_t cnt = 0;
       cnt < sizeof (bootstrap_map.l_info) / sizeof (bootstrap_map.l_info[0]);
       ++cnt)
    bootstrap_map.l_info[cnt] = 0;
# endif
#endif

The HAVE_BUILTIN_MEMSET check in configure.ac tests whether
__builtin_memset itself calls memset.  The clearing is guarded by
DONT_USE_BOOTSTRAP_MAP mainly because bootstrap_map is stack allocated
for !PI_STATIC_AND_HIDDEN.
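
For illustration, here is a minimal sketch of the kind of per-function
guard that inhibit_loop_to_libcall provides.  This is not the glibc
source: NO_LOOP_TO_LIBCALL, struct demo_map, DEMO_NINFO and
demo_clear_info are made-up names, and the attribute spelling assumes
GCC.  The attribute only affects the function it is attached to, which
would be consistent with it not helping elsewhere in rtld.c.

```c
#include <stddef.h>

/* Assumption: GCC-style optimize attribute.  Other compilers get an
   empty macro and may still turn the loop into a memset call.  */
#if defined __GNUC__ && !defined __clang__
# define NO_LOOP_TO_LIBCALL \
  __attribute__ ((__optimize__ ("-fno-tree-loop-distribute-patterns")))
#else
# define NO_LOOP_TO_LIBCALL
#endif

/* Hypothetical stand-in for the l_info array of the bootstrap map.  */
#define DEMO_NINFO 64
struct demo_map
{
  void *l_info[DEMO_NINFO];
};

/* Clear the array with a plain loop.  With the attribute, GCC is asked
   not to recognize this loop as a memset pattern, so no call to memset
   should be generated for this function.  */
NO_LOOP_TO_LIBCALL
void
demo_clear_info (struct demo_map *map)
{
  for (size_t cnt = 0;
       cnt < sizeof (map->l_info) / sizeof (map->l_info[0]);
       ++cnt)
    map->l_info[cnt] = NULL;
}
```

Any other function in the same file that contains a similar loop would
need its own attribute (or a file-wide option) to get the same effect.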

(As a side note, I really think this kind of micro-optimization just
over-complicates the code for minimal gain.)

However, I am not sure there really is a restriction regarding
PI_STATIC_AND_HIDDEN and internal function calls.  Just after this memset
call, _dl_start has:

  if (bootstrap_map.l_addr || ! bootstrap_map.l_info[VALIDX(DT_GNU_PRELINKED)])
    {
      /* Relocate ourselves so we can do normal function calls and
         data access using the global offset table.  */

      ELF_DYNAMIC_RELOCATE (&bootstrap_map, 0, 0, 0);
    }
  bootstrap_map.l_relocated = 1;

So it should be safe to call memcpy/memset in _dl_start after this
point (as some ports with !PI_STATIC_AND_HIDDEN already do with memcpy).
Is gcc generating mem* calls before ld.so relocates itself?  If so,
where exactly?
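
To make the question more concrete, here is a rough sketch of the
constructs I would look for.  It is not taken from rtld.c: struct
demo_info, DEMO_SLOTS and the demo_* functions are made up, and the
pragma is the file-wide form suggested earlier in the thread (GCC only).
Note that the option only covers loops recognized as memset/memcpy
patterns; aggregate copies and large zero initializers may still be
expanded into library calls, depending on the target and the
optimization level.

```c
#include <stddef.h>

#if defined __GNUC__ && !defined __clang__
/* Applies to every function defined after this point in the file.  */
# pragma GCC optimize ("no-tree-loop-distribute-patterns")
#endif

/* Hypothetical structure, large enough that the compiler is unlikely
   to copy or clear it with a handful of inline stores.  */
#define DEMO_SLOTS 77
struct demo_info
{
  void *slot[DEMO_SLOTS];
};

/* Candidate 1: a zeroing loop.  Without the pragma, GCC may replace
   it with a call to memset.  */
void
demo_zero_loop (struct demo_info *p)
{
  for (size_t i = 0; i < DEMO_SLOTS; ++i)
    p->slot[i] = NULL;
}

/* Candidate 2: an aggregate copy.  This may become a memcpy call and
   is not affected by the pragma.  */
void
demo_copy (struct demo_info *dst, const struct demo_info *src)
{
  *dst = *src;
}

/* Candidate 3: zero initialization of a large automatic object.  This
   may become a memset call and is not affected by the pragma either.
   Returns the number of slots that are NULL after initialization.  */
size_t
demo_zero_init (void)
{
  struct demo_info tmp = { { NULL } };
  size_t n = 0;
  for (size_t i = 0; i < DEMO_SLOTS; ++i)
    n += tmp.slot[i] == NULL;
  return n;
}
```

Looking at the disassembly or the relocations of the object file should
show which of these, if any, ends up as a mem* call in the code that
runs before self-relocation.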

