[PATCH v3] elf: Make more functions available for binding during dlclose (bug 30425)

Florian Weimer fweimer@redhat.com
Tue May 30 13:41:04 GMT 2023


* Szabolcs Nagy:

> The 05/30/2023 11:44, Florian Weimer via Libc-alpha wrote:
>> diff --git a/elf/dl-lookup.c b/elf/dl-lookup.c
>> index 05f36a2507..a8f48fed12 100644
>> --- a/elf/dl-lookup.c
>> +++ b/elf/dl-lookup.c
>> @@ -366,8 +366,25 @@ do_lookup_x (const char *undef_name, unsigned int new_hash,
>>        if ((type_class & ELF_RTYPE_CLASS_COPY) && map->l_type == lt_executable)
>>  	continue;
>>  
>> -      /* Do not look into objects which are going to be removed.  */
>> -      if (map->l_removed)
>> +      /* Do not look into objects which are going to be removed,
>> +	 except when the referencing object itself is being removed.
>> +
>> +	 The second part covers the situation when an object lazily
>> +	 binds to another object while running its destructor, but the
>> +	 destructor of the other object has already run, so that
>> +	 dlclose has set l_removed.  It may not always be obvious how
>> +	 to avoid such a scenario to programmers creating DSOs,
>> +	 particularly if C++ vague linkage is involved and triggers
>> +	 symbol interposition.
>> +
>> +	 Accepting these to-be-removed objects makes the lazy and
>> +	 BIND_NOW cases more similar.  (With BIND_NOW, the symbol is
>> +	 resolved early, before the destructor call, so the issue does
>> +	 not arise.).  Behavior matches the constructor scenario: the
>> +	 implementation allows binding to symbols of objects whose
>> +	 constructors have not run.  In fact, not doing this would be
>> +	 mostly incompatible with symbol interposition.  */
>> +      if (map->l_removed && !(undef_map != NULL && undef_map->l_removed))
>>  	continue;
>
> btw is there a valid use-case that goes wrong if the check is
> dropped completely? (keep binding to map when map->l_removed)

I think something like this is needed for useful diagnostics in
multi-threaded programs, where another thread might bind lazily to an
object that is under removal.  Usually, we'd record a relocation
dependency to prevent removal, but we can't do that once dlclose has
started for real.  So without the l_removed check, we proceed to bind
the symbol, and crash during a later call.  With the check and a
non-weak symbol, we terminate the process with an error message naming
the symbol at least.

But that suggests we should set l_removed even earlier, before invoking
ELF destructors.
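
For illustration, the condition from the quoted hunk can be modelled
outside of ld.so.  This is only a minimal sketch, not the glibc code:
struct map_model, skip_removed_map and check_cases are hypothetical
names, and only the l_removed flag and the boolean expression are taken
from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct link_map; it carries only the one
   flag that the quoted condition inspects.  */
struct map_model
{
  bool l_removed;   /* Set once dlclose has decided to unload the object.  */
};

/* Return true if the lookup should skip MAP while resolving a symbol
   referenced from UNDEF_MAP (which may be NULL).  This mirrors the
   condition in the quoted hunk.  */
static bool
skip_removed_map (const struct map_model *map,
                  const struct map_model *undef_map)
{
  return map->l_removed && !(undef_map != NULL && undef_map->l_removed);
}

/* Walk through the combinations discussed in this thread.  */
static void
check_cases (void)
{
  struct map_model live = { .l_removed = false };
  struct map_model dying = { .l_removed = true };

  /* Objects that are not being removed are always searched.  */
  assert (!skip_removed_map (&live, &live));
  assert (!skip_removed_map (&live, &dying));

  /* A live object (for example, another thread binding lazily) must
     not bind to an object under removal, so the lookup skips it.  */
  assert (skip_removed_map (&dying, &live));
  assert (skip_removed_map (&dying, NULL));

  /* An object that is itself being removed may still bind to another
     to-be-removed object, as in the destructor scenario.  */
  assert (!skip_removed_map (&dying, &dying));
}
```

In this model, dropping the check completely corresponds to
skip_removed_map always returning false, so the third and fourth cases
would bind into an object that is about to be unmapped.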

Thanks,
Florian


