[PATCH v3] elf: Make more functions available for binding during dlclose (bug 30425)
Florian Weimer
fweimer@redhat.com
Tue May 30 13:41:04 GMT 2023
* Szabolcs Nagy:
> The 05/30/2023 11:44, Florian Weimer via Libc-alpha wrote:
>> diff --git a/elf/dl-lookup.c b/elf/dl-lookup.c
>> index 05f36a2507..a8f48fed12 100644
>> --- a/elf/dl-lookup.c
>> +++ b/elf/dl-lookup.c
>> @@ -366,8 +366,25 @@ do_lookup_x (const char *undef_name, unsigned int new_hash,
>> if ((type_class & ELF_RTYPE_CLASS_COPY) && map->l_type == lt_executable)
>> continue;
>>
>> - /* Do not look into objects which are going to be removed. */
>> - if (map->l_removed)
>> + /* Do not look into objects which are going to be removed,
>> + except when the referencing object itself is being removed.
>> +
>> + The second part covers the situation when an object lazily
>> + binds to another object while running its destructor, but the
>> + destructor of the other object has already run, so that
>> + dlclose has set l_removed. It may not always be obvious how
>> + to avoid such a scenario to programmers creating DSOs,
>> + particularly if C++ vague linkage is involved and triggers
>> + symbol interposition.
>> +
>> + Accepting these to-be-removed objects makes the lazy and
>> + BIND_NOW cases more similar. (With BIND_NOW, the symbol is
>> + resolved early, before the destructor call, so the issue does
>> + not arise.). Behavior matches the constructor scenario: the
>> + implementation allows binding to symbols of objects whose
>> + constructors have not run. In fact, not doing this would be
>> + mostly incompatible with symbol interposition. */
>> + if (map->l_removed && !(undef_map != NULL && undef_map->l_removed))
>> continue;
>
> btw is there a valid use-case that goes wrong if the check is
> dropped completely? (keep binding to map when map->l_removed)
I think something like this is needed for useful diagnostics in
multi-threaded programs, where another thread might bind lazily to an
object that is under removal. Usually, we'd record a relocation
dependency to prevent removal, but we can't do that once dlclose has
started for real. So without the l_removed check, we proceed to bind
the symbol, and crash during a later call. With the check and a
non-weak symbol, we at least terminate the process with an error
message naming the symbol.
But that suggests we should set l_removed even earlier, before invoking
ELF destructors.
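For illustration, the condition from the quoted hunk can be modeled in
isolation. This is only a sketch under stated assumptions: struct
model_map, skip_candidate and check_cases are hypothetical names, not
glibc code, and the real do_lookup_x loop does much more than this.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct link_map, reduced to the one flag
   discussed in this thread.  */
struct model_map
{
  bool l_removed;
};

/* Return true if MAP should be skipped as a definition candidate.
   UNDEF_MAP is the referencing object and may be NULL.  This mirrors
   the condition in the quoted hunk.  */
static bool
skip_candidate (const struct model_map *map,
                const struct model_map *undef_map)
{
  return map->l_removed && !(undef_map != NULL && undef_map->l_removed);
}

/* Walk through the cases discussed above.  */
static void
check_cases (void)
{
  struct model_map live = { false };
  struct model_map dying = { true };

  /* Ordinary lookup into an object that is not being removed.  */
  assert (!skip_candidate (&live, &live));

  /* Another thread's object, not under removal, binds lazily to an
     object under removal: skipped, so a non-weak symbol ends in a
     lookup error instead of a later crash.  */
  assert (skip_candidate (&dying, &live));

  /* No referencing object known: still skipped.  */
  assert (skip_candidate (&dying, NULL));

  /* Referencing object is itself under removal (destructor case):
     binding is allowed.  */
  assert (!skip_candidate (&dying, &dying));
}
```

Dropping the check completely would amount to skip_candidate always
returning false, which is the variant that loses the diagnostic in the
second case.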
Thanks,
Florian