This is the mail archive of the mailing list for the glibc project.

Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH][BZ #19329] Fix race between tls allocation at thread creation and dlopen

On 12/01/16 12:20, Ilya Palachev wrote:
On 11.01.2016 19:46, Szabolcs Nagy wrote:
On 11/01/16 16:42, Szabolcs Nagy wrote:
On 11/01/16 15:48, Ilya Palachev wrote:
How can you prove that it is working if the assertion that was failing is now just deleted from the code?

i can only prove that the assertion is wrong by
analysing the code: the condition it verifies
cannot be enforced with the current design.

removing it is harmless since the slotinfo entries
are lazy initialized.

Thanks for explanation.
Do you mean that slotinfo entries are lazy initialized in _dl_update_slotinfo ?

actually this is not true if dlclose may be called
concurrently with pthread_create as i noted in my
new patch description.

i don't know how to protect against that.

Does _dl_update_slotinfo has support for updating DTV's of other threads in case when some thread called dlclose?

_dl_update_slotinfo (and tls_get_addr_tail) also walk
the slotinfo list without synchronization like

that is a worse case because it happens at tls access
time not at a library call (pthread_create) time.

they initialize the dtv of the current thread based
on the global slotinfo list and might allocate memory
for tls, resize the dtv and may also hold the global
dl lock.. which are all problematic operations.

(these functions should only do as-safe operations
and must not do anything that can fail since, unlike
a library call, a tls access cannot report failures)

i did not try to fix them.

i only attempted to fix the pthread_create vs dlopen case.

As I can see dlclose just sets map pointer to NULL and increments the generation counter:

1. In remove_slotinfo function:

           listp->slotinfo[idx - disp].gen = GL(dl_tls_generation) + 1;
           listp->slotinfo[idx - disp].map = NULL;

2. In _dl_close_worker function:
   /* If we removed any object which uses TLS bump the generation counter.  */
   if (any_tls)
       if (__glibc_unlikely (++GL(dl_tls_generation) == 0))
         _dl_fatal_printf ("TLS generation counter wrapped!  Please report as described in "REPORT_BUGS_TO".\n");

       if (tls_free_end == GL(dl_tls_static_used))
         GL(dl_tls_static_used) = tls_free_start;

You have just fixed such memory accesses with atomic_load_acquire and atomic_store_release.
Won't these atomics be enough to fix races with dlclose? Or am I missing something?

i was looking at that and thought the memory accesses
might be possible to sync (although it's a bit hairy),
however dlclose frees the link_map at the end and that's
not possible to sync.

after _dl_allocate_tls_init loads .map and .gen from the
slotinfo it accesses the map assuming that won't go away.

the current design in glibc is very broken, e.g. musl can
do tls initialization and access in a lockfree way because
its dlclose does not free the loaded library, if it did
then i think using the dl lock would be necessary.

however the dl lock cannot (easily) fix the tls access
case and currently it does not fix the pthread_create case
either since the dl lock is held during ctors are running
so user code can deadlock on it.

tl;dr: if your libc has non-noop dlclose it must use a
global lock in dlopen/dlclose/thread creation/tls access
and user code must not run while that lock is held
(e.g. signals must be blocked)

Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]