A race condition can occur between pthread_key_delete() and __nptl_deallocate_tsd(). For instance, __nptl_deallocate_tsd() can start calling a destructor for a key immediately before pthread_key_delete() invalidates that key from another thread, and the destructor keeps executing after pthread_key_delete() has returned. From user code this looks as if the destructor runs after the key has already been removed by pthread_key_delete(), and there is no way to know whether the destructor has been called or has finished. Suggestion: take a pthread_rwlock_rdlock() in __nptl_deallocate_tsd() and a pthread_rwlock_wrlock() in pthread_key_delete().
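For illustration, a minimal sketch of the interleaving described above (not a reliable reproducer; the sleeps merely try to widen the race window):

```c
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

static pthread_key_t key;

static void destructor(void *value)
{
    /* Runs in the exiting thread, from __nptl_deallocate_tsd(). */
    usleep(2000);                 /* illustrative only: widen the window */
    free(value);
    puts("destructor finished");
}

static void *worker(void *arg)
{
    (void)arg;
    pthread_setspecific(key, malloc(16));
    return NULL;                  /* thread exit triggers the destructor */
}

int main(void)
{
    pthread_t t;

    pthread_key_create(&key, destructor);
    pthread_create(&t, NULL, worker, NULL);

    usleep(1000);                 /* try to land inside the destructor run */
    pthread_key_delete(key);      /* can return while `destructor` is still
                                     executing in the worker thread */
    puts("pthread_key_delete returned");

    pthread_join(t, NULL);
    return 0;
}
```

With an unlucky interleaving, "pthread_key_delete returned" is printed before "destructor finished", which is the behaviour reported here.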
Related to bugs 18136, 21032.
Not a bug in glibc, but a feature. :)
Reopened because it still needs discussion.
This "issue" has been encountered in pipewire[0]. The fact that destructors may be called after `tss_delete()` / `pthread_key_delete()` returns makes it quite hard to use. Is there maybe a simple to way to safely delete the TSS key, and then destroy all remaining objects that weren't processed by the TSS destructor? One solution appears to be moving the TSS key into a reference counted allocation and deleting the TSS key after the last thread that has data in that slot has exited. But this seems pretty suboptimal since it takes up one TSS slot completely unnecessarily for a potentially unbounded amount of time. Another one is building one's own TSS API on top, just like chromium does (although my reading of the code suggests that it has the same problem, but of course they can easily change it if it turns out to be an actual issue for them). Neither options seem appealing for pipewire, though. I also realize that neither POSIX nor the C standard says anything about this, still, I can't help but think that the current status quo makes this TSS API nigh unusable if one wants to actually avoid resource leaks. Unless I am (hopefully) missing something obvious? [0]: https://gitlab.freedesktop.org/pipewire/pipewire/-/issues/4356
The current glibc implementation treats keys as a very scarce resource; there are just 1024 of them in total. They are really not designed to be allocated on a per-data-structure basis. I don't know what pipewire tries to accomplish here. I found a reference to an issue, "multiple data loops vs. invokes": <https://gitlab.freedesktop.org/pipewire/pipewire/-/issues/3983> But there's no discussion there about the design of the commit that addresses this issue.
Yes, arguably pipewire's use of TSS slots is not optimal. But even if we assumed that pipewire wanted to allocate just a single slot in `pw_init()` (or at library load time), and free it in `pw_deinit()` (or at library unload time), the issue is still there.
(In reply to Barnabás Pőcze from comment #6)
> Yes, arguably pipewire's use of TSS slots is not optimal. But even if we
> assumed that pipewire wanted to allocate just a single slot in `pw_init()`
> (or at library load time), and free it in `pw_deinit()` (or at library
> unload time), the issue is still there.

Unloading can be disabled by linking with -Wl,-z,nodelete. That's appropriate for libraries which cannot safely be unloaded.

Keep in mind that pthread_key_delete will not be able to invoke key destructors on threads that have previously called pthread_setspecific. This aspect is not something that can change.

Based on the limited information I have, what the library tries to do looks like an anti-pattern to me: there's no proper lifetime management for some resource, so its lifetime is extended to the lifetime of the calling thread. That tends to have surprising consequences once thread pools are involved. We see this internally in glibc with our malloc arenas, for example. This is a conceptual problem; it persists even if we change pthread_key_delete.
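To make the pthread_setspecific point concrete, a small example (not from the discussion itself): per POSIX, once a key is deleted its destructor is no longer invoked at thread exit, so the worker's allocation below leaks unless the application frees it by other means.

```c
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>

static pthread_key_t key;
static pthread_barrier_t barrier;

static void destructor(void *value)
{
    puts("destructor ran");           /* never printed in this program */
    free(value);
}

static void *worker(void *arg)
{
    (void)arg;
    pthread_setspecific(key, malloc(16));
    pthread_barrier_wait(&barrier);   /* value has been stored */
    pthread_barrier_wait(&barrier);   /* key has been deleted */
    return NULL;                      /* no destructor runs: the allocation leaks */
}

int main(void)
{
    pthread_t t;

    pthread_barrier_init(&barrier, NULL, 2);
    pthread_key_create(&key, destructor);
    pthread_create(&t, NULL, worker, NULL);

    pthread_barrier_wait(&barrier);   /* worker has stored its value */
    pthread_key_delete(key);          /* deletes the key, invokes no destructor */
    pthread_barrier_wait(&barrier);

    pthread_join(t, NULL);
    return 0;
}
```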
The core issue here is how one is expected to safely use `pthread_key_delete()`. The current glibc and musl implementations do not synchronize destructor calls caused by thread termination with calls to `pthread_key_delete()`. This makes it possible for a destructor to be called in a terminating thread after `pthread_key_delete()` has returned.

For this reason, it is not feasible to delete the key and then iterate over a list of all thread-local objects to free the remaining ones, because that might race with thread termination and lead to double frees (see the sketch below).

So I am wondering what the expected usage of `pthread_key_delete()` is, because to me this behaviour seems to be a significant oversight in the current design, unless I am (hopefully) missing something.
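For clarity, this is the delete-then-sweep pattern being described, written out with hypothetical names (`struct obj`, `all_objects`, `library_deinit`); the comments mark where the race turns into a double free:

```c
#include <pthread.h>
#include <stdlib.h>

struct obj { struct obj *next; /* per-thread payload ... */ };

static pthread_key_t key;
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static struct obj *all_objects;            /* every object handed out so far */

static void list_unlink(struct obj *o)     /* caller holds `lock` */
{
    struct obj **p = &all_objects;
    while (*p != NULL && *p != o)
        p = &(*p)->next;
    if (*p == o)
        *p = o->next;
}

static void obj_destructor(void *p)        /* key destructor, runs at thread exit */
{
    struct obj *o = p;
    pthread_mutex_lock(&lock);
    list_unlink(o);                        /* may touch memory already freed
                                              by the sweep below */
    pthread_mutex_unlock(&lock);
    free(o);                               /* double free if the sweep freed o first */
}

void library_deinit(void)
{
    pthread_key_delete(key);               /* intended to mean "no destructor
                                              will run from now on" */

    /* Sweep whatever no destructor has freed. A thread that was already
       exiting when pthread_key_delete() returned can still enter
       obj_destructor() afterwards, racing with this loop. */
    pthread_mutex_lock(&lock);
    while (all_objects != NULL) {
        struct obj *o = all_objects;
        all_objects = o->next;
        free(o);
    }
    pthread_mutex_unlock(&lock);
}
```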