This is the mail archive of the mailing list for the glibc project.

Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PR19826] fix non-LE TLS in static programs

On Sat, 2016-09-24 at 00:44 -0300, Alexandre Oliva wrote:
> In this patch, I add a long comment explaining why we can't modify
> another threads' DTV while initializing its static TLS block during
> dlopen.

I appreciate that you added a comment.  Given that you seemed to have
paged in all of this currently, it would be even better if you could try
to further improve the documentation of all of this.  (I suppose missing
the issue you fix here would have been less likely if proper
documentation existed before.)  Having the patch being reviewed by
someone who's not deeply familiar with the TLS implementation would be
good, as it shows whether this really makes the code more accessible
(and thus reducing the bus factor...).

Are you aware of any synchronization in the TLS stuff that is not based
on locks?  That is, were there are data races as defined by C11?
If so, it would be great if you could transform them to using the
new-style C11 atomics.  I've reviewed nscd concurrency recently and
converted it to use C11 atomics, and this process revealed many
synchronization bugs such as missing barriers.

> +     (**) If we were to write to other threads' DTV entries here, we
> +     might race with their DTV resizing.  We might then write to
> +     stale, possibly already free()d and reallocated memory.  Or we
> +     could write past the end of the still-active DTV.  So, don't do
> +     that!

It's okay to point out why data races are bad, but you probably don't
need it.  I think we have consensus that every data race is bad and
should be fixed (with very few exceptions, which then need lengthy
explanations).  The general understanding should be that there's no such
thing as a benign data race.

> +     Another race, not to be mistaken for the above, has to do with
> +     the TLS block initialized below.  Although we perform this
> +     initialization while holding the rtld lock, once dlopen
> +     completes, other threads might run code that accesses the
> +     variables in these TLS blocks, without taking the rtld lock or
> +     any other explicit memory acquire.

What is a "memory acquire"?

(If you should be assuming that acquire operations synchronize-with
every other release operation such as a lock release, then this is
wrong; remember that it is pairs of release/acquire action that are
linked by a reads-from, so both release and acquire must target the same
memory location.)

> +     Our implementation appears to assume that, in order for the other
> +     threads to access those variables, there will be some form of
> +     synchronization between the end of dlopen() in one thread and the
> +     use of the just-loaded modules in others, otherwise the very use
> +     of the loaded code could be undefined.  */

I'd reverse that: I find it misleading that you start that there is a
race condition, when in fact this seems not to be the case because of
the assumption.  Therefore, I'd mention the assumption first, and then
derive from that there is no data-race and that uses of the TLS will
find it initialized.

Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]