This is the mail archive of the
mailing list for the glibc project.
Re: [BZ#17090/17620/17621]: fix DTV race, assert, and DTV_SURPLUS Static TLS limit
- From: Alexandre Oliva <aoliva at redhat dot com>
- To: Roland McGrath <roland at hack dot frob dot com>
- Cc: libc-alpha at sourceware dot org
- Date: Wed, 19 Nov 2014 17:56:45 -0200
- Subject: Re: [BZ#17090/17620/17621]: fix DTV race, assert, and DTV_SURPLUS Static TLS limit
- References: <ormw7ol1sf dot fsf at free dot home> <20141118203338 dot ECA5F2C3B25 at topped-with-meat dot com> <ord28kkvpq dot fsf at free dot home> <20141118224048 dot 600312C3B23 at topped-with-meat dot com>
On Nov 18, 2014, Roland McGrath <email@example.com> wrote:
>> No, but I can confirm that, after this change, td_thr_tlsbase may return
>> as much garbage for Static TLS modules as the current code may for
>> dynamic TLS modules, since it doesn't check generation counts.
> That sounds like you're saying your change causes a regression, which
> incidentally has the same failure mode as an existing bug for a
> different circumstance.
The lack of docs prevents me from concluding it's a regression, rather
than a change in behavior within the intended boundaries, and the lack
of special-casing for Static TLS in the implementation doesn't enable me
to conclude it's a different circumstance, but I'd trade run-time TLS
failures, memory corruption and memory leaks for more frequent garbage
out of nptl_db any time.
Anyway, if we had to choose between the two, I'd easily go with the
patch that makes this tradeoff. (Not that I had *meant* the patch as a
trade-off, mind you. I didn't realize nptl_db could be affected by this
patch, so I only looked into it after you pointed it out. So I thank
you for catching this, regardless of whether we agree on whether the
change in behavior is a regression or a legitimate change within the
(un?)defined boundaries of behavior for that function.)
Fortunately, we don't have to make that choice; we can have the fix for
the TLS bugs, and we can have reliable TLS accessors in nptl_db. There
is a caveat, however: we need some means to find a module's DTV
generation and Static TLS offset, given a modid. The offset can be
obtained from the link_map, whereas the generation and the link_map
pointer can be obtained from the dtv_slotinfo_list. However, we have no
immediate means to get to either of these data structures.
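
For illustration only, the lookup described above (modid to slotinfo
entry to generation, then through the map pointer to the Static TLS
offset) might look roughly like the sketch below. The sketch_* types are
simplified stand-ins, not the real glibc declarations, and the code walks
local memory directly, whereas nptl_db would have to read everything out
of the inferior. Treating the modid as a direct index across the chained
chunks is an assumption of this model.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical marker for "module has no Static TLS offset assigned". */
#define SKETCH_NO_STATIC_TLS ((ptrdiff_t) -1)

/* Simplified model of the parts of a link_map this lookup needs. */
struct sketch_link_map {
  size_t tls_modid;      /* models the module's TLS modid */
  ptrdiff_t tls_offset;  /* models the Static TLS offset, if any */
};

/* Simplified model of one slotinfo entry: generation plus map pointer. */
struct sketch_slotinfo {
  size_t gen;
  struct sketch_link_map *map;
};

/* Simplified model of one chunk of the chained slotinfo list. */
struct sketch_slotinfo_list {
  size_t len;
  struct sketch_slotinfo_list *next;
  struct sketch_slotinfo *slots;
};

/* Look up generation and Static TLS offset for MODID.
   Returns 1 on success, 0 if no module is registered under MODID. */
static int
sketch_lookup (const struct sketch_slotinfo_list *list, size_t modid,
               size_t *gen, ptrdiff_t *offset)
{
  size_t idx = modid;

  /* Skip whole chunks until the index falls inside the current one. */
  while (list != NULL && idx >= list->len)
    {
      idx -= list->len;
      list = list->next;
    }

  if (list == NULL || list->slots[idx].map == NULL)
    return 0;

  *gen = list->slots[idx].gen;
  *offset = list->slots[idx].map->tls_offset;
  return 1;
}
```

The open question in this message, namely how nptl_db gets hold of the
head of the list in the first place, is deliberately left out of the
sketch; it takes the head as a parameter.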
The main difficulty I'm yet to address (the rest is already done; I'll
gladly share the WIP patch if anyone's interested) is that
_dl_tls_dtv_slotinfo_list is defined under different names for static
and dynamic programs, a stand-alone symbol for the former and a
_rtld_global member in the latter, and we can't assume from say dynamic
nptl whether we're gonna be dlopened by a static libdl, or by the
dynamic ld.so, and a libpthread.a might be linked into a dynamic
executable, or even into a dlopened library. I'm not sure which, if
any, of these cases we regard as unsupported, but I see difficulties in
handling them all.
Plus, nptl_db seems to be only equipped to look up symbols in libpthread
proper, and nothing in there links back to _rtld_global or its
corresponding standalone symbols.
I see a number of possibilities to overcome this:
- add a symbol to libpthread that, during early initialization, is set
to point to the head of the dtv slotinfo list. I'm not sure yet how it
would be initialized though to work under all the situations above.
- add to link_maps a pointer to the corresponding slotinfo entry, so
that we can get ahold of the generation starting from the link_map.
This would enable at least td_thr_tls_get_addr to DTRT; td_thr_tlsbase,
taking only modids, would require some means to map those back to
link_maps to DTRT.
- iterate over the link_map list to locate a module with the looked-for
modid. Horribly inefficient, but lacking other means, this could be a
viable last resort. We'd still need gen counts added to link_maps.
- special-case the _dl_tls_dtv_slotinfo_list lookup so that we can find
it both as a member of _rtld_global, defined in ld.so, and as a
stand-alone symbol defined in the main static executable. (Can
ps_pglobal_lookup be generally used to look up a symbol globally, if
given a NULL object_name or somesuch? I can't find documentation for
this interface, and we only use it to look up symbols in libpthread_so.
Plus, what if we can't get to the slotinfo_list, say, because the static
executable is stripped and it hasn't exported this dynamic symbol we
- other possibilities I haven't considered?
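
As a rough illustration of the "iterate over the link_map list" option
above, the linear search could be sketched as follows. The sketch_map
type is a simplified stand-in, not the real struct link_map, and its gen
field is hypothetical: it models the generation count that would still
have to be added to link_maps for this approach to work.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for a link_map list node (not the glibc type). */
struct sketch_map {
  struct sketch_map *next;  /* next loaded module */
  size_t tls_modid;         /* 0 means the module has no TLS in this model */
  size_t gen;               /* hypothetical: generation count added to the map */
};

/* Linear search for the module registered under MODID.
   Cost is proportional to the number of loaded modules. */
static const struct sketch_map *
sketch_find_by_modid (const struct sketch_map *head, size_t modid)
{
  if (modid == 0)
    return NULL;  /* 0 never names a TLS module in this model */

  for (const struct sketch_map *m = head; m != NULL; m = m->next)
    if (m->tls_modid == modid)
      return m;

  return NULL;
}
```

In nptl_db every step of that walk would be a read from the inferior,
which is where the inefficiency comes from.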
> td_thr_tlsbase should return success and yield the correct pointer in
> any circumstance where the TLS block for the module and thread requested
> has already been initialized. It should fail with TD_TLSDEFER only when
> the thread could not possibly have observed any values in that TLS
> block. That way, the debugger can fall back to showing initial values
> from the PT_TLS segment (and refusing attempts to mutate) for the
> TD_TLSDEFER case, and never fail to make the values the program will
> actually see available to the user of the debugger.
This sounds like a very reasonable goal. Is this documented anywhere?
If not, may I paste your paragraph as a comment next to the function
definition? Or to its declaration?
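
To make sure I read the contract quoted above correctly, here is a
minimal sketch of both sides of it. All names are hypothetical
(SKETCH_OK and SKETCH_TLSDEFER stand in for the success and TD_TLSDEFER
results), and "block already initialized" is reduced to a non-NULL
pointer, which glosses over exactly the generation checks under
discussion.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the success and TD_TLSDEFER results. */
enum sketch_status { SKETCH_OK, SKETCH_TLSDEFER };

/* What the debugger ends up showing to its user. */
struct sketch_view {
  const unsigned char *data;  /* bytes to display */
  int writable;               /* nonzero if mutation may be allowed */
};

/* Library side: succeed whenever the block has been initialized,
   defer only when the thread cannot have observed any values. */
static enum sketch_status
sketch_tlsbase (void *block_if_initialized, void **base)
{
  if (block_if_initialized == NULL)
    return SKETCH_TLSDEFER;
  *base = block_if_initialized;
  return SKETCH_OK;
}

/* Debugger side: on deferral, fall back to the read-only initial
   image (modelling the PT_TLS segment contents). */
static struct sketch_view
sketch_tls_view (void *block_if_initialized, const unsigned char *init_image)
{
  struct sketch_view view;
  void *base = NULL;

  if (sketch_tlsbase (block_if_initialized, &base) == SKETCH_OK)
    {
      view.data = base;
      view.writable = 1;
    }
  else
    {
      view.data = init_image;
      view.writable = 0;
    }
  return view;
}
```

Either way the user sees the values the program itself would see; only
the deferred case refuses writes.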
Alexandre Oliva, freedom fighter http://FSFLA.org/~lxoliva/
You must be the change you wish to see in the world. -- Gandhi
Be Free! -- http://FSFLA.org/ FSF Latin America board member
Free Software Evangelist|Red Hat Brasil GNU Toolchain Engineer