Fixing a scalability issue in OpenSSL error reporting

Ondřej Bílka
Tue Jun 16 16:46:00 GMT 2015

On Tue, Jun 16, 2015 at 03:22:30PM +0200, Florian Weimer wrote:
> OpenSSL has its own implementation of thread-local variables (using a
> few global locks and a hash table indexed by the address of errno), and
> the error state is needed in a few places even if no error occurs which
> is visible at the application level.  This turns out to be a major
> scaling issue if you have more than a few hundred OpenSSL connections in
> one process (I don't know if it applies to servers as well).
> Unlike errno, the error state is not of fixed size, so ideally, a
> deallocation function would run, releasing the state if the error
> information isn't collected before the thread terminates.
> The OpenSSL implementation has a function to deallocate the error state
> of *another* thread, but that's obviously racy and would have to be
> turned into a NOP.
> I have come up with several potential approaches:
> (a) Use __thread (or C++11 thread_local with a POD) and do not add a
> deallocation function.  The downside is a potential memory leak, as
> mentioned above.  The existing code already has this problem, though.
> Advantage is good portability to older GNU toolchain versions.
> (b) Use pthread_setspecific and related functions.  This should offer
> even better portability, and there is a destructor function which can
> deallocate memory.  The downside is that it currently requires linking
> against libpthread, which is something I want OpenSSL to cease doing.  A
> fully portable solution with pthread_once may lack performance, and
> portable atomics more or less require a C++11 compiler outside of the
> GNU ecosystem.
> (c) Use C++11 thread_local.  This requires linking against libstdc++.  I
> don't know if this could have adverse consequences, comparable to
> linking against libpthread.  Portability will increase over time,
> something that seems unlikely for (a) and (b).
> Solutions involving C++11 might be a difficult sell for OpenSSL
> upstream, but I prefer it over reimplementing TLS destructors from
> scratch.  The old OpenSSL TLS implementation would still stay around, so
> perhaps it's acceptable to compile just one file with a C++11 compiler.
>  That's why I'm leaning towards (c), but I'm not sure about the impact
> of the libstdc++ dependency.
You write that it doesn't have a fixed size. What is the exact format?
Couldn't you just bound it at 256 bytes or something like that?
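
To make the bounding idea concrete, here is a minimal sketch, assuming a
fixed 256-byte record held in a __thread variable. The struct layout and
the names err_state, err_set and err_get are hypothetical, not OpenSSL's
actual error queue format. Messages longer than the buffer are simply
truncated.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical upper bound on the per-thread error record. */
#define ERR_STATE_MAX 256

/* Hypothetical fixed-size record: an error code plus a bounded message. */
struct err_state {
    int  code;
    char msg[ERR_STATE_MAX - sizeof(int)];
};

/* One record per thread, zero-initialized, no allocation needed. */
static __thread struct err_state tls_err;

/* Store an error in the calling thread's record, truncating the message. */
static void err_set(int code, const char *msg)
{
    size_t n;

    assert(msg != NULL);
    n = strlen(msg);
    if (n >= sizeof tls_err.msg)
        n = sizeof tls_err.msg - 1;
    memcpy(tls_err.msg, msg, n);
    tls_err.msg[n] = '\0';
    tls_err.code = code;
}

/* Return the calling thread's record. */
static const struct err_state *err_get(void)
{
    return &tls_err;
}
```

With a fixed-size record like this there is nothing to free when the
thread exits, so no destructor is needed at all.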

Also, why do you want to avoid pthread? You need to call pthread_key_create
somewhere. Couldn't you first check with
dlsym (RTLD_DEFAULT, "pthread_key_create") whether threads are in use, and
register a destructor only when they are? Use a __thread variable for the
state anyway, and let the destructor free it.
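
A rough sketch of that idea, assuming glibc (RTLD_DEFAULT needs
_GNU_SOURCE, and older glibc versions need -ldl for dlsym). All names
here (err_state, err_init, err_get, err_state_free) are made up for
illustration, and err_init is assumed to run once before any thread calls
err_get. The __thread pointer gives the fast lookup. The pthread key
exists only so that the destructor runs at thread exit, and it is created
only if pthread_key_create can be found in the process.

```c
#define _GNU_SOURCE
#include <dlfcn.h>
#include <pthread.h>   /* only for the pthread_key_t type, no link dependency */
#include <stdlib.h>

/* Hypothetical variable-size error state. */
struct err_state {
    int   code;
    char *detail;      /* heap-allocated, may be NULL */
};

typedef int (*key_create_fn)(pthread_key_t *, void (*)(void *));
typedef int (*setspecific_fn)(pthread_key_t, const void *);

static __thread struct err_state *tls_err;  /* fast per-thread pointer */
static setspecific_fn p_setspecific;        /* resolved at run time */
static pthread_key_t  err_key;
static int            err_key_ready;        /* 1 if a destructor is armed */

/* Key destructor: called at thread exit with the thread's non-NULL value. */
static void err_state_free(void *p)
{
    struct err_state *st = p;

    free(st->detail);
    free(st);
}

/* Look up the pthread functions without linking against libpthread. */
static void err_init(void)
{
    key_create_fn p_key_create =
        (key_create_fn) dlsym(RTLD_DEFAULT, "pthread_key_create");

    p_setspecific =
        (setspecific_fn) dlsym(RTLD_DEFAULT, "pthread_setspecific");
    if (p_key_create != NULL && p_setspecific != NULL
        && p_key_create(&err_key, err_state_free) == 0)
        err_key_ready = 1;
}

/* Return the calling thread's state, allocating it on first use. */
static struct err_state *err_get(void)
{
    if (tls_err == NULL) {
        tls_err = calloc(1, sizeof *tls_err);
        /* Registering the pointer with the key arms the destructor. */
        if (tls_err != NULL && err_key_ready)
            p_setspecific(err_key, tls_err);
    }
    return tls_err;
}
```

If the lookup fails, the process is assumed to be single-threaded, and
the state of the only thread is left for process exit to reclaim.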

> I also noticed that pthread_setspecific destructors do not run for the
> main thread.  Is this a bug?
