This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.



Re: Evolution of ELF symbol management


On 11/18/2016 10:48 AM, Florian Weimer wrote:
> On 11/16/2016 04:55 PM, Zack Weinberg wrote:
> 
>> First, I don't have a problem with adding __libc_* aliases as
>> non-default but public alternative names, so that third-party libraries
>> can avoid namespace pollution in static linkage and/or choose to refer
>> to symbols that are more likely to get resolved to the C library's
>> export than to a colliding symbol in the main executable.  I'd reserve
>> that for cases where there is already a demonstrated need, though, such
>> as libstdc++.  I also don't have a problem with a hypothetical policy
>> that all new symbols that aren't in the current revisions of C+POSIX
>> should be given an impl-namespace name as their primary definition, with
>> the user-namespace name being a weak alias (but still the only name used
>> in the public headers).
> 
> If we don't declare the library-safe names in headers, how can libraries
> call them?

We could _declare_ the library-safe names in the headers, just not as
the primaries.  Like how string.h currently declares both bzero and __bzero.
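
For concreteness, here is a minimal sketch of the "impl-namespace primary,
user-namespace weak alias" arrangement. It assumes GCC or Clang on an ELF
target; `__mylib_frob` and `frob` are hypothetical names, not anything in
glibc.

```c
/* Hypothetical library source file.  Double-underscore names are reserved
   for the implementation, so only a C library (or code playing that role,
   as this sketch does) should define them.  */

/* Primary definition under the implementation-namespace name.  */
int
__mylib_frob (int x)
{
  return x * 2;
}

/* User-namespace name as a weak alias for the same code.  Both names
   resolve to the one function body above.  */
extern __typeof__ (__mylib_frob) frob
  __attribute__ ((weak, alias ("__mylib_frob")));
```

A library that wants to stay out of the user namespace would call
`__mylib_frob`; ordinary applications keep calling `frob`.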

Incidentally, it occurs to me that the user-namespace name must exist,
for the sake of people using dlsym(RTLD_DEFAULT, "whatever") to access
symbols that they anticipate existing in future revisions of libc
(relative to the one they used at link time).
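
The usage pattern I have in mind looks roughly like the sketch below. It
assumes glibc-style `RTLD_DEFAULT` (which needs `_GNU_SOURCE`) and, on
older glibc, linking with -ldl; the helper names are made up for
illustration.

```c
#define _GNU_SOURCE
#include <dlfcn.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>

/* Look a name up in the global scope at run time.  Returns NULL if no
   loaded object exports it.  */
static void *
lookup_in_global_scope (const char *name)
{
  return dlsym (RTLD_DEFAULT, name);
}

/* Call getrandom if the running libc exports it under its user-namespace
   name; otherwise fail with ENOSYS.  The program needs no link-time
   reference to getrandom.  */
static ssize_t
try_getrandom (void *buf, size_t len)
{
  typedef ssize_t (*getrandom_fn) (void *, size_t, unsigned int);
  getrandom_fn fn = (getrandom_fn) lookup_in_global_scope ("getrandom");

  if (fn == NULL)
    {
      errno = ENOSYS;
      return -1;
    }
  return fn (buf, len, 0);
}
```

If libc exported only a mangled name, the string passed to dlsym would
miss.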

> I also do not want to encourage application or library code to reference
> the implementation namespace at the source code level.  It's ugly, and I
> suspect it encourages implementation namespace pollution once
> programmers are used to it.

I don't like it either, but how else could a library's headers opt into
these special names on a per-symbol, per-use basis?

Come to think of it, to actually avoid polluting the user namespace, any
library that wants to use these will need a secondary set of libc
headers that declare _only_ the private names.  (This is especially
relevant for C++ with so much code in headers.)  If we don't do that,
the user-namespace libc prototype (which still exists under your plan)
might conflict with an unrelated application definition.

I don't like _that_, because now we have to maintain multiple copies of
the same set of prototypes with different names *in different headers*.
But again, I don't see an alternative in C.  Maybe we could get
'namespace', 'using', and 'extern "C++"' added to C as a GCC extension?
That would _help_.  In fact, that would solve all kinds of problems.
(But we'd have to give up the pretense that our headers work with
anything but GCC.)

>> I _do_ have problems with causing these symbols to be used by default.
>> My main concern is that I think it's tackling the problem in the wrong
>> place.  If we want to back away from the original principle that the
>> executable always wins, isn't an expanded version of -Bsymbolic mode
>> what's _really_ wanted?
> 
> We could cover a lot of ground if we had a new flag on versioned symbol
> definitions which tells the static linker to set a flag on the versioned
> symbol reference, and the dynamic linker would then use this flag to
> ignore unversioned symbols for binding symbols.

I was imagining a new annotation on _all_ undefined symbols in a shared
object, giving the soname of the object that they were satisfied by at
link time.  At load time, 'getrandom!libc.so.6' resolves to the
'getrandom' definition in libc.so.6, ignoring all other definitions of
the same name.  If there are symbol versions involved, only the versions
exported by libc.so.6 are considered.  For instance,
'getrandom!libc.so.6@GLIBC_2.25' cannot be satisfied by
'getrandom@GLIBC_2.25' exported by libmissing-syscalls.so.1.
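
As a toy model of that lookup rule (not real dynamic-linker code; the
struct layout and function name are invented for this sketch), the
resolver would skip every definition whose providing object does not
match the soname recorded in the reference:

```c
#include <stddef.h>
#include <string.h>

/* One exported definition, as seen by the hypothetical resolver.  */
struct definition
{
  const char *name;     /* symbol name, e.g. "getrandom" */
  const char *soname;   /* object that exports it */
  const char *version;  /* version string, or NULL if unversioned */
};

/* An undefined reference annotated with the soname that satisfied it at
   link time, e.g. getrandom!libc.so.6@GLIBC_2.25.  */
struct reference
{
  const char *name;
  const char *soname;
  const char *version;  /* NULL if the reference is unversioned */
};

/* Return the first definition (in search order) that matches the name,
   the recorded soname, and the version if one was requested.  */
static const struct definition *
resolve (const struct reference *ref,
         const struct definition *defs, size_t ndefs)
{
  for (size_t i = 0; i < ndefs; i++)
    {
      const struct definition *d = &defs[i];

      if (strcmp (d->name, ref->name) != 0)
        continue;
      if (strcmp (d->soname, ref->soname) != 0)
        continue;               /* right name, wrong library: ignored */
      if (ref->version != NULL
          && (d->version == NULL
              || strcmp (d->version, ref->version) != 0))
        continue;
      return d;
    }
  return NULL;
}
```

With libmissing-syscalls.so.1 earlier in the search order than libc.so.6,
this still picks the libc.so.6 definition, and returns NULL if libc.so.6
does not export the symbol at all.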

> (The static linker currently does not add the version of the interposed
> symbol when interposition happens at static link time.)

I don't understand this statement, and that makes me worry that you are
trying to solve a different problem from the one I thought you were
talking about -- a problem that I might not even know exists.  Can you
elaborate, please?  How can a shared object be interposed upon at static
link time?  Its own static link has already happened!

> *However*, this is hardly a complete solution.  It does not cover symbol
> references from public C++ headers because there is no static linker
> invocation that comes between application use of the symbol and the use
> from system headers.

See above.

> It also does not work for static libraries.  In those cases, we could
> perhaps do some post-processing to add the symbol versions to the .a
> files, but it would still need the interposition protection mentioned
> above *and* changes to how we build static libraries.

Yes, to apply my proposal to static libraries, a new linking step would
be needed, functionally the same as the one that already exists for
shared libraries, but with different effects -- either adding a new
archive member with a bunch of annotations, or rewriting all the .o
files with those annotations.

Frankly I don't care very much about static libraries; I'd be fine with
allowing them to continue to work as they do now (it is already the
case, for instance, that the executable can interpose on internal libc
symbols in a static link that are inaccessible to it in a shared link).

> The header files could contain the symbol version and instruct GCC to
> put it into the header file.  (Even today, it is possible to link
> against specific symbol versions without a custom DSO containing them as
> the default version, but please don't tell anyone.)  But this would
> still need the interposition protection.  I still think this is the most
> promising option, all things considered.
> 
> But then we need to step back and ask ourselves: If we have to put the
> versioning information in the header, why do we even need symbol
> versioning?  Why can't we version the interface through its name?
> 
> Hence my proposal.

You keep talking about symbol versioning but, again, I don't understand
how symbol versioning is relevant, and that makes me worry that you're
trying to solve a different problem that I don't even know about.  I
thought the issue here was controlling *which library* provides a
symbol, independent of whether the symbol has versions.

As for what appears in the headers -- again, see above: to solve the
libstdc++/_GNU_SOURCE problem, which I believe I *do* understand, there
needs to be a set of alternative names that libstdc++ headers can
*explicitly* refer to; otherwise we have still polluted the user
namespace.  REDIRECTed declarations do not solve that problem.  In fact,
with my proposed which-library annotations, we could perfectly well
redirect the mangled names to the normal names:

extern ssize_t __libc_getrandom (void *, size_t, unsigned int)
  __asm ("getrandom") __attribute__ ((__bind_to_library ("libc.so.6")));

#ifdef __USE_GNU
extern typeof (__libc_getrandom) getrandom;
#endif
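
Only the __bind_to_library attribute above is hypothetical; the __asm
redirection is existing GCC behavior. A minimal sketch of that half on
its own, assuming GCC or Clang on an ELF target (where C names get no
prefix) and using made-up names:

```c
/* The real, user-namespace definition.  */
int
real_name (int x)
{
  return x + 1;
}

/* A second C identifier bound to the same assembler-level symbol.  Calls
   through __mylib_alias are emitted as references to "real_name".  */
extern int __mylib_alias (int) __asm__ ("real_name");

int
call_via_alias (int x)
{
  return __mylib_alias (x);
}
```

The object file ends up with a single symbol, real_name; the source-level
spelling __mylib_alias never reaches the linker.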

----

All this aside, this discussion is still very brainstormy and that makes
me think that we should *not* yet be supplying mangled names for public
use.  Once we start doing that we are stuck with it forever, after all.
Contrariwise, we *can* always retrofit __libc_* aliases or whatever once
we know what we ought to be doing.

Similarly, this should not block new 2.25 features. (On that note,
cc:ing Siddhesh with their release manager hat on.)

zw

