A patch for default version and archive

Ulrich Drepper drepper@redhat.com
Mon Nov 13 22:53:00 GMT 2000


"H . J . Lu" <hjl@valinux.com> writes:

> Do you agree that foo@@ver1 will resolve both foo and foo@@ver1?

First, how can there be a reference foo@@ver1?  This is not possible
in the versioning model except for ld.so (of course).  It is possible
only through the unintended loophole where you assign versions to
undefined symbols.  But this is not intended to work.  A version
always has a DSO associated with it.  As I have explained on earlier
occasions, the version information is a tuple of the version name and
the DSO the definition lives in.

In this context, what is an undefined, versioned symbol supposed to
be?  The DSO part of the tuple is undefined, and the consequences of
this are not understood.
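
For reference, the only place foo@@ver1 legitimately shows up is on
the definition side, inside a DSO.  A minimal sketch (foo and ver1 are
the names from the quote above; the implementation name and the build
details are assumptions):

    /* libfoo.c -- foo_v1 is a hypothetical implementation name.
       The "@@" in the .symver directive marks foo@ver1 as the
       default version of foo.  The DSO has to be linked with a
       version script that declares ver1. */
    int foo_v1 (void) { return 1; }
    __asm__ (".symver foo_v1, foo@@ver1");

An application with a plain, unversioned reference to foo then binds
to this default version at link time.  What the loophole manufactures
instead is a versioned reference in a .o file, which is something else
entirely.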

Now, iff (note the two f's) we assume for a moment this were
desirable, why should it suddenly be possible to satisfy an explicit
versioned reference with a non-versioned symbol?  A counter-example is
easily constructed:

- you create a .o file with the .symver hack, referencing version ver2.
  The corresponding .so and .a have appropriate definitions (for ver1
  and ver2).

- you now link instead against an earlier version of the .a or DSO
  which has only one implementation and does not use versioning,
  therefore providing only the older (assume ver1) implementation.

With your modifications the linker would link this successfully even
though it is wrong (otherwise there would not have been a @ver2
reference in the first place).
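
To make the .symver hack from the first step concrete, a sketch (foo
and ver2 are the names from the example; everything else is assumed):

    /* consumer.c -- the directive makes every reference to foo
       in this file come out as a reference to foo@ver2 in the
       resulting .o. */
    __asm__ (".symver foo, foo@ver2");
    extern int foo (void);
    int use_foo (void) { return foo (); }

The .o now carries a reference to foo@ver2; with your change the
unversioned foo in the old library would satisfy it, and nobody
notices.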


> Anything can be misused. It doesn't mean we should leave them broken.
> You may not like how it is used. But it is besides the point.

There is nothing broken; that's the point.


The original versioning mechanism the Linux implementation is based on
is Sun's, which only does version verification (not multiple
versions).  We extended it in a way which preserves these properties
and additionally introduces handling of multiple versions of a
function in one file.  This already stretches the initial design, and
if you listen to some people who know a lot about ELF you'll see that
they are not immediately willing to adopt even this solution (you know
what I'm talking about).


Now with your changes you are dropping the benefits of having versions
enforced.  The whole concept as it stands works only for DSOs, since
DSOs (and applications) contain not only the version names themselves
but also references to the objects the definitions were found in.
This, together with the content of the DT_VERNEED section, verifies
that the correct objects are used and makes it possible to refuse
execution if these conditions are not met.  All of this is missing if
you allow undefined symbols to have versions.  There is no protection
anymore.
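
Conceptually, a properly versioned reference records the following (a
sketch only, not the actual ELF structures):

    /* What ld.so effectively checks at startup for every
       versioned reference in a binary: */
    struct version_need {
        const char *dso_name;  /* e.g. "libc.so.6", from DT_VERNEED */
        const char *version;   /* e.g. "ver2" */
    };
    /* Execution is refused unless the DSO named here actually
       carries a matching version definition. */

With a version attached to an undefined symbol in a .o file, the
dso_name half of the tuple simply does not exist, so there is nothing
left to check.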


What you do is something completely different.  You are trying to make
it possible to create binaries which appear to be linked against an
older version of a DSO.  The hack proposed for this *can* perhaps work
in some limited situations (yes, *now* I'm switching over to your
other proposal).  But it is not a general solution.  For instance, you
always have to use the old header files, since some structure you are
using might have changed (and was the reason for the version number
bump).  You cannot possibly use the current headers and be sure
everything is fine.

To resolve this you have two options:
1. add hacks à la
    #ifdef GLIBC_2_0_compatibility
     ...this definition
    #else
     ...that definition
    #endif
   to all the header files for all changes.  This will never happen
   to the glibc headers as long as I have a say in it.  It makes
   them unmaintainable.

2. The other option is to have a separate tree of old headers.  Simply
   keep a tree of the old headers around.  That's fine with me and
   works flawlessly, provided the compiler gets the correct options
   passed and no library is linked in which depends on libc internals
   and is not available in the duplicated tree with the old versions.

Now, following possibility 2 also means that there is not the
slightest problem with having a second libc.so containing only the old
definitions.
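
A sketch of what "the correct options" means in practice (the paths
are hypothetical, and a real setup needs more care than this):

    # the -I and -L directories are searched before the default
    # paths, so the old tree shadows the installed one
    gcc -I/opt/glibc-old/include -L/opt/glibc-old/lib \
        -o prog prog.c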


Thinking more about it, there is another reason not to use undefined
symbols with versions: there is no way to ensure that all the files
involved do it consistently.  Different .o files can reference
different versions.  The data structures passed between those .o files
can then have different formats.  The result is again silent failure.
This cannot happen with the clear model in which .o files (and
therefore .a files) cannot be reused in a new environment (e.g., after
a libc upgrade) if they contain any knowledge about the interface of
the upgraded library (e.g., libc).  It is no problem to keep .a/.o
files which interface with libc only through plain values such as file
descriptors.  But as soon as you use something like stat() you are out
of luck and have to provide a freshly compiled version.  Compatibility
of binaries and DSOs is not affected, as we know.
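
A sketch of the inconsistency (all names are hypothetical; the struct
stands in for any type which changed between the two versions):

    /* a.c -- an old .o which is reused; binds to ver1 */
    __asm__ (".symver get_info, get_info@ver1");
    struct info { int size; };              /* ver1 layout */
    extern void get_info (struct info *);

    /* b.c -- freshly compiled; binds to ver2 */
    __asm__ (".symver get_info, get_info@ver2");
    struct info { int size; long extra; };  /* ver2 layout */
    extern void get_info (struct info *);

Both .o files link without complaint, but a struct info filled in by
the code in a.c is too small for anything in b.c which expects the
ver2 layout.  Nothing at link or load time catches this.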


In passing I mentioned another reason not to use this hack except in
very special situations where everything is under control.  You would
have to do the same for all DSOs and especially all archives available
on the system (meaning you have to prepare the headers for those
libraries and add versioning to the DSOs).  If another DSO or .a
includes code which somehow depends on the libc implementation
(defines a structure of a given size, etc.), a program using this DSO
or .a together with .o files whose versioning information references
old versions can break.  And it breaks without the user being able to
test for it.


In summary: the versioning mechanisms as they are, with the rule
(which was published right at the beginning) that .o and .a files
cannot be reused unless verified, provide a safe programming
environment where it is generally possible to assume the binaries use
the correct interface versions (since new binaries always use the
latest).  There are some problems even here (passing references to old
data structures from a DSO to a caller which expects the new form) but
this is manageable.  At least we haven't seen problems yet.

By introducing versions for undefined symbols you remove all the
safety belts.  One can create binaries which use whatever interface
ever existed.  Maybe you cannot even use the interface correctly,
since the appropriate interface definitions (prototypes, data
structures) are not available anymore, or you cross-link different
versions.  This all leads to chaos.

The alternative is clear: provide a different tree with old headers
and old libraries.  Here I see room for improvement: there is not
really a convenient way to use different trees of libraries except by
having different gcc driver programs.  If you want to make this easier
I'd suggest working on this.

- have gcc/g++/f77/gcj understand an option -api whose argument is
  something like glibc-2.2 or whatever.

- via some configuration file this API version is mapped to a tree
  with include and lib directories (see the sketch after this list).

- this allows -api=glibc-2.1 to use the normal tree on a glibc 2.1
  system, and a special tree with the old headers and libraries on a
  glibc 2.2 system.

- people who know they will have to recompile and relink with the old
  API even after an update can use this -api parameter right away
  (that's the key!), not only when it's too late and the new API is
  already there.

- ld, by default, should probably not look beyond the lib dir in the
  subtree.  This should be selectable with an option as well.

- there are probably some more rules which will turn up once somebody
  implements something like this.
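
To illustrate the mapping mentioned above, the configuration file
might look roughly like this (an entirely hypothetical format; nothing
of the sort exists today):

    # /etc/gcc-api.conf (hypothetical)
    # API name     tree root (include/ and lib/ live below it)
    glibc-2.1      /opt/glibc-2.1
    glibc-2.2      /usr

gcc -api=glibc-2.1 would then put /opt/glibc-2.1/include and
/opt/glibc-2.1/lib ahead of the default search paths.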

By implementing something like this you achieve the same goal without
completely sacrificing the current model, which works fine.


Introducing handling of versions in archives etc. is the first step in
the wrong direction, and therefore I cannot agree.

-- 
---------------.                          ,-.   1325 Chesapeake Terrace
Ulrich Drepper  \    ,-------------------'   \  Sunnyvale, CA 94089 USA
Red Hat          `--' drepper at redhat.com   `------------------------

