ELF linking question related to symbol collisions

Florian Weimer fweimer@redhat.com
Wed Dec 18 14:47:00 GMT 2013


On 11/21/2013 10:14 PM, Carlos O'Donell wrote:

> No, that's not the way it works. You must manage the global namespace
> or collisions will lead to incorrect runtime behaviour.

Oh well, thanks for confirming my suspicions.

>> So we have several desktop applications that have an ambiguous
>> reference to json_object_get_type, via the pulseaudio library.
>
> Yes, and that is a serious problem.

It turns out that there are further collisions between json-c and
jansson, on the json_object_get and json_object_iter_next symbols, but we
have no binaries in Fedora that trigger them (based on the static
DT_NEEDED information).
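
For illustration, a check based purely on static DT_NEEDED data could look
roughly like the sketch below.  The sonames, the dependency graph, and the
helper names (needed_closure, find_collisions) are made up for the example;
only the two colliding symbol names come from the observation above.

```python
from collections import defaultdict


def needed_closure(root, needed):
    """Return every object reachable from root via DT_NEEDED, breadth-first."""
    order, seen, queue = [], {root}, [root]
    while queue:
        obj = queue.pop(0)
        order.append(obj)
        for dep in needed.get(obj, []):
            if dep not in seen:
                seen.add(dep)
                queue.append(dep)
    return order


def find_collisions(root, needed, exports):
    """Map each symbol defined by more than one object in the closure
    to the list of objects that define it."""
    definers = defaultdict(list)
    for obj in needed_closure(root, needed):
        for sym in exports.get(obj, ()):
            definers[sym].append(obj)
    return {sym: objs for sym, objs in definers.items() if len(objs) > 1}


# Hypothetical input data: DT_NEEDED entries and exported symbols per object.
needed = {
    "app": ["libpulse.so.0", "libjansson.so.4"],
    "libpulse.so.0": ["libjson-c.so.2"],
}
exports = {
    "libjson-c.so.2": {"json_object_get", "json_object_iter_next",
                       "json_object_put"},
    "libjansson.so.4": {"json_object_get", "json_object_iter_next",
                        "json_loads"},
}

collisions = find_collisions("app", needed, exports)
for sym in sorted(collisions):
    print(sym, "defined by", ", ".join(collisions[sym]))
```

This only sees what DT_NEEDED records, so anything pulled in through dlopen
stays invisible to it.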

> The problem is mainly that they fall afoul of the ELF rules for
> interposition.

Right.  But apart from the occasional LD_PRELOAD and the language
interpreter optimization, I don't think interposition is actually used.
(Fedora doesn't use that optimization: you compile your main
Python/Perl/etc. interpreter binary with a statically linked non-PIC copy
of the actual interpreter implementation, exported dynamically, and your
extension modules link against that instead of the interpreter DSO.)

Or more to the point, I'm not sure if the current linking algorithm is 
what we need.  Something like -B direct might be a better fit for our 
current needs.
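
To make the difference concrete, here is a toy model of the two lookup
strategies.  It is a sketch under assumptions, not a description of any real
dynamic linker: the search order, the sonames, and the fallback to global
lookup in resolve_direct are all invented for the example.

```python
def resolve_global(symbol, search_order, exports):
    """Default-style lookup: the first object in the global search
    order that defines the symbol wins, whoever is asking."""
    for obj in search_order:
        if symbol in exports.get(obj, ()):
            return obj
    return None


def resolve_direct(symbol, recorded_provider, search_order, exports):
    """Direct-binding-style lookup: prefer the provider recorded at
    link time; fall back to the global search (an assumption of this
    sketch) if that provider no longer defines the symbol."""
    if recorded_provider is not None and \
            symbol in exports.get(recorded_provider, ()):
        return recorded_provider
    return resolve_global(symbol, search_order, exports)


# Hypothetical process image: jansson happens to be loaded before json-c.
search_order = ["app", "libjansson.so.4", "libpulse.so.0", "libjson-c.so.2"]
exports = {
    "libjansson.so.4": {"json_object_get"},
    "libjson-c.so.2": {"json_object_get"},
}

# A reference from the (hypothetical) pulseaudio library, which was
# linked against json-c.
global_choice = resolve_global("json_object_get", search_order, exports)
direct_choice = resolve_direct("json_object_get", "libjson-c.so.2",
                               search_order, exports)
print("global lookup binds to:", global_choice)
print("direct binding binds to:", direct_choice)
```

In this model the global lookup hands the reference to whichever library
comes first in the search order, while the direct variant keeps it with the
library it was linked against.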

>> The trouble with this is that's fairly difficult to detect. Static
>> analysis misses collisions introduced by dlopen and dlsym.
>
> Right, in this case you need a special-purpose analysis tool to
> catch this, something that models the dynamic linker and ELF.

I have a pretty good approximation, but exclusively based on data that 
is statically available (and some heuristics to avoid implementing rpath 
resolution or /etc/ld.so.conf.d/ handling).  The amount of data is 
fairly large, so I haven't tried to detect actual symbol interposition.

> Symbol collisions are only bad if both symbols do not implement the
> same ABI and API. If they do implement the same ABI and API then it's
> a replacement function that is safe to interpose.

Unless that function references static symbols; then it may or may not
be safe.  If everything ends up being interposed, it is okay, but if only
some functions are, code from the two implementations might hit
different static symbols.

> Symbol versioning does not solve the problem in general either since
> then you need a global version name management, and you need to fix
> all applications to use versioning which is a huge amount of work.
> Even then you can still have problems if the projects lack the rigour
> required to update their version maps.

I had hoped it would be possible to use a static map with a single 
version-like string, like JSON-C, JSON-GLIB, and JANSSON.  Maybe we 
could even use the soname for this by default.

This is easier than renaming everything under a shared prefix, and would 
not affect backward compatibility.  Perhaps I'm a bit naïve, but I 
suspect this could be used to implement part of the linking semantics of 
-B direct and reap some of the performance benefits because it is easier 
to locate the correct hash table.
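
A static map of that kind is tiny.  The sketch below generates a GNU ld
version script with a single version node; the function name version_script
is made up, and using "*" to tag every exported symbol is one possible
reading of the idea, not something that has been tried on these libraries.

```python
def version_script(node, patterns=("*",)):
    """Return the text of a GNU ld version script that places all
    symbols matching the given patterns under one version node."""
    lines = [f"{node} {{", "  global:"]
    lines.extend(f"    {pattern};" for pattern in patterns)
    lines.append("};")
    return "\n".join(lines) + "\n"


# One node per library, named after the library as suggested above.
for node in ("JSON-C", "JSON-GLIB", "JANSSON"):
    print(version_script(node))
```

The output would be passed to the link step of the library with something
like -Wl,--version-script=FILE.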

> A different linking algorithm isn't helpful either because at static
> link time you don't need the devel libraries for all dependent libraries,
> and requiring it would make compiling anything much more complicated.

Sure, but I think static linking has no future.  If it reappears, it 
will be in the form of LTO object files.

> The only robust solution I see is a post-build tool that looks for
> global namespace collisions and rejects the build if they exist.

That's difficult to do because anything that has to look at more than
one piece of software at a time, rather than each in isolation, has a
high overhead, both performance-wise and administratively.

If we decide to implement namespace management outside of the toolchain, 
I think we should have a list of symbols and symbol prefixes that maps each
to the library (soname, and then OS package) that defines it.  If a library
(or dynamic executable) uses a symbol outside that list, we'd either 
have to fix the list or address the unintentional out-of-namespace 
symbol leakage.  Both measures can be taken even before any actual 
collision materializes, and it's only
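
As a rough sketch of such a list and the check against it: the entries
below are invented, every entry is treated as a prefix for simplicity, and
"uses a symbol outside that list" is read here as "defines a symbol the
list does not assign to it".

```python
def owner(symbol, namespace):
    """Return the soname that owns the symbol according to the longest
    matching prefix in the namespace list, or None if nothing matches."""
    best = None
    for prefix, soname in namespace.items():
        if symbol.startswith(prefix) and \
                (best is None or len(prefix) > len(best[0])):
            best = (prefix, soname)
    return best[1] if best else None


def leaked_symbols(soname, defined, namespace):
    """Return the symbols defined by soname that the list assigns to a
    different library, or to no library at all."""
    return sorted(s for s in defined if owner(s, namespace) != soname)


# Hypothetical namespace list: symbol prefix -> defining soname.
namespace = {
    "json_object_": "libjson-c.so.2",
    "json_": "libjansson.so.4",
}

leaks = leaked_symbols(
    "libjansson.so.4",
    {"json_loads", "json_object_get", "helper_init"},
    namespace,
)
print(leaks)
```

Each reported symbol then needs one of the two fixes: a new entry in the
list, or a change in the library so that the symbol is no longer exported.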

> The
> workaround might be to register your allowed symbol interpositions in
> the spec file such that the post-build tool can use those to resolve
> such allowances. Note that just stating that symbol X may be interposed
> is not sufficient to make this system safe, you must say symbol X from
> SONAME Y may interpose.

I think it should be outside the SPEC file because of the syntax issues 
(no good parsers, parsing requires arbitrary code execution etc.).

> What's your next step?

If there's a strong desire to implement -B direct, that's the way to go, 
but it's a bit out of my area of expertise.  If there isn't, I'll look 
into writing an out-of-band namespace model for the upcoming Fedora Base 
package set.

-- 
Florian Weimer / Red Hat Product Security Team



More information about the Libc-help mailing list