Bug 25486 - RFE: Do not rely on symbol interposition from libc.so into ld.so
Summary: RFE: Do not rely on symbol interposition from libc.so into ld.so
Status: ASSIGNED
Alias: None
Product: glibc
Classification: Unclassified
Component: dynamic-link (show other bugs)
Version: unspecified
: P2 enhancement
Target Milestone: ---
Assignee: Florian Weimer
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2020-01-31 05:59 UTC by Lorinczy Zsigmond
Modified: 2021-04-26 06:25 UTC (History)
3 users (show)

See Also:
Host:
Target:
Build:
Last reconfirmed: 2020-02-01 00:00:00
fweimer: security-


Attachments
demo module for reproduction (C source) (575 bytes, text/C-source)
2020-01-31 12:39 UTC, Lorinczy Zsigmond
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Lorinczy Zsigmond 2020-01-31 05:59:19 UTC
Hi, I think I found a situation where 'ld-linux-x86-64.so.2' exports its 'calloc' implementation (dl-minimal.c) but import 'malloc' from 'libc.so.6'.

Now 'ld-linux:malloc' returns the memory zeroed, but 'libc:malloc' returns the memory dirty.

The result is a 'calloc' implementation which returns dirty memory.

The problem is described in details on these places:
https://stackoverflow.com/questions/59956996/problem-loading-a-library-with-ffi-in-php-7-4
https://www.linuxquestions.org/questions/programming-9/debugging-dlopen-4175668676/#post6084584
and also here the same issue in a different context:
https://stackoverflow.com/questions/39760479/why-does-calling-calloc-in-gdb-not-appear-to-zero-out-the-memory

Details:
User wants to load an external plugin into PHP via libffi. The plugin is linked with wrong dependency list (here: https://stackoverflow.com/questions/59956996/problem-loading-a-library-with-ffi-in-php-7-4#59994038 ). Also the plugin depends on glib-2.0.so.

All these components stacked together result in glibs.so-2.0's constructor calling ld-linux:calloc (which in turn calls libc:malloc) gets back dirty (not zeroed) memory.

Running it with DL_DEBUG=all shows the event nicely (added line-number, removed pid):

173486       relocation processing: /lib64/ld-linux-x86-64.so.2
173592       binding file /lib64/ld-linux-x86-64.so.2 [0] to /lib/x86_64-linux-gnu/libc.so.6 [0]: normal symbol `malloc' [GLIBC_2.2.5]
173633       binding file /lib64/ld-linux-x86-64.so.2 [0] to /lib/x86_64-linux-gnu/libc.so.6 [0]: normal symbol `calloc' [GLIBC_2.2.5]
...
184669       relocation processing: /usr/local/lib64/libglib-2.0.so.0 (lazy)
185310       symbol=calloc;  lookup in file=./libsomething.so [0]
185311       symbol=calloc;  lookup in file=/lib64/ld-linux-x86-64.so.2 [0]
185312       binding file /usr/local/lib64/libglib-2.0.so.0 [0] to /lib64/ld-linux-x86-64.so.2 [0]: normal symbol `calloc' [GLIBC_2.2.5]
...
210561       calling init: /usr/local/lib64/libglib-2.0.so.0
210614       symbol=calloc;  lookup in file=./libsomething.so [0]
210615       symbol=calloc;  lookup in file=/lib64/ld-linux-x86-64.so.2 [0]
210616       binding file /usr/local/lib64/libglib-2.0.so.0 [0] to /lib64/ld-linux-x86-64.so.2 [0]: normal symbol `calloc' [GLIBC_2.2.5]

Suggested solution: ld-linux-x86-64.so shouldn't export symbols that exist in libc.so.6 (okay, they are 'Weak symbols' but that didn't prevent this problem); but if it does, its 'calloc' implementation call do zero the memory it returns.
Comment 1 Florian Weimer 2020-01-31 10:45:54 UTC
You will have to adjust the link order. We can change this in the dynamic loader (and it may even be a useful change because it makes setting breakpoints on the real malloc easier), but I doubt that distributions would backport this, the current behavior will remain for the foreseeable future.
Comment 2 Lorinczy Zsigmond 2020-01-31 12:39:10 UTC
Created attachment 12241 [details]
demo module for reproduction (C source)

Compilation:
    gcc -I/usr/local/include/glib-2.0 -o demodule.so -shared \
	demodule.c -fPIC -DPIC /lib64/ld-linux-x86-64.so.2 -lglib-2.0

Call-module in PHP:
#!/usr/local/bin/php
<?php
    $ffi = FFI::cdef('', './demodule.so');
?>

Test#1
$ php demodule.php
demodule.init: calloc is at 0x7f6351b80b40 returned 0x2532df0 first 8 bytes: 7f634f22d698

Test#2:
$ cat demodule.php | php
demodule.init: calloc is at 0x7f6b6369eb40 returned 0x32b1f00 first 8 bytes: 0

It show that it is undeterminisctic: in the second test the memory happens to be zeroed, but that's only accidental not warranted.
Comment 3 Lorinczy Zsigmond 2020-01-31 12:47:03 UTC
(In reply to Florian Weimer from comment #1)
Thank you for your quick reply -- I don't really like to be a proxy, but the original problem-owner has just reported that his shared object has been linked with Free Pascal, and he doesn't really know what should be changed.
Comment 4 Lorinczy Zsigmond 2020-01-31 15:14:28 UTC
On the other hand, if you don't change this in the dynamic linker, the current behavior will remain for good. Please examine the demo program I attached, and if you find it believable, please change the status to VERIFIED.
Comment 5 Carlos O'Donell 2020-01-31 18:24:09 UTC
(In reply to Lorinczy Zsigmond from comment #2)
>     gcc -I/usr/local/include/glib-2.0 -o demodule.so -shared \
> 	demodule.c -fPIC -DPIC /lib64/ld-linux-x86-64.so.2 -lglib-2.0
                               ^^^^^^^^^^^^^^^^^^^^^^^^^^^
This usage of linking against the system ld.so is not correct. It's not clear what the purpose of this is, or why it's there.

(In reply to Lorinczy Zsigmond from comment #4)
> On the other hand, if you don't change this in the dynamic linker, the
> current behavior will remain for good. Please examine the demo program I
> attached, and if you find it believable, please change the status to
> VERIFIED.

It is a link order issue.

You must link with libc.so first. 

If this is an issue with the Free Pascal compiler then please file a bug there.

The dynamic loader is capable of exporting any interfaces it needs for the implementation, and it should always be linked to after the C library implementation. There is nothing wrong with the loader exporting this global symbols in the dynamic symbol table.

As Florian notes it would be more robust to remove the symbols from the dynamic loader, but that's not something we're going to do right now.

If you feel strongly about this robustness issue then we can rework this ticket to be an RFE for ld.so that includes removing interfaces to avoid accidental linkage. However, such an issue is very low priority and unlikley to get fixed unless somone has the explicit interest in fixing it.

I appreciate the bug report, and posting here is the right way to get feedback on bugs like this. Thank you for the report.
Comment 6 Lorinczy Zsigmond 2020-02-01 06:17:44 UTC
(In reply to Carlos O'Donell from comment #5)

Hi, thank you for your reply.

I agree, the concrete problem wouldn't have occurred if it weren't for the wrong linkage.

Also, I agree, the concrete problem has to be solved via re-linking the shared object in question.

Also, I agree, quick action on glibc-side is neither needed or possible.

Nonetheless, this problem (the original problem, not my demo) is occurred in the wild, it wasn't fabricated to make ld-linux look bad.

Also google-search for +"dl-minimal.c" +"inconsitency detected" suggests it was not the first time.

Also I don't see in the documentation "don't explicitly link shared objects against ld-linux-so or else bad things might happen (under some rare conditions)".

So I think this 'calloc' in 'ld-linux' is a weak point which should be eventually removed (e.g. made it unexported).

> If you feel strongly about this robustness issue
> then we can rework this ticket to be an RFE for ld.so
> that includes removing interfaces to avoid accidental linkage.
> However, such an issue is very low priority and unlikely to get fixed
> unless somone has the explicit interest in fixing it.

That would be okay with me, thank you.

Regards: Lorinczy Zsigmond
Comment 7 Carlos O'Donell 2020-02-01 06:39:42 UTC
Reopening as RFE to cleanup ld.so exported malloc API symbols.
Comment 8 Florian Weimer 2020-02-08 19:19:12 UTC
Patches posted: https://www.sourceware.org/ml/libc-alpha/2020-02/msg00176.html
Comment 9 Sourceware Commits 2020-02-15 10:05:08 UTC
The master branch has been updated by Florian Weimer <fw@sourceware.org>:

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=3a0ecccb599a6b1ad4b149dc569c0080e92d057b

commit 3a0ecccb599a6b1ad4b149dc569c0080e92d057b
Author: Florian Weimer <fweimer@redhat.com>
Date:   Sat Feb 8 19:58:43 2020 +0100

    ld.so: Do not export free/calloc/malloc/realloc functions [BZ #25486]
    
    Exporting functions and relying on symbol interposition from libc.so
    makes the choice of implementation dependent on DT_NEEDED order, which
    is not what some compiler drivers expect.
    
    This commit replaces one magic mechanism (symbol interposition) with
    another one (preprocessor-/compiler-based redirection).  This makes
    the hand-over from the minimal malloc to the full malloc more
    explicit.
    
    Removing the ABI symbols is backwards-compatible because libc.so is
    always in scope, and the dynamic loader will find the malloc-related
    symbols there since commit f0b2132b35248c1f4a80f62a2c38cddcc802aa8c
    ("ld.so: Support moving versioned symbols between sonames
    [BZ #24741]").
    
    Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Comment 10 Joost van der Sluis 2020-02-20 15:39:37 UTC
I still have a question, though. 

Most applications compiled by fpc (the Free Pascal Compiler) do not depend on libc on Linux targets.

But long time ago we discovered that when an application links to a library, it also has to link to libc or ld-linux. If you don't do this, it may lead to vague problems.

That is why fpc-applications link with ld-linux. The statement that you never have to do this is simply not true.

What do you suggest that we do, now?

What we could do is: link with ld-linux only when libc is not used. This will probably solve the issue. But what would be the 'correct' solution, from your point of view?
Comment 11 Lorinczy Zsigmond 2020-02-20 16:09:47 UTC
Hi, I looked at fpc-SVN where these lines have been added; I think it means the shared object created by fpc has to depend on _some_ other shared object (doesn't matter what shared object), to prevent some problem during finalization. So it doesn't actually imports anything from ld-linux.so, it just needs a dependendent share-object.

-------------------------------------------------------------------------------
r19036 | tom_at_work | 2011-09-08 23:17:35 +0200 (Thu, 08 Sep 2011) | 27 lines

Fix shared library loading and unloading for Linux platforms.
Shared library initialization and finalization are now called correctly
at program startup for compile-time linked dynamic libraries on
powerpc-/powerpc64-/arm-/i386- and x86_64-linux.

Every startup code must now provide an additional entry point called "_dynamic_start"
that is set as new the entry point if the program links to a Pascal shared library.
Its purpose is to set up an exit hook usually passed via a register,
which should be called during program finalization if non-nil.

We use this additional entry point because this register only has meaningful content
when there are any compile-time linked shared libraries, otherwise it often contains random garbage.
The difference between the _dynamic_start and the original code is minimal;
actually in all implementations the _dynamic_start code passes on control to the old startup code,
so we use an additional entry point instead of an additional startup file.
Comment 12 Florian Weimer 2020-02-21 12:08:32 UTC
I don't know enough about the fpc toolchain. Does it use binutils ld and the gcc compiler driver? Or does it implement its own compiler driver and linker? Does fpc link with the startup files?

In general, there will be compatibility issues if you do not link with the startup files, and quote for r19036 suggests that this is what you are working around. But I suspect that fpc over-links against ld.so for no good reason. You will need a DT_NEEDED entry for libc.so.6 for correct ordering of initializers, but the DT_NEEDED entry for the dynamic linker is not required for this (and actually counterproductive for currently released glibc versions).
Comment 13 Lorinczy Zsigmond 2020-02-24 13:46:18 UTC
Well, fpc doesn't use gcc nor glibc (unless `USES initc` is used, which adds libso.6 as dependency);  but it does use the platform-linker.

It can create executables (when the source begins with PROGRAM), and shared objects (when the source begins with LIBRARY); the problem occurs when linking shared objects: an unnecessary dependecy to loader-linker is added.
Comment 14 Florian Weimer 2020-02-24 14:18:56 UTC
This patch is also needed: https://www.sourceware.org/ml/libc-alpha/2020-02/msg00722.html
Comment 15 Sourceware Commits 2020-02-26 15:54:53 UTC
The master branch has been updated by Florian Weimer <fw@sourceware.org>:

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=758599bc9dcc5764e862bd9e1613c5d1e6efc5d3

commit 758599bc9dcc5764e862bd9e1613c5d1e6efc5d3
Author: Florian Weimer <fweimer@redhat.com>
Date:   Wed Feb 26 15:58:23 2020 +0100

    elf: Apply attribute_relro to pointers in elf/dl-minimal.c
    
    The present code leaves the function pointers unprotected, but moves
    some of the static functions into .data.rel.ro instead.  This causes
    the linker to produce an allocatable, executable, writable section
    and eventually an RWX load segment.  Not only do we really do not
    want that, it also breaks valgrind because valgrind does not load
    debuginfo from the mmap interceptor if all it sees are RX and RWX
    mappings.
    
    Fixes commit 3a0ecccb599a6b1ad4b149dc569c0080e92d057b ("ld.so: Do not
    export free/calloc/malloc/realloc functions [BZ #25486]").