Hi, I think I found a situation where 'ld-linux-x86-64.so.2' exports its 'calloc' implementation (dl-minimal.c) but import 'malloc' from 'libc.so.6'. Now 'ld-linux:malloc' returns the memory zeroed, but 'libc:malloc' returns the memory dirty. The result is a 'calloc' implementation which returns dirty memory. The problem is described in details on these places: https://stackoverflow.com/questions/59956996/problem-loading-a-library-with-ffi-in-php-7-4 https://www.linuxquestions.org/questions/programming-9/debugging-dlopen-4175668676/#post6084584 and also here the same issue in a different context: https://stackoverflow.com/questions/39760479/why-does-calling-calloc-in-gdb-not-appear-to-zero-out-the-memory Details: User wants to load an external plugin into PHP via libffi. The plugin is linked with wrong dependency list (here: https://stackoverflow.com/questions/59956996/problem-loading-a-library-with-ffi-in-php-7-4#59994038 ). Also the plugin depends on glib-2.0.so. All these components stacked together result in glibs.so-2.0's constructor calling ld-linux:calloc (which in turn calls libc:malloc) gets back dirty (not zeroed) memory. Running it with DL_DEBUG=all shows the event nicely (added line-number, removed pid): 173486 relocation processing: /lib64/ld-linux-x86-64.so.2 173592 binding file /lib64/ld-linux-x86-64.so.2 [0] to /lib/x86_64-linux-gnu/libc.so.6 [0]: normal symbol `malloc' [GLIBC_2.2.5] 173633 binding file /lib64/ld-linux-x86-64.so.2 [0] to /lib/x86_64-linux-gnu/libc.so.6 [0]: normal symbol `calloc' [GLIBC_2.2.5] ... 184669 relocation processing: /usr/local/lib64/libglib-2.0.so.0 (lazy) 185310 symbol=calloc; lookup in file=./libsomething.so [0] 185311 symbol=calloc; lookup in file=/lib64/ld-linux-x86-64.so.2 [0] 185312 binding file /usr/local/lib64/libglib-2.0.so.0 [0] to /lib64/ld-linux-x86-64.so.2 [0]: normal symbol `calloc' [GLIBC_2.2.5] ... 210561 calling init: /usr/local/lib64/libglib-2.0.so.0 210614 symbol=calloc; lookup in file=./libsomething.so [0] 210615 symbol=calloc; lookup in file=/lib64/ld-linux-x86-64.so.2 [0] 210616 binding file /usr/local/lib64/libglib-2.0.so.0 [0] to /lib64/ld-linux-x86-64.so.2 [0]: normal symbol `calloc' [GLIBC_2.2.5] Suggested solution: ld-linux-x86-64.so shouldn't export symbols that exist in libc.so.6 (okay, they are 'Weak symbols' but that didn't prevent this problem); but if it does, its 'calloc' implementation call do zero the memory it returns.
You will have to adjust the link order. We can change this in the dynamic loader (and it may even be a useful change because it makes setting breakpoints on the real malloc easier), but I doubt that distributions would backport this, the current behavior will remain for the foreseeable future.
Created attachment 12241 [details] demo module for reproduction (C source) Compilation: gcc -I/usr/local/include/glib-2.0 -o demodule.so -shared \ demodule.c -fPIC -DPIC /lib64/ld-linux-x86-64.so.2 -lglib-2.0 Call-module in PHP: #!/usr/local/bin/php <?php $ffi = FFI::cdef('', './demodule.so'); ?> Test#1 $ php demodule.php demodule.init: calloc is at 0x7f6351b80b40 returned 0x2532df0 first 8 bytes: 7f634f22d698 Test#2: $ cat demodule.php | php demodule.init: calloc is at 0x7f6b6369eb40 returned 0x32b1f00 first 8 bytes: 0 It show that it is undeterminisctic: in the second test the memory happens to be zeroed, but that's only accidental not warranted.
(In reply to Florian Weimer from comment #1) Thank you for your quick reply -- I don't really like to be a proxy, but the original problem-owner has just reported that his shared object has been linked with Free Pascal, and he doesn't really know what should be changed.
On the other hand, if you don't change this in the dynamic linker, the current behavior will remain for good. Please examine the demo program I attached, and if you find it believable, please change the status to VERIFIED.
(In reply to Lorinczy Zsigmond from comment #2) > gcc -I/usr/local/include/glib-2.0 -o demodule.so -shared \ > demodule.c -fPIC -DPIC /lib64/ld-linux-x86-64.so.2 -lglib-2.0 ^^^^^^^^^^^^^^^^^^^^^^^^^^^ This usage of linking against the system ld.so is not correct. It's not clear what the purpose of this is, or why it's there. (In reply to Lorinczy Zsigmond from comment #4) > On the other hand, if you don't change this in the dynamic linker, the > current behavior will remain for good. Please examine the demo program I > attached, and if you find it believable, please change the status to > VERIFIED. It is a link order issue. You must link with libc.so first. If this is an issue with the Free Pascal compiler then please file a bug there. The dynamic loader is capable of exporting any interfaces it needs for the implementation, and it should always be linked to after the C library implementation. There is nothing wrong with the loader exporting this global symbols in the dynamic symbol table. As Florian notes it would be more robust to remove the symbols from the dynamic loader, but that's not something we're going to do right now. If you feel strongly about this robustness issue then we can rework this ticket to be an RFE for ld.so that includes removing interfaces to avoid accidental linkage. However, such an issue is very low priority and unlikley to get fixed unless somone has the explicit interest in fixing it. I appreciate the bug report, and posting here is the right way to get feedback on bugs like this. Thank you for the report.
(In reply to Carlos O'Donell from comment #5) Hi, thank you for your reply. I agree, the concrete problem wouldn't have occurred if it weren't for the wrong linkage. Also, I agree, the concrete problem has to be solved via re-linking the shared object in question. Also, I agree, quick action on glibc-side is neither needed or possible. Nonetheless, this problem (the original problem, not my demo) is occurred in the wild, it wasn't fabricated to make ld-linux look bad. Also google-search for +"dl-minimal.c" +"inconsitency detected" suggests it was not the first time. Also I don't see in the documentation "don't explicitly link shared objects against ld-linux-so or else bad things might happen (under some rare conditions)". So I think this 'calloc' in 'ld-linux' is a weak point which should be eventually removed (e.g. made it unexported). > If you feel strongly about this robustness issue > then we can rework this ticket to be an RFE for ld.so > that includes removing interfaces to avoid accidental linkage. > However, such an issue is very low priority and unlikely to get fixed > unless somone has the explicit interest in fixing it. That would be okay with me, thank you. Regards: Lorinczy Zsigmond
Reopening as RFE to cleanup ld.so exported malloc API symbols.
Patches posted: https://www.sourceware.org/ml/libc-alpha/2020-02/msg00176.html
The master branch has been updated by Florian Weimer <fw@sourceware.org>: https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=3a0ecccb599a6b1ad4b149dc569c0080e92d057b commit 3a0ecccb599a6b1ad4b149dc569c0080e92d057b Author: Florian Weimer <fweimer@redhat.com> Date: Sat Feb 8 19:58:43 2020 +0100 ld.so: Do not export free/calloc/malloc/realloc functions [BZ #25486] Exporting functions and relying on symbol interposition from libc.so makes the choice of implementation dependent on DT_NEEDED order, which is not what some compiler drivers expect. This commit replaces one magic mechanism (symbol interposition) with another one (preprocessor-/compiler-based redirection). This makes the hand-over from the minimal malloc to the full malloc more explicit. Removing the ABI symbols is backwards-compatible because libc.so is always in scope, and the dynamic loader will find the malloc-related symbols there since commit f0b2132b35248c1f4a80f62a2c38cddcc802aa8c ("ld.so: Support moving versioned symbols between sonames [BZ #24741]"). Reviewed-by: Carlos O'Donell <carlos@redhat.com>
I still have a question, though. Most applications compiled by fpc (the Free Pascal Compiler) do not depend on libc on Linux targets. But long time ago we discovered that when an application links to a library, it also has to link to libc or ld-linux. If you don't do this, it may lead to vague problems. That is why fpc-applications link with ld-linux. The statement that you never have to do this is simply not true. What do you suggest that we do, now? What we could do is: link with ld-linux only when libc is not used. This will probably solve the issue. But what would be the 'correct' solution, from your point of view?
Hi, I looked at fpc-SVN where these lines have been added; I think it means the shared object created by fpc has to depend on _some_ other shared object (doesn't matter what shared object), to prevent some problem during finalization. So it doesn't actually imports anything from ld-linux.so, it just needs a dependendent share-object. ------------------------------------------------------------------------------- r19036 | tom_at_work | 2011-09-08 23:17:35 +0200 (Thu, 08 Sep 2011) | 27 lines Fix shared library loading and unloading for Linux platforms. Shared library initialization and finalization are now called correctly at program startup for compile-time linked dynamic libraries on powerpc-/powerpc64-/arm-/i386- and x86_64-linux. Every startup code must now provide an additional entry point called "_dynamic_start" that is set as new the entry point if the program links to a Pascal shared library. Its purpose is to set up an exit hook usually passed via a register, which should be called during program finalization if non-nil. We use this additional entry point because this register only has meaningful content when there are any compile-time linked shared libraries, otherwise it often contains random garbage. The difference between the _dynamic_start and the original code is minimal; actually in all implementations the _dynamic_start code passes on control to the old startup code, so we use an additional entry point instead of an additional startup file.
I don't know enough about the fpc toolchain. Does it use binutils ld and the gcc compiler driver? Or does it implement its own compiler driver and linker? Does fpc link with the startup files? In general, there will be compatibility issues if you do not link with the startup files, and quote for r19036 suggests that this is what you are working around. But I suspect that fpc over-links against ld.so for no good reason. You will need a DT_NEEDED entry for libc.so.6 for correct ordering of initializers, but the DT_NEEDED entry for the dynamic linker is not required for this (and actually counterproductive for currently released glibc versions).
Well, fpc doesn't use gcc nor glibc (unless `USES initc` is used, which adds libso.6 as dependency); but it does use the platform-linker. It can create executables (when the source begins with PROGRAM), and shared objects (when the source begins with LIBRARY); the problem occurs when linking shared objects: an unnecessary dependecy to loader-linker is added.
This patch is also needed: https://www.sourceware.org/ml/libc-alpha/2020-02/msg00722.html
The master branch has been updated by Florian Weimer <fw@sourceware.org>: https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=758599bc9dcc5764e862bd9e1613c5d1e6efc5d3 commit 758599bc9dcc5764e862bd9e1613c5d1e6efc5d3 Author: Florian Weimer <fweimer@redhat.com> Date: Wed Feb 26 15:58:23 2020 +0100 elf: Apply attribute_relro to pointers in elf/dl-minimal.c The present code leaves the function pointers unprotected, but moves some of the static functions into .data.rel.ro instead. This causes the linker to produce an allocatable, executable, writable section and eventually an RWX load segment. Not only do we really do not want that, it also breaks valgrind because valgrind does not load debuginfo from the mmap interceptor if all it sees are RX and RWX mappings. Fixes commit 3a0ecccb599a6b1ad4b149dc569c0080e92d057b ("ld.so: Do not export free/calloc/malloc/realloc functions [BZ #25486]").