Bug 27772 - Inconsistency detected by ld.so: dl-fini.c: 88: _dl_fini: Assertion `ns != LM_ID_BASE || i == nloaded' failed!
Status: UNCONFIRMED
Alias: None
Product: glibc
Classification: Unclassified
Component: dynamic-link
Version: 2.3.3
Importance: P2 normal
Target Milestone: ---
Assignee: Not yet assigned to anyone
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2021-04-23 20:38 UTC by BruceH
Modified: 2021-04-26 06:25 UTC

See Also:
Host:
Target:
Build:
Last reconfirmed:
fweimer: security-


Attachments
Zip with two files: sample code to reproduce and patch diff (679 bytes, application/zip)
2021-04-23 20:38 UTC, BruceH

Description BruceH 2021-04-23 20:38:15 UTC
Created attachment 13397
Zip with two files: sample code to reproduce and patch diff

Overview:
The dynamic linker's finalizer (_dl_fini) crashes if the dynamic linker itself is not (directly or indirectly) among the library dependencies of the loaded application.

Steps to reproduce:
Compile and run a self-contained program that calls the dynamic linker's finalizer before exiting (on x86-64 its address is passed in %rdx, according to the ABI docs). Link it so that a dynamic linker is used, but without linking against anything that pulls in the dynamic linker itself (be careful: libc.so.6 does), e.g. with these GCC options: -nostdlib -Wl,-dynamic-linker=/lib64/ld-linux-x86-64.so.2

# filename: dl_fini_assert_nloaded_test.S

#include <asm/unistd_64.h>
    .text
    .globl    _start
    .type    _start, @function
_start:
    test %rdx, %rdx    # this register is where the dynamic linker's finalizer
            # address is passed, can be $0 if none exists
    jz .no_dl_fini
    call *%rdx
.no_dl_fini:
    mov $0, %edi
    mov $__NR_exit, %eax    # immediate operand: the syscall number itself
    syscall

Actual Results:
Crash, with the assertion failure message listed above.

Expected Results:
Clean exit.

Build Date & Hardware:
2021-04-21 on Arch Linux
- glibc version: 2.33 (as tested; should affect every version since 2.2)
- kernel: Linux 5.11 (as tested; should not matter for the glibc code, but my test case assumes Linux for the exit syscall and for the linker options)
- architecture: x86-64 (as tested; should not matter for the glibc code, but my assembly test case clearly needs it)
- compiler and linker versions: GCC 10.2.0, ld 2.36.1 (as tested; shouldn't matter for the glibc code, but my test case is written with GCC in mind)

Additional Information:
I tracked this issue down to where the rtld function dl_main temporarily adds the rtld itself to the list of loaded libraries, then removes it again later if none of the libraries loaded in the meantime depends on it. libc.so.6 does have such a dependency, so this code path is rarely taken: almost all software either uses the shared version of the C standard library in some way or is statically linked (in the latter case, the rtld generally isn't used to launch the software in the first place). The bug is simply that the counter for loaded objects isn't decremented during removal. The rtld finalizer later notices this mismatch and triggers the assertion.

The bug has existed since commit 1ebba33ece5a998d3d79fa14adca3ae7985cbff5, made way back in August 2000, which introduced this counter. There are sporadic occurrences of the crash message on the web, e.g. in the FreePascal bug tracker, but apparently nobody went the extra mile to properly report or fix it.

# filename: dl_fini_assert_nloaded_fix.diff

diff --git a/elf/rtld.c b/elf/rtld.c
index 94a00e2049..849a449e77 100644
--- a/elf/rtld.c
+++ b/elf/rtld.c
@@ -2007,6 +2007,8 @@ dl_main (const ElfW(Phdr) *phdr,
       GL(dl_rtld_map).l_next->l_prev = &GL(dl_rtld_map);
     }
     }
+  else
+    --GL(dl_ns)[LM_ID_BASE]._ns_nloaded;
 
   /* Now let us see whether all libraries are available in the
      versions we need.  */
Comment 1 Florian Weimer 2021-04-26 06:25:27 UTC
We should probably delete the code that removes the ld.so link map if it is not needed. The entire reordering should no longer be necessary once bug 25486 is fully fixed (which points to a different way of fixing bug 27744).

You really need to use the glibc-supplied startup files when linking; there is no way around that.