Bug 31208 - strip with no arguments sometimes breaks ELF alignment requirements
Status: RESOLVED FIXED
Product: binutils
Classification: Unclassified
Component: binutils
Version: 2.42
Importance: P2 normal
Target Milestone: 2.43
Assignee: Alan Modra
Reported: 2024-01-02 22:19 UTC by Matt Wozniski
Modified: 2024-02-08 21:36 UTC (History)

Attachments
A patch to prevent this issue by dropping the unneeded PT_LOAD segment (602 bytes, application/mbox)
2024-01-02 22:19 UTC, Matt Wozniski

Description Matt Wozniski 2024-01-02 22:19:28 UTC
Created attachment 15278
A patch to prevent this issue by dropping the unneeded PT_LOAD segment

Reproducer in the form of a published shared library (part of a Python package):

    mkdir /tmp/strip-bug
    cd /tmp/strip-bug
    wget https://files.pythonhosted.org/packages/33/6d/bc85b76c79db078597e057a1022b9e5eadebb083840a2942d0cdd0100cb7/memray-1.11.0-cp38-cp38-manylinux_2_12_x86_64.manylinux2010_x86_64.whl
    unzip memray-1.11.0-cp38-cp38-manylinux_2_12_x86_64.manylinux2010_x86_64.whl
    ldd memray/_memray.cpython-38-x86_64-linux-gnu.so
    strip memray/_memray.cpython-38-x86_64-linux-gnu.so
    ldd memray/_memray.cpython-38-x86_64-linux-gnu.so

The first call to `ldd` will output:

        linux-vdso.so.1 (0x00007ffc3583b000)
        liblz4-c29043df.so.1.7.1 => /tmp/strip-bug/memray/../memray.libs/liblz4-c29043df.so.1.7.1 (0x00007f2898b7e000)
        libunwind-92483c07.so.8.0.1 => /tmp/strip-bug/memray/../memray.libs/libunwind-92483c07.so.8.0.1 (0x00007f2898964000)
        libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007f289894d000)
        libstdc++.so.6 => /lib/x86_64-linux-gnu/libstdc++.so.6 (0x00007f289876b000)
        libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007f289861c000)
        libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00007f28985ff000)
        libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007f28985dc000)
        libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f28983ea000)
        /lib64/ld-linux-x86-64.so.2 (0x00007f28990a7000)

The second call, after stripping, will output:

        not a dynamic executable

The issue appears to be caused by segment 8 (0-based, as reported by `readelf -eW`). For the original shared library, that segment is reported as:

  Type           Offset   VirtAddr           PhysAddr           FileSiz  MemSiz   Flg Align
  LOAD           0x878000 0x00000000002ff000 0x00000000002ff000 0x00d498 0x00d498 RW  0x1000

After `strip`, that segment shows up as:

  Type           Offset   VirtAddr           PhysAddr           FileSiz  MemSiz   Flg Align
  LOAD           0x0f6cfc 0x00000000002ff000 0x00000000002ff000 0x000000 0x000000 RW  0x1000

The file size and memory size have both dropped to 0, but the alignment has not dropped from 0x1000 to 0x1, so the assigned offset of 0x0f6cfc is incompatible with the declared alignment. This occurs even with `strip` built from `master`.
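
The mismatch can be checked by hand. For a loadable segment with an alignment greater than 1, the ELF specification expects `p_vaddr` and `p_offset` to be congruent modulo `p_align`. A minimal sketch applying that rule to the two program header entries quoted above (the helper name `is_congruent` is made up for illustration and is not part of any tool):

```python
def is_congruent(offset: int, vaddr: int, align: int) -> bool:
    """Return True if p_offset % p_align == p_vaddr % p_align.

    An alignment of 0 or 1 means no alignment is required.
    """
    if align <= 1:
        return True
    return offset % align == vaddr % align


# Values copied from the readelf output in this report.
before = is_congruent(0x878000, 0x2FF000, 0x1000)  # original segment 8
after = is_congruent(0x0F6CFC, 0x2FF000, 0x1000)   # segment 8 after strip

print(before, after)             # True False
print(hex(0x0F6CFC % 0x1000))    # 0xcfc, while 0x2ff000 % 0x1000 is 0
```

With an alignment of 0x1 the same offset would pass the check, which is why leaving `Align` at 0x1000 on an empty segment is what makes the header invalid.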

A patch is attached showing a possible solution to this issue.
Comment 1 Matt Wozniski 2024-01-04 21:04:08 UTC
This may indicate that the fix for https://sourceware.org/bugzilla/show_bug.cgi?id=25237 was insufficient to address all cases.
Comment 2 Sam James 2024-01-05 09:59:56 UTC
Could you send the patch to the binutils ML please? Thank you!
Comment 3 Alan Modra 2024-01-06 06:05:11 UTC
(In reply to Matt Wozniski from comment #0)
> The issue appears to be caused by segment 8 (0-based, as reported by
> `readelf -eW`). For the original shared library, that segment is reported as:
> 
>   Type           Offset   VirtAddr           PhysAddr           FileSiz  MemSiz   Flg Align
>   LOAD           0x878000 0x00000000002ff000 0x00000000002ff000 0x00d498 0x00d498 RW  0x1000

So how did that load segment get there?  It doesn't correspond to any sections, but will be loading something into memory (a run of 'X' characters, i.e. 0x58 bytes, by the look of it).  Is that used by the .so?  Perhaps the correct fix for strip is to refuse to edit this file.
Comment 4 Matt Wozniski 2024-01-08 21:07:37 UTC
> So how did that load segment get there?

I strongly suspect that it's the result of the ELF editing performed by `auditwheel repair`, from https://github.com/pypa/auditwheel - though I haven't dug in to confirm that for certain.

The error message that you get if you try to dlopen() the shared library after having called `strip` on it is:

    ELF load command address/offset not properly aligned
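
One way to see that message without writing a C program is to go through Python's `ctypes`, which calls `dlopen()` underneath and raises `OSError` carrying the loader's error text. This is only a sketch: the helper name `dlopen_error` is invented here, and the path is the one from the reproducer above, so it assumes the wheel was unpacked in `/tmp/strip-bug`.

```python
import ctypes
from typing import Optional


def dlopen_error(path: str) -> Optional[str]:
    """Try to dlopen() `path`; return the loader's error text, or None on success."""
    try:
        ctypes.CDLL(path)
    except OSError as exc:
        return str(exc)
    return None


if __name__ == "__main__":
    # Path from the reproducer; adjust it to wherever the wheel was unpacked.
    lib = "/tmp/strip-bug/memray/_memray.cpython-38-x86_64-linux-gnu.so"
    print(dlopen_error(lib))
```

Run against the stripped library, the returned string would be expected to contain the alignment message quoted above. For a path that does not exist it returns the loader's "cannot open" error instead.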

There are many reports of that error message on Google for Python packages that were packaged by `auditwheel repair`:

https://github.com/linuxdeploy/linuxdeploy/issues/204
https://github.com/marcelotduarte/cx_Freeze/issues/1048
https://github.com/scipy/scipy/issues/17438

etc., which certainly seems to point a finger at the common Python packaging tools. Pre-compiled Python libraries idiomatically vendor their shared library dependencies, and `auditwheel repair` is the tool used to do so (it renames the vendored shared libraries to avoid version collisions between Python packages, and edits the shared libraries themselves to update the SONAME and DT_NEEDED entries to match the renamed libraries).
Comment 5 Sourceware Commits 2024-02-08 20:54:31 UTC
The master branch has been updated by Alan Modra <amodra@sourceware.org>:

https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;h=7f26d260ef76a4cb2873a7815bef187005528c19

commit 7f26d260ef76a4cb2873a7815bef187005528c19
Author: Alan Modra <amodra@gmail.com>
Date:   Fri Feb 9 07:04:22 2024 +1030

    PR31208, strip can break ELF alignment requirements
    
    In https://sourceware.org/pipermail/binutils/2007-August/053261.html
    (git commit 3dea8fca8b86) I disabled a then new linker feature that
    removed empty PT_LOAD headers in cases where a user specified program
    headers, and for objcopy.  This can be a problem for objcopy/strip and
    since objcopy operates on sections, any part of a PT_LOAD loading file
    contents not covered by a section will be omitted anyway.
    
            PR 31208
            * elf.c (_bfd_elf_map_sections_to_segments): Pass remove_empty_load
            as true to elf_modify_segment_map for objcopy/strip.
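
The effect of this change can be pictured with a toy model. The sketch below is not BFD code: the `Segment` class, the Python `modify_segment_map` function, and the section names are invented for illustration. It only mirrors the idea in the commit message, that a PT_LOAD left covering no sections is dropped rather than written out as an empty header.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Segment:
    p_type: str
    sections: List[str] = field(default_factory=list)


def modify_segment_map(segments: List[Segment], remove_empty_load: bool) -> List[Segment]:
    """Return the segments to emit, optionally dropping PT_LOADs with no sections."""
    kept = []
    for seg in segments:
        if remove_empty_load and seg.p_type == "PT_LOAD" and not seg.sections:
            continue  # nothing would be loaded from it, so omit the header
        kept.append(seg)
    return kept


segments = [
    Segment("PT_LOAD", [".text"]),
    Segment("PT_LOAD", []),               # like segment 8 in this report
    Segment("PT_DYNAMIC", [".dynamic"]),
]

old = modify_segment_map(segments, remove_empty_load=False)
new = modify_segment_map(segments, remove_empty_load=True)
print(len(old), len(new))  # 3 2
```

In this model the empty PT_LOAD never reaches the output, so there is no zero-sized header left whose offset could conflict with its alignment.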
Comment 6 Alan Modra 2024-02-08 20:56:42 UTC
Fixed.
Comment 7 Sourceware Commits 2024-02-08 21:36:42 UTC
The binutils-2_42-branch branch has been updated by Alan Modra <amodra@sourceware.org>:

https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;h=78f9e9faaa41d628170f6047c3e032a67f9e829d

commit 78f9e9faaa41d628170f6047c3e032a67f9e829d
Author: Alan Modra <amodra@gmail.com>
Date:   Fri Feb 9 07:04:22 2024 +1030

    PR31208, strip can break ELF alignment requirements
    
    In https://sourceware.org/pipermail/binutils/2007-August/053261.html
    (git commit 3dea8fca8b86) I disabled a then new linker feature that
    removed empty PT_LOAD headers in cases where a user specified program
    headers, and for objcopy.  This can be a problem for objcopy/strip and
    since objcopy operates on sections, any part of a PT_LOAD loading file
    contents not covered by a section will be omitted anyway.
    
            PR 31208
            * elf.c (_bfd_elf_map_sections_to_segments): Pass remove_empty_load
            as true to elf_modify_segment_map for objcopy/strip.
    
    (cherry picked from commit 7f26d260ef76a4cb2873a7815bef187005528c19)