Bug 31504 - debugedit writes out ELF file even when nothing has been updated
Status: RESOLVED FIXED
Alias: None
Product: debugedit
Classification: Unclassified
Component: debugedit
Version: unspecified
Importance: P2 normal
Target Milestone: ---
Assignee: Not yet assigned to anyone
Reported: 2024-03-17 14:46 UTC by Mark Wielaard
Modified: 2024-05-15 11:34 UTC

Description Mark Wielaard 2024-03-17 14:46:46 UTC
See also https://gitlab.archlinux.org/archlinux/packaging/packages/pacman/-/issues/19#note_171380 (warning javascript)

Basically what happens is that these .NET ELF files are not really ELF files, but ELF files with some extra data tagged on. Since libelf doesn't know anything about this extra data, it simply truncates the file after all known ELF data structures are written out. This is normally harmless because elf_update with ELF_C_WRITE doesn't really write out anything that hasn't changed. But in this case it does.

What we could do at the end of main () is to check needs_update, and probably all needs_xxx_update flags, and simply not call elf_update if all of them are false.

Note that this doesn't help if anything really does need updating. This is really not a valid ELF file. But it will help if someone only wants the build-id or file list.
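
A minimal Python sketch of how one might spot such a file before processing it, assuming the extra data sits after the last byte described by the ELF header, program headers, and section headers. It only handles 64-bit little-endian ELF, ignores extended section numbering, and the function names are hypothetical. It is a heuristic for illustration, not something debugedit or libelf does.

```python
import struct

EHDR = struct.Struct("<16sHHIQQQIHHHHHH")  # Elf64_Ehdr, little-endian
PHDR = struct.Struct("<IIQQQQQQ")          # Elf64_Phdr
SHDR = struct.Struct("<IIQQQQIIQQ")        # Elf64_Shdr
SHT_NOBITS = 8  # sections of this type (e.g. .bss) occupy no file space


def elf_described_end(data: bytes) -> int:
    """Return the highest file offset covered by any ELF header, table, or section."""
    (ident, _type, _machine, _version, _entry, phoff, shoff, _flags,
     ehsize, phentsize, phnum, shentsize, shnum, _shstrndx) = EHDR.unpack_from(data, 0)
    if ident[:4] != b"\x7fELF" or ident[4] != 2 or ident[5] != 1:
        raise ValueError("sketch only handles 64-bit little-endian ELF")

    end = ehsize
    if phnum:
        end = max(end, phoff + phnum * phentsize)
    if shnum:
        end = max(end, shoff + shnum * shentsize)
    for i in range(phnum):
        fields = PHDR.unpack_from(data, phoff + i * phentsize)
        p_offset, p_filesz = fields[2], fields[5]
        end = max(end, p_offset + p_filesz)
    for i in range(shnum):
        fields = SHDR.unpack_from(data, shoff + i * shentsize)
        sh_type, sh_offset, sh_size = fields[1], fields[4], fields[5]
        if sh_type != SHT_NOBITS:
            end = max(end, sh_offset + sh_size)
    return end


def trailing_bytes(data: bytes) -> int:
    """Number of bytes at the end of the file that no ELF structure describes."""
    return max(0, len(data) - elf_described_end(data))
```

A non-zero result from `trailing_bytes(open(path, "rb").read())` would suggest that rewriting the file through libelf could drop data.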
Comment 2 Allan McRae 2024-03-21 00:05:49 UTC
This partially fixes the issue we are seeing in Arch Linux.  

Debugedit keeps the extra data when using:

LANG=C debugedit --no-recompute-build-id \
	--list-file /dev/stdout "$1"

But still removes the data when using:

LANG=C debugedit --no-recompute-build-id \
	--base-dir "${srcdir}" \
	--dest-dir "${dbgsrcdir}" \
	--list-file /dev/stdout "$1"

This happens even when there are no source files found, so debugedit would make no changes.
Comment 3 Mark Wielaard 2024-03-21 11:46:38 UTC
(In reply to Allan McRae from comment #2)
> This partially fixes the issue we are seeing in Arch Linux.  
> 
> Debugedit keeps the extra data when using:
> 
> LANG=C debugedit --no-recompute-build-id \
> 	--list-file /dev/stdout "$1"

Thanks for testing.

> But still removes the data when using:
> 
> LANG=C debugedit --no-recompute-build-id \
> 	--base-dir "${srcdir}" \
> 	--dest-dir "${dbgsrcdir}" \
> 	--list-file /dev/stdout "$1"

Right. That is kind of expected since you are explicitly asking with --dest-dir to rewrite the DWARF debuginfo.
 
> This happens even when there are no source files found, so debugedit would
> make no changes.

I can add a check to see if there actually were any changes and if there aren't simply not call elf_update at all. But I suspect that is really a corner case that doesn't occur that often. At least why would you call debugedit explicitly asking to rewrite things and then expect it to not actually do that?
Comment 5 Allan McRae 2024-03-21 21:23:43 UTC
> I can add a check to see if there actually were any changes and if there aren't simply not call elf_update at all. But I suspect that is really a corner case that doesn't occur that often. At least why would you call debugedit explicitly asking to rewrite things and then expect it to not actually do that?

Our packaging system checks for ELF files and tries generating associated debug packages with source files and debug symbols.  It turns out there are a lot of packages that provide a .NET 8.0 self-contained/single-file application, which is an ELF file and did not enjoy being processed by debugedit.

The v2 patch fixes the issue we observed.
Comment 6 Mark Wielaard 2024-03-22 12:29:00 UTC
(In reply to Allan McRae from comment #5)
> > I can add a check to see if there actually were any changes and if there aren't simply not call elf_update at all. But I suspect that is really a corner case that doesn't occur that often. At least why would you call debugedit explicitly asking to rewrite things and then expect it to not actually do that?
> 
> Our packaging system checks for ELF files and tries generating associated
> debug packages with source files and debug symbols.  It turns out there are
> a lot of packages that provide a .NET 8.0 self-contained/single-file
> application, which is an ELF file and did not enjoy being processed by
> debugedit.

I understand that. And I think these aren't really "ELF" files because they add something to the file that isn't described by the ELF structures. What I don't fully understand is why you are expecting debugedit to NOT change the debug path strings when you are asking it to. Is this because these files don't actually contain any .debug sections?

> The v2 patch fixes the issue we observed.

Thanks for double checking. Pushed:

commit 6dd28bb30320e5236b3b5f79b6b2b41d2b2317bd
Author: Mark Wielaard <mark@klomp.org>
Date:   Mon Mar 18 23:37:47 2024 +0100

    debugedit: Only write the ELF file when updating strings or build-id
    
    Only open the ELF file read/write and write out the data if we
    actually did any elf structure update or updating the build-id.
    
            * tools/debugedit.c (fdopen_dso): Call elf_begin with
            ELF_C_READ when not changing dest_dir or build_id,
            otherwise use ELF_C_RDWR.
            (main): Call open with O_RDONLY when not changing dest_dir
            or build_id, otherwise use O_RDWR. Call elf_update with
            ELF_C_WRITE when rewriting any string, updating any ELF
            structure or build_id.
    
    https://sourceware.org/bugzilla/show_bug.cgi?id=31504
    
    Signed-off-by: Mark Wielaard <mark@klomp.org>
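
The decision described in the commit message can be modeled roughly as below. This is a Python sketch of the logic only, not the C code in tools/debugedit.c; the function and parameter names are hypothetical.

```python
import os


def open_flags(dest_dir, changing_build_id):
    """Pick open(2) flags: read-only unless a rewrite was requested."""
    wants_change = dest_dir is not None or changing_build_id
    return os.O_RDWR if wants_change else os.O_RDONLY


def should_write(rewrote_strings, updated_elf_structures, updated_build_id):
    """Write the ELF file back only if something actually changed."""
    return rewrote_strings or updated_elf_structures or updated_build_id
```

Under this model, a run that only lists source files opens the file read-only, and a run that asks for --dest-dir but finds nothing to rewrite skips the write, so any data outside the ELF structures stays in place.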
Comment 7 Allan McRae 2024-03-22 22:13:28 UTC
> What I don't fully understand is why you are expecting debugedit to NOT change
> the debug path strings when you are asking it to. Is this because these files
> don't actually contain any .debug sections?

I don't expect that.  I expect debugedit/libelf to not truncate the extra data that is tagged on to the ELF file.
Comment 8 Mark Wielaard 2024-03-22 23:47:07 UTC
(In reply to Allan McRae from comment #7)
> > What I don't fully understand is why you are expecting debugedit to NOT change
> > the debug path strings when you are asking it to. Is this because these files
> > don't actually contain any .debug sections?
> 
> I don't expect that.  I expect debugedit/libelf to not truncate the extra
> data that is tagged on to the ELF file.

I am sorry, but it will if it needs to change the ELF data because it has no way of knowing what to do with this "extra data" since it isn't described in the ELF header, program or section tables. So when the debug path strings change and the section data becomes bigger or smaller things will move around.
Comment 9 Jamin Collins 2024-04-29 19:59:56 UTC
Why not simply have debugedit detect (or watch) whether it has accounted for the full file contents?  If it has, great, do its thing.  If it has not, leave the file untouched/unaltered.

If in the course of doing its work, debugedit does not account for the full file contents, then the file is (as has been indicated) not a proper spec conforming ELF file.  As such, debugedit (and other ELF tools) should probably leave the file as-is, unless explicitly told otherwise (perhaps an additional flag).
Comment 10 Mark Wielaard 2024-05-15 11:34:15 UTC
(In reply to Jamin Collins from comment #9)
> Why not simply have debugedit detect (or watch) whether it has accounted for
> the full file contents?  If it has, great, do its thing.  If it has not,
> leave the file untouched/unaltered.
> 
> If in the course of doing its work, debugedit does not account for the full
> file contents, then the file is (as has been indicated) not a proper spec
> conforming ELF file.  As such, debugedit (and other ELF tools) should
> probably leave the file as-is, unless explicitly told otherwise (perhaps an
> additional flag).

debugedit simply uses (elfutils) libelf. And libelf cannot really know whether the "gaps" in the file are intentional or not. There are different structures (section headers, program headers, section data) which can appear "randomly" in the file, it isn't simply a stream. Only the start of the file (the ELF header) is fixed, everything else can appear at some later point in the file in no particular order and there can be gaps.
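
To illustrate why gaps alone say little, here is a small Python sketch that lists the byte ranges not covered by any described region. The helper name and the offsets in the example are made up for illustration; they are not taken from a real file.

```python
def find_gaps(regions, file_size):
    """Return (start, end) byte ranges not covered by any (offset, size) region."""
    gaps, cursor = [], 0
    for offset, size in sorted(regions):
        if offset > cursor:
            gaps.append((cursor, offset))
        cursor = max(cursor, offset + size)
    if cursor < file_size:
        gaps.append((cursor, file_size))
    return gaps


# Hypothetical layout: ELF header, one program header, a small section,
# then a section placed at a page-aligned offset.
regions = [(0, 64), (64, 56), (128, 32), (4096, 128)]
print(find_gaps(regions, 4300))
```

In this made-up layout the uncovered ranges include alignment padding in the middle of the file as well as bytes at the end. Nothing in the ELF structures marks one kind of gap as deliberate payload and another as padding.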