Bug 29389 - pe renaming implib breaks bfd/cache.c reopening of closed archives
Status: RESOLVED FIXED
Alias: None
Product: binutils
Classification: Unclassified
Component: binutils
Version: 2.39
Importance: P2 normal
Target Milestone: ---
Assignee: Alan Modra
Reported: 2022-07-20 14:45 UTC by Luca Bacci
Modified: 2022-08-04 10:22 UTC

Last reconfirmed: 2022-08-02 00:00:00


Attachments
Bundle with all the object files and static library archives (6.97 MB, application/zip)
2022-07-22 14:12 UTC, Luca Bacci
Log of all the calls to _bfd_coff_link_input_bfd (31 bytes, text/plain)
2022-07-30 13:20 UTC, Luca Bacci
Proposed Patch (1.43 KB, patch)
2022-08-02 11:05 UTC, Nick Clifton
Alternative patch (2.11 KB, patch)
2022-08-03 06:44 UTC, Alan Modra

Description Luca Bacci 2022-07-20 14:45:47 UTC
Hi! First of all, thanks for developing such great tools!

We've encountered an issue while building a few projects in an up-to-date MSYS2 MinGW64 environment. During the final link stage of executables, ld.exe hits two failed assertions and exits:

ld.exe: warning: subprojects/glib/gobject/libgobject-2.0.dll.a(libgobject_2_0_0_dll_d000431.o): local symbol `0' has no section
ld.exe: BFD (GNU Binutils) 2.38.90.20220720 assertion fail ../../binutils-gdb/bfd/cofflink.c:2279
ld.exe: BFD (GNU Binutils) 2.38.90.20220720 assertion fail ../../binutils-gdb/bfd/coff-x86_64.c:696
ld.exe: subprojects/glib/gobject/libgobject-2.0.dll.a(libgobject_2_0_0_dll_d000431.o): bad reloc address 0x126 in section `.text'

That happens both with the current stable binutils release (version 2.38) and a custom build from binutils-2_39-branch.

Reproduction steps:

* Install the needed packages with the command: pacman -S --needed git mingw-w64-x86_64-{cc,meson,libadwaita,gtksourceview5,pkgconf,gobject-introspection}
* Open the MSYS2 MinGW64 Shell (or alternatively the MSYS2 CLang64 Shell)
* Type the following commands:
  git clone https://gitlab.gnome.org/GNOME/gnome-text-editor
  cd gnome-text-editor
  meson setup builddir -Dforce_fallback_for=libadwaita && meson compile -C builddir/

Issue originally reported at https://gitlab.gnome.org/GNOME/gtk/-/issues/5053

I remain at your disposal for any needed information or testing.
Thank you!
Comment 1 Alan Modra 2022-07-22 03:39:27 UTC
See my comment https://sourceware.org/pipermail/binutils/2022-July/121966.html
Comment 2 Luca Bacci 2022-07-22 08:03:46 UTC
Hello, Alan!

I was about to prepare a bundle with all the object files when I stumbled upon this message:

$ cp subprojects/gtk/gtk/compose/compose-parse.exe.p/compose-parse.c.obj "subprojects/gtk/gtk/libgtk.a" "subprojects/gtk/gtk/css/libgtk_css.a" "subprojects/glib/glib/libglib-2.0.dll.a" "subprojects/glib/gobject/libgobject-2.0.dll.a" "subprojects/glib/gio/libgio-2.0.dll.a" "subprojects/glib/gmodule/libgmodule-2.0.dll.a" "subprojects/gtk/gdk/libgdk.a" "subprojects/gtk/gdk/win32/libgdk-win32.a" "subprojects/gtk/gsk/libgsk.a" "subprojects/gtk/gsk/libgsk_f16c.a" "D:/msys64/mingw64/lib/libpangocairo-1.0.dll.a" "D:/msys64/mingw64/lib/libpango-1.0.dll.a" "D:/msys64/mingw64/lib/libgobject-2.0.dll.a" "D:/msys64/mingw64/lib/libglib-2.0.dll.a" "D:/msys64/mingw64/lib/libintl.dll.a" "D:/msys64/mingw64/lib/libharfbuzz.dll.a" "D:/msys64/mingw64/lib/libcairo.dll.a" "D:/msys64/mingw64/lib/libfribidi.dll.a" "D:/msys64/mingw64/lib/libcairo-gobject.dll.a" "D:/msys64/mingw64/lib/libgdk_pixbuf-2.0.dll.a" "D:/msys64/mingw64/lib/libepoxy.dll.a" "D:/msys64/mingw64/lib/libgraphene-1.0.dll.a" "D:/msys64/mingw64/lib/libpangowin32-1.0.dll.a" "D:/msys64/mingw64/lib/libpangoft2-1.0.dll.a" "D:/msys64/mingw64/lib/libfontconfig.dll.a" "D:/msys64/mingw64/lib/libfreetype.dll.a" "D:/msys64/mingw64/lib/libpng16.dll.a" "D:/msys64/mingw64/lib/libz.dll.a" "D:/msys64/mingw64/lib/libtiff.dll.a" "D:/msys64/mingw64/lib/libjpeg.dll.a" "D:/msys64/mingw64/lib/libcairo-script-interpreter.dll.a" objs/
cp: will not overwrite just-created 'objs/libgobject-2.0.dll.a' with 'D:/msys64/mingw64/lib/libgobject-2.0.dll.a'
cp: will not overwrite just-created 'objs/libglib-2.0.dll.a' with 'D:/msys64/mingw64/lib/libglib-2.0.dll.a'

It turns out there are repeated input files: subprojects/glib/gobject/libgobject-2.0.dll.a and D:/msys64/mingw64/lib/libgobject-2.0.dll.a (and the same for libglib-2.0.dll.a).

Removing D:/msys64/mingw64/lib/libgobject-2.0.dll.a and D:/msys64/mingw64/lib/libglib-2.0.dll.a from the command-line fixed the issue!
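
The clash that cp reports above can also be checked for directly on a list of linker inputs. The following is a minimal sketch in C and not part of binutils; the helper names path_basename and first_duplicate_basename are hypothetical, and it only compares the final path component, treating both '/' and '\' as separators.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Return a pointer to the last path component, accepting '/' and '\\'.  */
static const char *
path_basename (const char *path)
{
  const char *base = path;

  assert (path != NULL);
  for (const char *p = path; *p != '\0'; p++)
    if (*p == '/' || *p == '\\')
      base = p + 1;
  return base;
}

/* Return the index of the first path whose basename already appeared
   earlier in the list, or -1 if all basenames are distinct.  */
static int
first_duplicate_basename (const char *const *paths, size_t count)
{
  for (size_t i = 1; i < count; i++)
    for (size_t j = 0; j < i; j++)
      if (strcmp (path_basename (paths[i]), path_basename (paths[j])) == 0)
        return (int) i;
  return -1;
}
```

Run over the inputs of the cp command above, such a check would flag the second libgobject-2.0.dll.a and the second libglib-2.0.dll.a.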
Comment 3 Luca Bacci 2022-07-22 14:12:20 UTC
Created attachment 14225
Bundle with all the object files and static library archives

Bundle containing all the needed object files and static library archives
Comment 4 Luca Bacci 2022-07-22 14:12:51 UTC
To link the final executable:

x86_64-w64-mingw32-cc -o compose-parse.exe compose-parse.c.obj "-Wl,--allow-shlib-undefined" "-Wl,--start-group" "libgtk.a" "libgtk_css.a" "libglib-2.0.dll.a" "libgobject-2.0.dll.a" "libgio-2.0.dll.a" "libgmodule-2.0.dll.a" "libgdk.a" "libgdk-win32.a" "libgsk.a" "libgsk_f16c.a" "-Wl,-Bsymbolic" "system/libpangocairo-1.0.dll.a" "system/libpango-1.0.dll.a" "system/libgobject-2.0.dll.a" "system/libglib-2.0.dll.a" "system/libintl.dll.a" "system/libharfbuzz.dll.a" "system/libcairo.dll.a" "system/libfribidi.dll.a" "system/libcairo-gobject.dll.a" "system/libgdk_pixbuf-2.0.dll.a" "system/libepoxy.dll.a" "-lm" "system/libgraphene-1.0.dll.a" "system/libpangowin32-1.0.dll.a" "-ladvapi32" "-lcomctl32" "-lcrypt32" "-ldwmapi" "-limm32" "-lsetupapi" "-lwinmm" "system/libpangoft2-1.0.dll.a" "system/libfontconfig.dll.a" "system/libfreetype.dll.a" "system/libpng16.dll.a" "system/libz.dll.a" "system/libtiff.dll.a" "system/libjpeg.dll.a" "-lhid" "system/libcairo-script-interpreter.dll.a" "-ladvapi32" "-lcomctl32" "-lcrypt32" "-ldwmapi" "-limm32" "-lsetupapi" "-lwinmm" "-lhid" "-Wl,--subsystem,console" "-lkernel32" "-luser32" "-lgdi32" "-lwinspool" "-lshell32" "-lole32" "-loleaut32" "-luuid" "-lcomdlg32" "-Wl,--end-group"

It completes successfully on my Arch Linux box, but it fails on MSYS2 MINGW64. Perhaps on Arch Linux the mingw-w64 toolchain does not use ld.bfd?
Comment 5 Luca Bacci 2022-07-22 14:20:17 UTC
Here's a backtrace when hitting the failed assertion in cofflink.c:2279

(gdb) bt
#0  _bfd_coff_link_input_bfd (flaginfo=0xfffb40, input_bfd=0xb3f2710)
    at ../../binutils-gdb/bfd/cofflink.c:2283
#1  0x00007ff71e814ee8 in _bfd_coff_final_link (abfd=0x4e662a0,
    info=0x7ff71ead7880 <link_info>) at ../../binutils-gdb/bfd/cofflink.c:866
#2  0x00007ff71e7b41fd in ldwrite () at ../../binutils-gdb/ld/ldwrite.c:545
#3  0x00007ff71e7b0b8b in main (argc=79, argv=0x11b5890)
    at ../../binutils-gdb/ld/ldmain.c:513

----------------------------------------------

(gdb) bt -full
#0  _bfd_coff_link_input_bfd (flaginfo=0xfffb40, input_bfd=0xb3f2710)
    at ../../binutils-gdb/bfd/cofflink.c:2283
        a = 0
        pos = 26247420
        amt = 144
        n_tmask = 48
        n_btshft = 4
        adjust_symndx = 0x0
        output_bfd = 0x4e662a0
        strings = 0x0
        syment_base = 56846
        copy = false
        hash = true
        isymesz = 18
        osymesz = 18
        linesz = 6
        esym = 0x161193c0 '▒' <repeats 16 times>
        esym_end = 0x161193c0 '▒' <repeats 16 times>
        isymp = 0x112c3030
        secpp = 0x10c51320
        indexp = 0x10c47f58
        output_index = 56952
        outsym = 0x10c71060 ""
        sym_hash = 0xb3dbec0
        o = 0x0
        __PRETTY_FUNCTION__ = "_bfd_coff_link_input_bfd"
#1  0x00007ff71e814ee8 in _bfd_coff_final_link (abfd=0x4e662a0,
    info=0x7ff71ead7880 <link_info>) at ../../binutils-gdb/bfd/cofflink.c:866
        symesz = 18
        flaginfo = {info = 0x7ff71ead7880 <link_info>,
          output_bfd = 0x4e662a0, failed = false, global_to_static = false,
          strtab = 0x10a91e50, section_info = 0x0, last_file_index = 55524,
          last_file = {_n = {_n_name = ".file\000\000", _n_n = {
                _n_zeroes = 435610543662, _n_offset = 13451671603782742029},
              _n_nptr = {
                0x656c69662e <error: Cannot access memory at address 0x656c69662e>,
                0xbaadf00dbaadf00d <error: Cannot access memory at address 0xbaadf00dbaadf00d>}}, n_value = 55621, n_scnum = -2, n_flags = 61453, n_type = 0,
            n_sclass = 103 'g', n_numaux = 1 '\001'}, last_bf_index = -1,
          last_bf = {x_sym = {x_tagndx = {l = 16776160, p = 0xfffbe0},
              x_misc = {x_lnsz = {x_lnno = 4236, x_size = 7807},
                x_fsize = 511643788}, x_fcnary = {x_fcn = {
                  x_lnnoptr = 82207392, x_endndx = {l = 48, p = 0x30}},
                x_ary = {x_dimen = {25248, 1254, 0, 0}}}, x_tvndx = 52223},
            x_file = {x_n = {
                x_fname = "▒▒▒\000\000\000\000\000▒\020\177\036▒\177\000\000▒b▒\004", x_n = {x_zeroes = 16776160, x_offset = 140699345293452}},
              x_ftype = 48 '0'}, x_scn = {x_scnlen = 16776160, x_nreloc = 0,
              x_nlinno = 0, x_checksum = 511643788, x_associated = 32759,
              x_comdat = 0 '\000'}, x_tv = {x_tvfill = 16776160, x_tvlen = 0,
              x_tvran = {0, 4236}}, x_csect = {x_scnlen = {l = 16776160,
                p = 0xfffbe0}, x_parmhash = 511643788, x_snhash = 32759,
              x_smtyp = 0 '\000', x_smclas = 0 '\000', x_stab = 82207392,
              x_snstab = 0}, x_sect = {x_scnlen = 16776160, x_nreloc = 0}},
          debug_merge = {root = {table = 0x10c3eed0,
              newfunc = 0x7ff71e8132be <_bfd_coff_debug_merge_hash_newfunc>,
              memory = 0x1123bda0, size = 4051, count = 0, entsize = 32,
              frozen = 0}}, internal_syms = 0x112c1fa0,
          sec_ptrs = 0x10c50fd0, sym_indices = 0x10c47db0,
          outsyms = 0x10c70fd0 "0\001",
          linenos = 0xca80fe0 "`", '▒' <repeats 16 times>, "▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒",
          contents = 0x12255040 "\001", external_relocs = 0x10c5d090 "",
          internal_relocs = 0x10c90fa0}
        debug_merge_allocated = true
        long_section_names = true
        o = 0x4e67540
        p = 0x10b2da40
        max_sym_count = 6161
        max_lineno_count = 0
        max_reloc_count = 3828
        max_output_reloc_count = 0
        max_contents_size = 1520608
        rel_filepos = 25224192
        relsz = 10
        line_filepos = 25224192
        linesz = 6
        sub = 0xb3f2710
        external_relocs = 0x0
        strbuf = "\000\000\000"
        amt = 91872
#2  0x00007ff71e7b41fd in ldwrite () at ../../binutils-gdb/ld/ldwrite.c:545
No locals.
#3  0x00007ff71e7b0b8b in main (argc=79, argv=0x11b5890)
    at ../../binutils-gdb/ld/ldmain.c:513
        emulation = 0x7ff71e96cd8d <__PRETTY_FUNCTION__.0+1277> "i386pep"
        start_time = 0
(gdb)

----------------------------------------------
Comment 6 Alan Modra 2022-07-22 16:12:49 UTC
You are missing rather a lot of object files and libraries in that zip.  Besides the libraries you specify with -l, there are also some objects and libraries that your cc adds.  You can see those by adding -Wl,-v to the command in comment #4, or perhaps more conveniently by adding -Wl,-t.

It is quite likely that those other objects and libraries differ between your two systems, which would explain why you say "It completes successfully on my Arch Linux box, but it fails on MSYS2 MINGW64."
Comment 7 Luca Bacci 2022-07-29 18:19:12 UTC
Thanks, Alan!

I'm going to make a bundle containing all the required object files, will attach it very soon.

Right now I have just tried debugging this issue in GDB a bit. What I could find is that when the warning "local symbol `0' has no section" is output by coff_classify_symbol (bfd *abfd, struct internal_syment *syment), syment points to freed memory:

Thread 1 hit Breakpoint 1, coff_classify_symbol (abfd=0x1a32e152710, syment=0x23121ff2f0) at ../../binutils-gdb/bfd/coffcode.h:5038
5038          _bfd_error_handler
(gdb) bt
#0  coff_classify_symbol (abfd=0x1a32e152710, syment=0x23121ff2f0)
    at ../../binutils-gdb/bfd/coffcode.h:5038
#1  0x00007ff6eda466a6 in _bfd_coff_link_input_bfd (flaginfo=0x23121ff700,
    input_bfd=0x1a32e152710) at ../../binutils-gdb/bfd/cofflink.c:1446
#2  0x00007ff6eda44ed8 in _bfd_coff_final_link (abfd=0x1a327bc62a0,
    info=0x7ff6edd07880 <link_info>) at ../../binutils-gdb/bfd/cofflink.c:868
#3  0x00007ff6ed9e41ed in ldwrite () at ../../binutils-gdb/ld/ldwrite.c:545
#4  0x00007ff6ed9e0b7b in main (argc=79, argv=0x1a326035890)
    at ../../binutils-gdb/ld/ldmain.c:513
(gdb) p *abfd
$1 = {filename = 0x1a32e13ac50 "libgobject_2_0_0_dll_d000431.o",
  xvec = 0x7ff6edbcf220 <x86_64_pe_vec>, iostream = 0x0,
  iovec = 0x7ff6edbb6280 <cache_iovec>, lru_prev = 0x0, lru_next = 0x0,
  where = 0, mtime = 0, id = 2121, flags = 32825, format = bfd_object,
  direction = read_direction, cacheable = 0, target_defaulted = 0,
  opened_once = 0, mtime_set = 0, no_export = 0, output_has_begun = 0,
  has_armap = 0, is_thin_archive = 0, no_element_cache = 0,
  selective_search = 0, is_linker_output = 0, is_linker_input = 1,
  plugin_format = bfd_plugin_unknown, lto_output = 0, lto_slim_object = 0,
  read_only = 0, plugin_dummy_bfd = 0x0, origin = 72736,
  proxy_origin = 72736, section_htab = {table = 0x1a32e1548a0,
    newfunc = 0x7ff6eda26cfe <bfd_section_hash_newfunc>,
    memory = 0x1a32d75c0d0, size = 4051, count = 5, entsize = 304,
    frozen = 0}, sections = 0x1a32e1538a8, section_last = 0x1a32e153d68,
  section_count = 5, archive_plugin_fd = -1,
  archive_plugin_fd_open_count = 0, archive_pass = 0, alloc_size = 6561,
  start_address = 0, outsymbols = 0x1a33339ae80, symcount = 8,
  dynsymcount = 0, arch_info = 0x7ff6edbe9040 <bfd_x86_64_arch>, size = 0,
  arelt_data = 0x1a32d75bfd0, my_archive = 0x1a32d75bd20, archive_next = 0x0,
  archive_head = 0x0, nested_archives = 0x0, link = {next = 0x1a32e12ff80,
    hash = 0x1a32e12ff80}, tdata = {aout_data = 0x1a32e13ac78,
    aout_ar_data = 0x1a32e13ac78, coff_obj_data = 0x1a32e13ac78,
    pe_obj_data = 0x1a32e13ac78, xcoff_obj_data = 0x1a32e13ac78,
    ecoff_obj_data = 0x1a32e13ac78, srec_data = 0x1a32e13ac78,
    verilog_data = 0x1a32e13ac78, ihex_data = 0x1a32e13ac78,
    tekhex_data = 0x1a32e13ac78, elf_obj_data = 0x1a32e13ac78,
    mmo_data = 0x1a32e13ac78, sun_core_data = 0x1a32e13ac78,
    sco5_core_data = 0x1a32e13ac78, trad_core_data = 0x1a32e13ac78,
    som_data = 0x1a32e13ac78, hpux_core_data = 0x1a32e13ac78,
    hppabsd_core_data = 0x1a32e13ac78, sgi_core_data = 0x1a32e13ac78,
    lynx_core_data = 0x1a32e13ac78, osf_core_data = 0x1a32e13ac78,
    cisco_core_data = 0x1a32e13ac78, versados_data = 0x1a32e13ac78,
    netbsd_core_data = 0x1a32e13ac78, mach_o_data = 0x1a32e13ac78,
    mach_o_fat_data = 0x1a32e13ac78, plugin_data = 0x1a32e13ac78,
    pef_data = 0x1a32e13ac78, pef_xlib_data = 0x1a32e13ac78,
    sym_data = 0x1a32e13ac78, any = 0x1a32e13ac78}, usrdata = 0x1a32e15ccf0,
  memory = 0x1a32d75bed0, build_id = 0x0}
(gdb) p *syment
$2 = {_n = {_n_name = "0\001\000\000\000\000\000", _n_n = {_n_zeroes = 304,
      _n_offset = 13451671603782742029}, _n_nptr = {
      0x130 <error: Cannot access memory at address 0x130>,
      0xbaadf00dbaadf00d <error: Cannot access memory at address 0xbaadf00dbaadf00d>}}, n_value = 1, n_scnum = 0, n_flags = 61453, n_type = 49200,
  n_sclass = 46 '.', n_numaux = 105 'i'}
(gdb)
Comment 8 Luca Bacci 2022-07-30 13:05:33 UTC
I now have more insight into what's going on...

The issue stems from passing repeated import libs to the linker:

    ld.bfd -o out ... subprojects/glib/gobject/libgobject-2.0.dll.a ... "D:/msys64/mingw64/lib/libgobject-2.0.dll.a" ...

The contents of the two import libs are very similar, of course. They both look like:

    ...
    !<arch>
    /               0           0     0     0       28890     `
    ... 
    libgobject_2_0_0_dll_d000481.o/
    libgobject_2_0_0_dll_d000481.o/
    libgobject_2_0_0_dll_d000009.o/
    libgobject_2_0_0_dll_d000480.o/
    libgobject_2_0_0_dll_d000479.o/
    libgobject_2_0_0_dll_d000478.o/
    libgobject_2_0_0_dll_d000477.o/
    libgobject_2_0_0_dll_d000476.o/
    libgobject_2_0_0_dll_d000475.o/
    libgobject_2_0_0_dll_d000474.o/
    libgobject_2_0_0_dll_d000473.o/
    libgobject_2_0_0_dll_d000472.o/
    libgobject_2_0_0_dll_d000471.o/
    libgobject_2_0_0_dll_d000470.o/
    libgobject_2_0_0_dll_d000469.o/
    libgobject_2_0_0_dll_d000468.o/
    libgobject_2_0_0_dll_d000467.o/
    libgobject_2_0_0_dll_d000466.o/
    libgobject_2_0_0_dll_d000465.o/
    ...

By tracing calls to _bfd_coff_link_input_bfd() we get:

    libgobject_2_0_0_dll_d000472.o
    libgobject_2_0_0_dll_d000471.o
    libgobject_2_0_0_dll_d000470.o
    libgobject_2_0_0_dll_d000469.o
    libgobject_2_0_0_dll_d000468.o
    libgobject_2_0_0_dll_d000467.o
    libgobject_2_0_0_dll_d000465.o
    libgobject_2_0_0_dll_d000464.o
    libgobject_2_0_0_dll_d000463.o
    libgobject_2_0_0_dll_d000462.o
    libgobject_2_0_0_dll_d000461.o
    libgobject_2_0_0_dll_d000460.o
    libgobject_2_0_0_dll_d000459.o
    libgobject_2_0_0_dll_d000457.o
    libgobject_2_0_0_dll_d000456.o
    libgobject_2_0_0_dll_d000454.o
    libgobject_2_0_0_dll_d000453.o
    libgobject_2_0_0_dll_d000451.o
    libgobject_2_0_0_dll_d000449.o
    libgobject_2_0_0_dll_d000448.o
    libgobject_2_0_0_dll_d000446.o
    libgobject_2_0_0_dll_d000445.o
    libgobject_2_0_0_dll_d000443.o
    libgobject_2_0_0_dll_d000442.o
    libgobject_2_0_0_dll_d000441.o
    libgobject_2_0_0_dll_d000440.o
    libgobject_2_0_0_dll_d000439.o
    libgobject_2_0_0_dll_d000436.o
    libgobject_2_0_0_dll_d000435.o
    libgobject_2_0_0_dll_d000434.o
    libgobject_2_0_0_dll_d000432.o
    libgobject_2_0_0_dll_d000431.o
    libgobject_2_0_0_dll_d000430.o
    libgobject_2_0_0_dll_d000429.o
    libgobject_2_0_0_dll_d000428.o
    libgobject_2_0_0_dll_d000427.o
    libgobject_2_0_0_dll_d000426.o
    libgobject_2_0_0_dll_d000425.o
    libgobject_2_0_0_dll_d000424.o
    libgobject_2_0_0_dll_d000423.o
    libgobject_2_0_0_dll_d000422.o
    libgobject_2_0_0_dll_d000421.o
    libgobject_2_0_0_dll_d000420.o
    libgobject_2_0_0_dll_d000419.o
    libgobject_2_0_0_dll_d000418.o
    libgobject_2_0_0_dll_d000417.o
    libgobject_2_0_0_dll_d000416.o
    libgobject_2_0_0_dll_d000415.o
    libgobject_2_0_0_dll_d000414.o
    libgobject_2_0_0_dll_d000413.o
    libgobject_2_0_0_dll_d000412.o
    libgobject_2_0_0_dll_d000411.o
    libgobject_2_0_0_dll_d000410.o
    libgobject_2_0_0_dll_d000409.o
    libgobject_2_0_0_dll_d000408.o
    libgobject_2_0_0_dll_d000407.o
    libgobject_2_0_0_dll_d000406.o
    libgobject_2_0_0_dll_d000405.o
    libgobject_2_0_0_dll_d000403.o
    libgobject_2_0_0_dll_d000402.o
    libgobject_2_0_0_dll_d000401.o
    libgobject_2_0_0_dll_d000385.o
    ...
    libgobject_2_0_0_dll_d000036.o
    libgobject_2_0_0_dll_d000035.o
    libgobject_2_0_0_dll_d000034.o
    libgobject_2_0_0_dll_d000033.o
    libgobject_2_0_0_dll_d000032.o
    libgobject_2_0_0_dll_d000030.o
    libgobject_2_0_0_dll_d000029.o
    libgobject_2_0_0_dll_d000028.o
    libgobject_2_0_0_dll_d000027.o
    libgobject_2_0_0_dll_d000013.o
    libgobject_2_0_0_dll_d000431.o <-- CRASH!

As you can see, the crash happens when executing _bfd_coff_link_input_bfd() the first time with a repeated input_bfd->filename. There we use already freed data.
Comment 9 Luca Bacci 2022-07-30 13:20:32 UTC
Created attachment 14245
Log of all the calls to _bfd_coff_link_input_bfd
Comment 10 Luca Bacci 2022-07-30 13:27:54 UTC
The code that repeatedly calls _bfd_coff_link_input_bfd() is in _bfd_coff_final_link(): https://github.com/bminor/binutils-gdb/blob/binutils-2_38/bfd/cofflink.c#L856

    for (o = abfd->sections; o != NULL; o = o->next)
      {
        for (p = o->map_head.link_order; p != NULL; p = p->next)
          {
            if (p->type == bfd_indirect_link_order
                && bfd_family_coff (p->u.indirect.section->owner))
              {
                sub = p->u.indirect.section->owner;
                if (! bfd_coff_link_output_has_begun (sub, & flaginfo))
                  {
                    if (! _bfd_coff_link_input_bfd (&flaginfo, sub))
                      goto error_return;
                    sub->output_has_begun = true;
                  }
              }
            else if (p->type == bfd_section_reloc_link_order
                     || p->type == bfd_symbol_reloc_link_order)
              {
                if (! _bfd_coff_reloc_link_order (abfd, &flaginfo, o, p))
                  goto error_return;
              }
            else
              {
                if (! _bfd_default_link_order (abfd, info, o, p))
                  goto error_return;
              }
          }
      }

But I don't quite know where the needed memory is freed. I see that hash maps are used; could it be that some of them are indexed by filename?
Comment 11 Luca Bacci 2022-07-30 14:54:12 UTC
Ok, I can now reproduce the issue on Linux

1. Download the binutils-issue.zip archive: https://drive.google.com/file/d/17vAIVbxtBsubC0yjYiZClYejEwDTqJkA/view?usp=sharing
2. Extract the archive
3. Open a terminal and cd into the extracted folder
4. Launch the following command:

x86_64-w64-mingw32-ld.bfd -o compose-parse subprojects/gtk/gtk/compose/compose-parse.exe.p/compose-parse.c.obj --allow-shlib-undefined --start-group subprojects/{gtk/gtk/libgtk.a,gtk/gtk/css/libgtk_css.a,glib/glib/libglib-2.0.dll.a,glib/gobject/libgobject-2.0.dll.a,glib/gio/libgio-2.0.dll.a,glib/gmodule/libgmodule-2.0.dll.a,gtk/gdk/libgdk.a,gtk/gdk/win32/libgdk-win32.a,gtk/gsk/libgsk.a,gtk/gsk/libgsk_f16c.a} -Bsymbolic d/msys64/mingw64/lib/libintl.a d/msys64/mingw64/lib/{libpangocairo-1.0.dll.a,libpango-1.0.dll.a,libgobject-2.0.dll.a,libglib-2.0.dll.a,libintl.dll.a,libharfbuzz.dll.a,libcairo.dll.a,libfribidi.dll.a,libcairo-gobject.dll.a,libgdk_pixbuf-2.0.dll.a,libepoxy.dll.a} d/msys64/mingw64/lib/libm.a d/msys64/mingw64/lib/{libgraphene-1.0.dll.a,libpangowin32-1.0.dll.a} d/msys64/mingw64/lib/{libadvapi32.a,libcomctl32.a,libcrypt32.a,libdwmapi.a,libimm32.a,libsetupapi.a,libwinmm.a} d/msys64/mingw64/lib/{libpangoft2-1.0.dll.a,/libfontconfig.dll.a,libfreetype.dll.a} d/msys64/mingw64/lib/libintl.a d/msys64/mingw64/lib/{libpng16.dll.a,libz.dll.a,libtiff.dll.a,libjpeg.dll.a} d/msys64/mingw64/lib/libhid.a d/msys64/mingw64/lib/libcairo-script-interpreter.dll.a d/msys64/mingw64/lib/{libintl.a,libadvapi32.a,libcomctl32.a,libcrypt32.a,libdwmapi.a,libimm32.a,libsetupapi.a,libwinmm.a,libintl.a,libhid.a,libkernel32.a,libuser32.a,libgdi32.a,libwinspool.a,libshell32.a,libole32.a,liboleaut32.a,libuuid.a,libcomdlg32.a} "--end-group" d/msys64/mingw64/lib/gcc/x86_64-w64-mingw32/12.1.0/{crtbegin.o,libgcc.a,libgcc_eh.a} d/msys64/mingw64/lib/{crt2.o,default-manifest.o,libmingw32.a,libmingwex.a,libmoldname.a,libmsvcrt.a} --subsystem=console

It doesn't complete due to a bunch of missing symbols, but that's not relevant. Now, with the help of gdb, break into _bfd_coff_link_input_bfd() when input_bfd->filename == "libgobject_2_0_0_dll_d000431.o". That will happen twice: the first time everything is normal; the second time you will see that flaginfo points to invalid memory (print *flaginfo).
Comment 12 Luca Bacci 2022-07-30 15:00:23 UTC
Clarification: some fields of flaginfo point to invalid memory.
Comment 13 Alan Modra 2022-07-31 03:55:52 UTC
I don't see anything unusual when I link comment #11 objects.  Sure, the same file name is linked twice, but one is from subprojects/glib/gobject/libgobject-2.0.dll.a (defining g_value_init_from_instance) and the other is from d/msys64/mingw64/lib/libgobject-2.0.dll.a (defining g_value_register_transform_func).  valgrind tells me freed memory is not accessed.  The fact that some field of flaginfo points to memory that can't be accessed is nothing to get excited about: some fields point to structures that contain unions.
I don't see the assertion fail.
Comment 14 Luca Bacci 2022-08-01 16:13:12 UTC
True! I'm going to investigate more...
Comment 15 Luca Bacci 2022-08-01 16:55:15 UTC
Uh...how strange!

One of the very first things that _bfd_coff_link_input_bfd() does is call obj_coff_external_syms() which seeks into the lib file and reads the external symbols table.

When _bfd_coff_link_input_bfd() is run with input_bfd->filename == "libgobject_2_0_0_dll_d000431.o" for the second time, obj_coff_external_syms() is executed, which calls cache_bseek() (then _bfd_real_fseek(), then fseeko64()) with an offset of 73060, on both Linux and MSYS2.

However, on Linux the read buffer is:

    (gdb) p abfd->where
    $19 = 73060
    (gdb) p result
    $20 = 0
    (gdb) n
    405     in /usr/src/debug/binutils-2.38/bfd/bfdio.c
    (gdb) 
    _bfd_coff_get_external_symbols (abfd=0x55555bc67750) at /usr/src/debug/binutils-2.38/bfd/coffgen.c:1692
    1692    /usr/src/debug/binutils-2.38/bfd/coffgen.c: No such file or directory.
    (gdb) n
    1694    in /usr/src/debug/binutils-2.38/bfd/coffgen.c
    (gdb) n
    1695    in /usr/src/debug/binutils-2.38/bfd/coffgen.c
    (gdb) p syms
    $21 = (void *) 0x555561e3c000
    (gdb) x /150db syms
    0x555561e3c000: 46      116     101     120     116     0       0       0
    0x555561e3c008: 0       0       0       0       1       0       0       0
    0x555561e3c010: 3       0       46      105     100     97      116     97
    0x555561e3c018: 36      55      0       0       0       0       2       0
    0x555561e3c020: 0       0       3       0       46      105     100     97
    0x555561e3c028: 116     97      36      53      0       0       0       0
    0x555561e3c030: 3       0       0       0       3       0       46      105
    0x555561e3c038: 100     97      116     97      36      52      0       0
    0x555561e3c040: 0       0       4       0       0       0       3       0
    0x555561e3c048: 46      105     100     97      116     97      36      54
    0x555561e3c050: 0       0       0       0       5       0       0       0
    0x555561e3c058: 3       0       0       0       0       0       4       0
    0x555561e3c060: 0       0       0       0       0       0       1       0
    0x555561e3c068: 0       0       2       0       0       0       0       0
    0x555561e3c070: 36      0       0       0       0       0       0       0
    0x555561e3c078: 3       0       0       0       2       0       0       0
    0x555561e3c080: 0       0       74      0       0       0       0       0
    0x555561e3c088: 0       0       0       0       0       0       2       0
    0x555561e3c090: 0       0       0       0       0       0
    (gdb) 

While on MSYS2 the read buffer is:

    _bfd_coff_get_external_symbols (abfd=0x1d47d7c2710) at ../../binutils-gdb/bfd/coffgen.c:1596
    1596      if (bfd_seek (abfd, obj_sym_filepos (abfd), SEEK_SET) != 0)
    (gdb)
    1598      syms = _bfd_malloc_and_read (abfd, size, size);
    (gdb)
    1599      obj_coff_external_syms (abfd) = syms;
    (gdb) p size
    $9 = 144
    (gdb) x /150db syms
    0x1d40853a550:  48      1       0       0       0       0       0       0
    0x1d40853a558:  1       0       0       0       0       0       48      -64
    0x1d40853a560:  46      105     100     97      116     97      36      52
    0x1d40853a568:  0       0       0       0       0       0       0       0
    0x1d40853a570:  8       0       0       0       -16     0       0       0
    0x1d40853a578:  58      1       0       0       0       0       0       0
    0x1d40853a580:  1       0       0       0       0       0       48      -64
    0x1d40853a588:  46      105     100     97      116     97      36      54
    0x1d40853a590:  0       0       0       0       0       0       0       0
    0x1d40853a598:  36      0       0       0       -8      0       0       0
    0x1d40853a5a0:  0       0       0       0       0       0       0       0
    0x1d40853a5a8:  0       0       0       0       0       0       48      -64
    0x1d40853a5b0:  -1      37      0       0       0       0       -112    -112
    0x1d40853a5b8:  0       0       0       0       0       0       0       0
    0x1d40853a5c0:  0       0       0       0       0       0       0       0
    0x1d40853a5c8:  0       0       0       0       -88     1       103     95
    0x1d40853a5d0:  118     97      108     117     101     95      114     101
    0x1d40853a5d8:  103     105     115     116     101     114     95      116
    0x1d40853a5e0:  -85     -85     -85     -85     -85     -85

Using a hex editor we can see that on Linux the FILE* refers to d/msys64/mingw64/lib/libgobject-2.0.dll.a (which is right), whereas on MSYS2 the FILE* refers to subprojects/glib/gobject/libgobject-2.0.dll.a (which is not right).
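
The two dumps above can be compared record by record. Here is a minimal sketch, assuming the standard 18-byte little-endian external COFF symbol layout (8-byte name, 4-byte value, 2-byte section number, 2-byte type, 1-byte storage class, 1-byte aux count), which agrees with isymesz = 18 in the backtrace of comment 5. The struct and function names are hypothetical, and this is not BFD's own swap-in routine.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical decoded form of one 18-byte external COFF symbol.  */
struct sketch_coff_sym
{
  char name[9];     /* inline short name, NUL-terminated copy */
  uint32_t value;
  int16_t scnum;
  uint16_t type;
  uint8_t sclass;
  uint8_t numaux;
};

/* Decode one little-endian symbol record from RAW (18 bytes).  Long
   names stored in the string table are not resolved here.  */
static struct sketch_coff_sym
sketch_decode_sym (const unsigned char *raw)
{
  struct sketch_coff_sym s;

  assert (raw != NULL);
  memcpy (s.name, raw, 8);
  s.name[8] = '\0';
  s.value = (uint32_t) raw[8] | (uint32_t) raw[9] << 8
            | (uint32_t) raw[10] << 16 | (uint32_t) raw[11] << 24;
  s.scnum = (int16_t) (raw[12] | raw[13] << 8);
  s.type = (uint16_t) (raw[14] | raw[15] << 8);
  s.sclass = raw[16];
  s.numaux = raw[17];
  return s;
}

/* First record of each dump; gdb's signed -64 is the byte 0xc0.  */
static const unsigned char linux_first[18] =
  { 46, 116, 101, 120, 116, 0, 0, 0,  0, 0, 0, 0,  1, 0,  0, 0,  3,  0 };
static const unsigned char msys2_first[18] =
  { 48, 1, 0, 0, 0, 0, 0, 0,  1, 0, 0, 0,  0, 0,  48, 0xc0,  46,  105 };
```

Decoding linux_first gives the section symbol ".text" in section 1 with storage class 3. Decoding msys2_first gives a symbol named "0\001" with value 1, section 0, type 49200, storage class 46 and 105 aux entries, which are the same values gdb printed for *syment in comment 7. In other words, the buffer on MSYS2 is not a symbol table at all.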
Comment 16 Luca Bacci 2022-08-01 21:04:39 UTC
On MSYS2, bfd/cache.c:bfd_cache_max_open() returns 10, while on Linux it returns 128 (because HAVE_GETRLIMIT is defined)

If I force the cache size on MSYS2 to a higher value, e.g. 500, then the issue goes away. On the contrary, if I set the cache size to 10 on Linux, then I can successfully reproduce the issue (using the instructions in https://sourceware.org/bugzilla/show_bug.cgi?id=29389#c11).

Hope it helps!
Thanks,
Luca
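
The two figures in this comment are consistent with a cap derived from the per-process descriptor limit. Here is a minimal sketch, assuming the cap is one eighth of the soft limit with a floor of 10, and 10 when no limit can be queried. The function name is hypothetical and the code is an illustration of that assumption, not a copy of bfd/cache.c.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical helper: derive a cap on simultaneously open files from
   a soft descriptor limit.  Pass a negative value when the host offers
   no way to query the limit.  */
static int
sketch_cache_max_open (long soft_limit)
{
  long max;

  if (soft_limit < 0)
    max = 10;               /* assumed fallback when no limit is known */
  else
    max = soft_limit / 8;   /* assumed share left to the file cache */

  if (max < 10)
    max = 10;               /* never go below the fallback */
  if (max > INT_MAX)
    max = INT_MAX;
  assert (max >= 10);
  return (int) max;
}
```

With a soft limit of 1024 this yields 128, and with no queryable limit it yields 10, matching the values reported for Linux and MSYS2. On a POSIX host the input would come from getrlimit() with RLIMIT_NOFILE; what the Windows equivalent should be is left open here.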
Comment 17 Luca Bacci 2022-08-01 21:06:17 UTC
Beyond that, would you accept a patch to increase the cache size on Windows? It could help improve performance.
Comment 18 Alan Modra 2022-08-02 00:40:29 UTC
(In reply to Luca Bacci from comment #16)
> If I force the cache size on MSYS2 to a higher value, e.g. 500, then the
> issue goes away. On the contrary, if I set the cache size to 10 on Linux,
> then I can successfully reproduce the issue
Ah ha! That's good detective work!  Now that it also reproduces for me, I should be able to find out why the wrong file is being reopened.

> would you accept a patch to increase the cache size on Windows?
Yes, host specific code to find the resource limit would be nice.
Comment 19 Alan Modra 2022-08-02 01:45:48 UTC
This code in ld/emultempl/pe.em is the cause.  (also in pep.em)

			/* Rename this implib to match the other one.  */
			if (!bfd_set_filename (is->the_bfd->my_archive,
					       other_bfd_filename))

You just can't do that when the filename is needed to reopen closed files.
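
Why this breaks can be shown without BFD at all. Here is a minimal sketch, assuming a handle that keeps only a filename while its FILE * is closed, which is roughly what a file cache has to rely on when it reopens. All names, including the two scratch file names, are hypothetical.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-in for a cached file handle.  */
struct sketch_handle
{
  const char *filename;   /* used to reopen after eviction */
  FILE *fp;               /* NULL while evicted from the cache */
};

static void
sketch_write_file (const char *name, char byte)
{
  FILE *f = fopen (name, "wb");

  assert (f != NULL);
  fputc (byte, f);
  fclose (f);
}

/* Return the first byte seen through a handle that was opened on
   "sketch_system.tmp", evicted, renamed to match "sketch_local.tmp",
   and then reopened by name.  */
static int
sketch_reopen_after_rename (void)
{
  struct sketch_handle h;
  int byte;

  sketch_write_file ("sketch_local.tmp", 'L');
  sketch_write_file ("sketch_system.tmp", 'S');

  h.filename = "sketch_system.tmp";
  h.fp = fopen (h.filename, "rb");
  assert (h.fp != NULL);

  fclose (h.fp);                     /* eviction: only the name survives */
  h.fp = NULL;

  h.filename = "sketch_local.tmp";   /* the rename to match the other file */

  h.fp = fopen (h.filename, "rb");   /* reopen on next access */
  assert (h.fp != NULL);
  byte = fgetc (h.fp);
  fclose (h.fp);

  remove ("sketch_local.tmp");
  remove ("sketch_system.tmp");
  return byte;
}
```

The function returns 'L', the content of the other file. The handle silently reads the wrong archive after the reopen, which is the behaviour comment 15 observed on MSYS2 once the small cache forced a close.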
Comment 20 Alan Modra 2022-08-02 04:09:43 UTC
Removing myself as assignee for this bug.  I just don't know enough about the PE linker support to fix this.  Clearly pe.em/pep.em can't write to the filename of an archive or object file (it can to an element of a non-thin archive!), so the code that is setting filename will need to set a new field in pe_tdata and that field used instead of the filename for mumble mumble mumble.  Yes, I don't know more than what the comment tells me, that this is done for import tables.
Comment 21 Nick Clifton 2022-08-02 11:05:36 UTC
Created attachment 14250
Proposed Patch

Hi Luca,

  Please could you try out the uploaded patch which *might* fix the problem.

  The patch does a few things.  Firstly it "fixes" bfd_set_filename() so that
  it will fail if the bfd has been closed by file caching.  Plus if the bfd
  is currently open, it will mark it as uncacheable, so that a future cache
  close/reopen sequence will not run into the problem you have encountered.

  Next the patch updates the calls to bfd_set_filename in the PE specific
  code, so that if they fail a slightly more helpful error message is
  generated.  (It is not really enough to help a user who is not familiar
  with the linker however.  Any suggestions for better wording would be
  gratefully appreciated).

  Finally it adds an (unused) call to bfd_stat() just before the call to
  bfd_set_filename(), so that the file will be reopened if it has been
  closed by the cache.  (Arguably it would be better if bfd_set_filename()
  just re-opened the file instead, rather than failing the rename.  That
  would eliminate the need for a bogus stat() call).

  Anyway, give it a whirl and see how it goes.  I am still testing the
  patch locally, so I may uncover some problems myself as well.

Cheers
  Nick
Comment 22 Luca Bacci 2022-08-02 13:59:23 UTC
Yes, I confirm that it's working now!

Luca
Comment 23 Alan Modra 2022-08-02 22:36:16 UTC
Nick, the reason why I couldn't figure out how the changed filename was being used by the PE support code is because it is no longer used.  A patch of mine, commit 678dc756a57, broke it.  Well, broke it some more.  I'll see about fixing the mess.
Comment 24 Alan Modra 2022-08-03 01:46:46 UTC
(In reply to Alan Modra from comment #23)
> Nick, the reason why I couldn't figure out how the changed filename was
> being used by the PE support code is because it is no longer used.  A patch
> of mine, commit 678dc756a57 broke it.  Well, broke it some more.  I'll see
> about fixing the mess.

Actually, no, the place where I thought the filename was escaping to is too late.  But I did figure out where the changed filenames are used, in the PE script!  SORT(*)(.idata$2) and similar sort by filename.

I think we can get the required sorting without messing with the filename.
Comment 25 Alan Modra 2022-08-03 06:44:41 UTC
Created attachment 14252 [details]
Alternative patch

What do you think of this one, Nick?  It's also possible to stuff the new sort key into a new field of tdata.pe_obj_data for objects and tdata.aout_ar_data->tdata for archives, but that's rather more messy.
Comment 26 Nick Clifton 2022-08-03 08:36:04 UTC
(In reply to Alan Modra from comment #25)
Hi Alan,

> What do you think of this one, Nick? 

I like it.  It is definitely a more robust solution to the original problem.

> It's also possible to stuff the new
> sort key into a new field of tdata.pe_obj_data for objects and
> tdata.aout_ar_data->tdata for archives, but that's rather more messy.

Agreed.

I also think that we should keep part of my original patch - the part that makes bfd_set_filename() fail if the bfd has been closed by the file cache, and which also clears the cacheable flag on bfds which are renamed.  Do you agree ?

Cheers
  Nick
Comment 27 Alan Modra 2022-08-03 09:27:26 UTC
(In reply to Nick Clifton from comment #26)
> I also think that we should keep part of my original patch - the part that
> makes bfd_set_filename() fail if the bfd has been closed by the file cache,
> and which also clears the cacheable flag on bfds which are renamed.  Do you
> agree ?

Yes, but have you checked uses of bfd_set_filename in gdb?  There are some that looked suspicious to me, for example, gdb/solib-darwin.c:darwin_bfd_open, but then on closer inspection I think they will be OK.
Comment 28 cvs-commit@gcc.gnu.org 2022-08-03 12:33:48 UTC
The master branch has been updated by Nick Clifton <nickc@sourceware.org>:

https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;h=a6ad7914429a22d3d835bd998b032212b776a08a

commit a6ad7914429a22d3d835bd998b032212b776a08a
Author: Alan Modra <amodra@gmail.com>
Date:   Wed Aug 3 13:31:57 2022 +0100

    Fix a conflict between the linker's need to rename some PE format input libraries and the BFD library's file caching mechanism.
    
            PR 29389
    bfd     * bfd.c (BFD_CLOSED_BY_CACHE): New bfd flag.
            * cache.c (bfd_cache_delete): Set BFD_CLOSED_BY_DELETE on the
            closed bfd.
            (bfd_cache_lookup_worker): Clear BFD_CLOSED_BY_DELETE on the newly
            reopened bfd.
            * opncls.c (bfd_set_filename): Refuse to change the name of a bfd
            that has been closed by bfd_cache_delete.  Mark changed bfds as
            uncacheable.
            * bfd-in2.h: Regenerate.
    
    ld      * ldlang.h (lang_input_statement_struct): Add sort_key field.
            * emultempl/pe.em (after_open): If multiple import libraries refer
            to the same bfd, store their names in the sort_key field.
            * emultempl/pep.em (after_open): Likewise.
            * ldlang.c (sort_filename): New function.  Returns the filename to
            be used when sorting input files.
            (wild_sort): Use the sort_filename function.
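
The sort-key approach in the ld entries above can be illustrated roughly as follows. This is a sketch under stated assumptions, not the committed code: `toy_input`, `toy_sort_name`, `toy_compare_inputs` and `toy_sort_inputs` are hypothetical names, and the real lang_input_statement_struct carries many more fields.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Toy stand-in for a linker input statement (hypothetical). */
struct toy_input {
  const char *filename;  /* real name on disk; never rewritten */
  const char *sort_key;  /* optional name to sort under, or NULL */
};

/* Name to sort by: the override if one was stored, else the filename. */
static const char *toy_sort_name(const struct toy_input *in) {
  assert(in != NULL && in->filename != NULL);
  return in->sort_key != NULL ? in->sort_key : in->filename;
}

/* qsort comparator ordering inputs by their sort name. */
static int toy_compare_inputs(const void *a, const void *b) {
  const struct toy_input *x = a;
  const struct toy_input *y = b;
  return strcmp(toy_sort_name(x), toy_sort_name(y));
}

/* Sort an array of inputs in place by sort name. */
static void toy_sort_inputs(struct toy_input *inputs, size_t count) {
  qsort(inputs, count, sizeof inputs[0], toy_compare_inputs);
}
```

Because the filename itself is left alone, a cache that reopens files by name keeps working, while inputs that share a sort key still end up adjacent after sorting.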
Comment 29 Nick Clifton 2022-08-03 12:34:42 UTC
Right - patches applied.
Comment 30 Luca Bacci 2022-08-03 17:23:09 UTC
Thank you very much, Alan and Nick!

Is it safe to backport this patch to the 2_39 branch? Let me know, otherwise we can simply add a downstream patch in MSYS2.

Cheers,
Luca
Comment 31 Alan Modra 2022-08-04 05:44:08 UTC
(In reply to Luca Bacci from comment #30)
> Is it safe to backport this patch to the 2_39 branch?

You can probably answer that question better than I can, by testing the patch.  I am reasonably confident that my patch fixes a 20 year old bug without introducing new problems.  On the other hand, it's a 20 year old bug, so just on that alone there is not much urgency to release a fix.
Comment 32 Nick Clifton 2022-08-04 09:42:10 UTC
(In reply to Luca Bacci from comment #30)

> Is it safe to backport this patch to the 2_39 branch? Let me know, otherwise
> we can simply add a downstream patch in MSYS2.

I would prefer to leave the patch just on the mainline (and hence appearing in the 2.40 release).   We are so close to the 2.39 release date now, that I would prefer not to rock the boat...

Cheers
  Nick
Comment 33 Luca Bacci 2022-08-04 10:22:07 UTC
Alright, we already have a PR in MSYS2: https://github.com/msys2/MINGW-packages/pull/12426

Cheers!
Luca