Bug 28885

Summary: dlltool broke in 2.38
Product: binutils Reporter: Mikael Pettersson <mikpelinux>
Component: binutilsAssignee: Not yet assigned to anyone <unassigned>
Status: RESOLVED FIXED    
Severity: normal CC: bshanks, martin
Priority: P2    
Version: 2.38   
Target Milestone: ---   
Host: Target:
Build: Last reconfirmed:
Attachments: Potential fix

Description Mikael Pettersson 2022-02-12 15:27:37 UTC
With binutils-2.38 in a cross to x86_64-w64-mingw32 I persistently see random breakage in dlltool during the build of mingw-w64's "crt".

Example 1:
...
x86_64-w64-mingw32-dlltool --as-flags=--64 -m i386:x86-64 -k --as=x86_64-w64-mingw32-as --output-lib lib64/libd3dcompiler_33.a  --input-def /tmp/mingw-w64-v9.0.0/mingw-w64-crt/lib64/d3dcompiler_33.def
x86_64-w64-mingw32-dlltool --as-flags=--64 -m i386:x86-64 -k --as=x86_64-w64-mingw32-as --output-lib lib64/libd3dcompiler_34.a  --input-def /tmp/mingw-w64-v9.0.0/mingw-w64-crt/lib64/d3dcompiler_34.def
x86_64-w64-mingw32-dlltool --as-flags=--64 -m i386:x86-64 -k --as=x86_64-w64-mingw32-as --output-lib lib64/libd3dcompiler_35.a  --input-def /tmp/mingw-w64-v9.0.0/mingw-w64-crt/lib64/d3dcompiler_35.def
x86_64-w64-mingw32-dlltool --as-flags=--64 -m i386:x86-64 -k --as=x86_64-w64-mingw32-as --output-lib lib64/libd3dcompiler_36.a  --input-def /tmp/mingw-w64-v9.0.0/mingw-w64-crt/lib64/d3dcompiler_36.def
Assembler messages:
Error: can't open D3DCompiler_dll_t.s for reading: No such file or directory
x86_64-w64-mingw32-dlltool --as-flags=--64 -m i386:x86-64 -k --as=x86_64-w64-mingw32-as --output-lib lib64/libd3dcompiler_37.a  --input-def /tmp/mingw-w64-v9.0.0/mingw-w64-crt/lib64/d3dcompiler_37.def
x86_64-w64-mingw32-dlltool: x86_64-w64-mingw32-as exited with status 1
x86_64-w64-mingw32-dlltool: failed to open temporary tail file: D3DCompiler_dll_t.o: No such file or directory
x86_64-w64-mingw32-dlltool: cannot delete D3DCompiler_dll_s00000.o: No such file or directory
x86_64-w64-mingw32-dlltool: cannot delete D3DCompiler_dll_s00001.o: No such file or directory
x86_64-w64-mingw32-dlltool: cannot delete D3DCompiler_dll_s00002.o: No such file or directory
x86_64-w64-mingw32-dlltool: cannot delete D3DCompiler_dll_s00003.o: No such file or directory
x86_64-w64-mingw32-dlltool: cannot delete D3DCompiler_dll_s00004.o: No such file or directory
x86_64-w64-mingw32-dlltool: cannot delete D3DCompiler_dll_s00005.o: No such file or directory
x86_64-w64-mingw32-dlltool: cannot delete D3DCompiler_dll_s00006.o: No such file or directory
x86_64-w64-mingw32-dlltool: cannot delete D3DCompiler_dll_s00007.o: No such file or directory
x86_64-w64-mingw32-dlltool: cannot delete D3DCompiler_dll_s00008.o: No such file or directory
x86_64-w64-mingw32-dlltool: cannot delete D3DCompiler_dll_s00009.o: No such file or directory
make[1]: *** [Makefile:83854: lib64/libd3dcompiler_36.a] Error 1
make[1]: *** Waiting for unfinished jobs....
x86_64-w64-mingw32-dlltool: cannot delete D3DCompiler_dll_s00000.o: No such file or directory
x86_64-w64-mingw32-dlltool: cannot delete D3DCompiler_dll_s00001.o: No such file or directory
x86_64-w64-mingw32-dlltool: cannot delete D3DCompiler_dll_s00002.o: No such file or directory
x86_64-w64-mingw32-dlltool: cannot delete D3DCompiler_dll_s00003.o: No such file or directory
x86_64-w64-mingw32-dlltool: cannot delete D3DCompiler_dll_s00004.o: No such file or directory
x86_64-w64-mingw32-dlltool: cannot delete D3DCompiler_dll_s00005.o: No such file or directory
x86_64-w64-mingw32-dlltool: cannot delete D3DCompiler_dll_s00006.o: No such file or directory
x86_64-w64-mingw32-dlltool: cannot delete D3DCompiler_dll_s00007.o: No such file or directory
x86_64-w64-mingw32-dlltool: cannot delete D3DCompiler_dll_s00008.o: No such file or directory
x86_64-w64-mingw32-dlltool: cannot delete D3DCompiler_dll_s00009.o: No such file or directory

Example 2:
...
x86_64-w64-mingw32-dlltool --as-flags=--64 -m i386:x86-64 -k --as=x86_64-w64-mingw32-as --output-lib lib64/libd3dx9.a  --input-def /tmp/mingw-w64-v9.0.0/mingw-w64-crt/lib64/d3dx9_43.def
x86_64-w64-mingw32-dlltool --as-flags=--64 -m i386:x86-64 -k --as=x86_64-w64-mingw32-as --output-lib lib64/libd3dx10.a  --input-def /tmp/mingw-w64-v9.0.0/mingw-w64-crt/lib64/d3dx10_43.def
x86_64-w64-mingw32-dlltool: lib32/libd3dx9.a: error reading d3dx9_43_dll_h.o: file truncated
make[1]: *** [Makefile:83819: lib32/libd3dx9.a] Error 1
make[1]: *** Waiting for unfinished jobs....
x86_64-w64-mingw32-dlltool: lib64/libd3dx9.a: error reading d3dx9_43_dll_t.o: No such file or directory

I've also seen it produce .a files that ar complained about containing invalid or truncated files.

These random failures occur persistently when using binutils-2.38 and parallel make (make -jN) for mingw-w64's crt. The failures disappear if I use non-parallel make, or revert to binutils-2.37.

A git bisect identified this late change in 2.38 development as the cause:

# first bad commit: [fdeee5d59dca41e3c70c399a939105e39a4b4282] Allow inferring tmp_prefix from the dll name from a def file

which makes sense since it touches dlltool.c.

Host: x86_64-pc-linux-gnu (Fedora 34)

Steps for building cross to mingw-w64 (each in a separate build dir):

/tmp/binutils-2.38/configure --target=x86_64-w64-mingw32 --enable-targets=x86_64-w64-mingw32,i686-w64-mingw32 --prefix=/tmp/cross-mingw64 --with-sysroot=/tmp/cross-mingw64 --disable-gdb --disable-gold --disable-nls --disable-plugins --disable-readline --disable-sim; make -j; make install
/tmp/mingw-w64-v9.0.0/mingw-w64-headers/configure --prefix=/tmp/cross-mingw64/x86_64-w64-mingw32 --host=x86_64-w64-mingw32; make install; ln -sf x86_64-w64-mingw32 /tmp/cross-mingw64/mingw
/tmp/gcc-11.2.0/configure --target=x86_64-w64-mingw32 --prefix=/tmp/cross-mingw64 --with-sysroot=/tmp/cross-mingw64 --disable-libatomic --disable-libgomp --disable-libitm --disable-libmpx --disable-libquadmath --disable-libsanitizer --disable-lto --disable-nls --disable-plugin --disable-shared --enable-checking=release --enable-multilib --enable-64bit --with-dwarf --enable-threads=win32 --enable-languages=c; make -j all-gcc; make install-gcc
/tmp/mingw-w64-v9.0.0/mingw-w64-crt/configure --prefix=/tmp/cross-mingw64/x86_64-w64-mingw32 --host=x86_64-w64-mingw32 --enable-lib32 --enable-lib64; make -j
(it's the last step above that fails in dlltool)
(to be followed by install of crt and rebuild of full gcc)
Comment 1 Brendan Shanks 2022-03-10 22:12:22 UTC
I think the problem is that multiple instances of dlltool may use the same temporary file names depending on the files being built.

With fdeee5d59dca41e3c70c399a939105e39a4b4282, the temporary file prefix will be taken from the .def file, and the .def files may not have unique names. For example, in mingw-w64-crt/lib64, d3dcompiler_{33,34,35,36}.def all have the same LIBRARY name inside: "D3DCompiler.dll" [1].

I've also seen this bug when building Wine (on both Linux and macOS), where multiple dlltools may be launched at the same time to build delay-load and cross libraries for the same DLL (using the same .spec as input). 

I'm not sure of the best fix for this: maybe add a command-line option to use a deterministic tmp_prefix, with the caveat that parallel builds may fail?

(As an aside, I'm not sure why one would want deterministic tmp file names. Maybe for reproducible builds?)


[1]: https://sourceforge.net/p/mingw-w64/mingw-w64/ci/master/tree/mingw-w64-crt/lib64/d3dcompiler_33.def
Comment 2 Martin Storsjö 2022-03-11 08:28:30 UTC
(In reply to Brendan Shanks from comment #1)
> With fdeee5d59dca41e3c70c399a939105e39a4b4282, the temporary file prefix
> will be taken from the .def file, and the .def files may not have unique
> names. For example, in mingw-w64-crt/lib64, d3dcompiler_{33,34,35,36}.def
> all have the same LIBRARY name inside: "D3DCompiler.dll" [1].

Sorry about this - I really didn't forsee that this would be an issue.

> I've also seen this bug when building Wine (on both Linux and macOS), where
> multiple dlltools may be launched at the same time to build delay-load and
> cross libraries for the same DLL (using the same .spec as input). 

In wine, what cases of multiple parallel import libraries targeting the same DLL name?

> I'm not sure of the best fix for this: maybe add a command-line option to
> use a deterministic tmp_prefix, with the caveat that parallel builds may
> fail?

That doesn't sound very ideal to me. Ideally it would do the right thing, deterministically, out of the box, without conflict.

If the issue is that the deterministic temp file names can clash when running multiple dlltool invocations in parallel, should we perhaps create a separate temp directory (where the name can be nondeterministic), where we make sure we get a unique directory that nobody else is using, then write all temp files within that one?

Or should we make the deterministic temp prefix based on something else, e.g. the ouput (path+)filename? That should be unique even with multiple jobs running in parallel, if they're running in the same directory. The temp prefix base might end up slightly longer, so it's maybe a little less elegant, but should be much safer against clashes.

> (As an aside, I'm not sure why one would want deterministic tmp file names.
> Maybe for reproducible builds?)

Msys2 had issues with the nondeterministic nature of dlltool. In most cases, the temp prefix doesn't matter - but for umbrella libraries like libucrt.a, which is merged from a number of individual import libraries like this, https://github.com/mingw-w64/mingw-w64/blob/master/mingw-w64-crt/lib-common/ucrt.mri, it matters: The temp prefix for all the individual libraries making up libucrt.a must be unique, otherwise ld.bfd mixes up the object files for those libraries, producing intermixed import tables for those DLLs. See https://github.com/msys2/MINGW-packages/issues/9363#issuecomment-899100856 for the root cause.

We worked around this with https://github.com/mingw-w64/mingw-w64/commit/0ad299826ca14987fd53664c1047f4dfe848c3f8, which adds the --temp-prefix option (based on the output import library name, which should be unique) if the option is supported.

We also noticed that Debian had been using the --temp-prefix option in this way to make their builds reproducible: https://salsa.debian.org/mingw-w64-team/mingw-w64/-/blob/master/debian/patches/dlltool-temp-prefix.patch

Based on this, I thought it'd be good if we could make dlltool fix both of these issues out of the box, so that users of it don't need to take extra steps to both achieve reproducibility and avoid the intermixed import tables.

(Also, if building a newer version of mingw-w64 that contains the commit above, which explicitly sets the --temp-prefx option, I think this bug wouldn't be noticed at all. So, good that you've caught it at least!)
Comment 3 Martin Storsjö 2022-03-11 08:38:55 UTC
Created attachment 14013 [details]
Potential fix

Here's a pretty straightforward patch that uses the output filename as basis for the temp prefix, which I think should avoid this issue (if there's no other aspect I'm overlooking). Does this fix the issue for you?
Comment 4 Brendan Shanks 2022-03-11 18:55:54 UTC
(In reply to Martin Storsjö from comment #3)
> Created attachment 14013 [details]
> Potential fix
> 
> Here's a pretty straightforward patch that uses the output filename as basis
> for the temp prefix, which I think should avoid this issue (if there's no
> other aspect I'm overlooking). Does this fix the issue for you?

Thanks Martin, that worked for me for a Wine build. Mikael, can you try building mingw-w64's crt?
Comment 5 Martin Storsjö 2022-03-11 20:23:05 UTC
(In reply to Brendan Shanks from comment #4)
> (In reply to Martin Storsjö from comment #3)
> > Created attachment 14013 [details]
> > Potential fix
> > 
> > Here's a pretty straightforward patch that uses the output filename as basis
> > for the temp prefix, which I think should avoid this issue (if there's no
> > other aspect I'm overlooking). Does this fix the issue for you?
> 
> Thanks Martin, that worked for me for a Wine build. Mikael, can you try
> building mingw-w64's crt?

I managed to reproduce the issue building mingw-w64-crt, and the patch does fix that case too. (I could also verify that building the latest git master version of mingw-w64-crt isn't affected by the bug, as those versions explicitly specify unique --temp-prefix strings.)

I'll go ahead and send the patch for review then.
Comment 6 Mikael Pettersson 2022-03-12 12:22:12 UTC
(In reply to Brendan Shanks from comment #4)
> (In reply to Martin Storsjö from comment #3)
> > Created attachment 14013 [details]
> > Potential fix
> > 
> > Here's a pretty straightforward patch that uses the output filename as basis
> > for the temp prefix, which I think should avoid this issue (if there's no
> > other aspect I'm overlooking). Does this fix the issue for you?
> 
> Thanks Martin, that worked for me for a Wine build. Mikael, can you try
> building mingw-w64's crt?

After applying the proposed fix on top of binutils-2.38 I can no longer reproduce the problem.
Comment 7 Sourceware Commits 2022-03-16 15:22:12 UTC
The master branch has been updated by Nick Clifton <nickc@sourceware.org>:

https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;h=d65c0ddddd85645cab6f11fd711d21638a74489f

commit d65c0ddddd85645cab6f11fd711d21638a74489f
Author: Martin Storsj? <martin@martin.st>
Date:   Wed Mar 16 15:01:30 2022 +0000

    dlltool: Use the output name as basis for deterministic temp prefixes
    
            PR 28885
            * dlltool.c (main): use imp_name rather than dll_name when
            generating a temporary file name.
Comment 8 Mikael Pettersson 2022-03-20 11:44:37 UTC
Fixed on master for 2.39, needs backport to binutils-2_38-branch.
Comment 9 Sourceware Commits 2022-03-23 11:40:18 UTC
The binutils-2_38-branch branch has been updated by Nick Clifton <nickc@sourceware.org>:

https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;h=99852365513266afdd793289813e8e565186c9e6

commit 99852365513266afdd793289813e8e565186c9e6
Author: Nick Clifton <nickc@redhat.com>
Date:   Wed Mar 23 11:39:49 2022 +0000

    dlltool: Use the output name as basis for deterministic temp prefixes
    
            PR 28885
            * dlltool.c (main): use imp_name rather than dll_name when
            generating a temporary file name.
Comment 10 Mikael Pettersson 2022-03-24 13:35:02 UTC
And now fixed on binutils-2_38-branch, closing.