This is a forward of a downstream report in Gentoo: https://bugs.gentoo.org/823780, although I've hit it myself. Building glibc-2.34 fails with the following error:
```
/usr/lib/gcc/x86_64-pc-linux-gnu/11.2.0/../../../../x86_64-pc-linux-gnu/bin/ld: /var/tmp/portage/sys-libs/glibc-2.34-r1/work/build-x86-x86_64-pc-linux-gnu-nptl/libc.a(inet_addr.o): TLS transition from R_386_TLS_GOTIE to R_386_TLS_LE_32 against `__libc_tsd_CTYPE_B' at 0xf4 in section `.text' failed
/usr/lib/gcc/x86_64-pc-linux-gnu/11.2.0/../../../../x86_64-pc-linux-gnu/bin/ld: final link failed: bad value
collect2: error: ld returned 1 exit status
make[2]: *** [../Rules:269: /var/tmp/portage/sys-libs/glibc-2.34-r1/work/build-x86-x86_64-pc-linux-gnu-nptl/elf/ldconfig] Error 1
```
In the bug, someone observed that it may be AVX512 related, which matches my experience (I can only make the build fail on a Tiger Lake system).
Created attachment 13787 [details] build.log (from user downstream)
Created attachment 13788 [details] build.log (from user downstream)
I can only reproduce this with GCC 11, not GCC 10. Both the latest binutils from git and binutils-2.37 (from near the tip of the stable branch, as of a few days ago) fail.
What does -march=native resolve to?
Pulling from the downstream bug report:
> -march=tigerlake - ERROR
> -march=cooperlake - ERROR
> -march=cascadelake - ERROR
> -march=cannonlake - ERROR
> -march=skylake-avx512 - ERROR
> -march=skylake - OK
And from my own failure, I noticed that this is hit during the -m32 build. I don't know whether that's a relevant detail, but it is a detail:
x86_64-pc-linux-gnu-gcc -m32 -march=native -pipe -O2 -Wl,-O1 -Wl,--as-needed -nostdlib -nostartfiles -static -o /var/tmp/portage/sys-libs/glibc-2.34-r2/work/build-x86-x86_64-pc-linux-gnu-nptl/elf/ldconfig -Wl,-O1 -Wl,--as-needed /var/tmp/portage/sys-libs/glibc-2.34-r2/work/build-x86-x86_64-pc-linux-gnu-nptl/csu/crt1.o /var/tmp/portage/sys-libs/glibc-2.34-r2/work/build-x86-x86_64-pc-linux-gnu-nptl/csu/crti.o `x86_64-pc-linux-gnu-gcc -m32 -march=native -pipe -O2 -Wl,-O1 -Wl,--as-needed --print-file-name=crtbeginT.o` /var/tmp/portage/sys-libs/glibc-2.34-r2/work/build-x86-x86_64-pc-linux-gnu-nptl/elf/ldconfig.o /var/tmp/portage/sys-libs/glibc-2.34-r2/work/build-x86-x86_64-pc-linux-gnu-nptl/elf/cache.o /var/tmp/portage/sys-libs/glibc-2.34-r2/work/build-x86-x86_64-pc-linux-gnu-nptl/elf/readlib.o /var/tmp/portage/sys-libs/glibc-2.34-r2/work/build-x86-x86_64-pc-linux-gnu-nptl/elf/xmalloc.o /var/tmp/portage/sys-libs/glibc-2.34-r2/work/build-x86-x86_64-pc-linux-gnu-nptl/elf/xstrdup.o /var/tmp/portage/sys-libs/glibc-2.34-r2/work/build-x86-x86_64-pc-linux-gnu-nptl/elf/chroot_canon.o /var/tmp/portage/sys-libs/glibc-2.34-r2/work/build-x86-x86_64-pc-linux-gnu-nptl/elf/static-stubs.o /var/tmp/portage/sys-libs/glibc-2.34-r2/work/build-x86-x86_64-pc-linux-gnu-nptl/elf/stringtable.o -Wl,-z,now -Wl,--start-group /var/tmp/portage/sys-libs/glibc-2.34-r2/work/build-x86-x86_64-pc-linux-gnu-nptl/libc.a -lgcc -Wl,--end-group `x86_64-pc-linux-gnu-gcc -m32 -march=native -pipe -O2 -Wl,-O1 -Wl,--as-needed --print-file-name=crtend.o` /var/tmp/portage/sys-libs/glibc-2.34-r2/work/build-x86-x86_64-pc-linux-gnu-nptl/csu/crtn.o
make[4]: Leaving directory '/var/tmp/portage/sys-libs/glibc-2.34-r2/work/glibc-2.34/nptl'
/usr/lib/gcc/x86_64-pc-linux-gnu/11.2.0/../../../../x86_64-pc-linux-gnu/bin/ld: /var/tmp/portage/sys-libs/glibc-2.34-r2/work/build-x86-x86_64-pc-linux-gnu-nptl/libc.a(inet_addr.o): TLS transition from R_386_TLS_GOTIE to R_386_TLS_LE_32 against `__libc_tsd_CTYPE_B' at 0xf4 in section `.text' failed
/usr/lib/gcc/x86_64-pc-linux-gnu/11.2.0/../../../../x86_64-pc-linux-gnu/bin/ld: final link failed: bad value
collect2: error: ld returned 1 exit status
The two related-ish recent changes both seem C++ related and not avx512-related. Hm. https://gcc.gnu.org/git/gitweb.cgi?p=gcc.git;h=4e62aca0e0520e4ed2532f2d8153581190621c1a https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=ee8f9ff00d79998274c967ad0c23692be9dd3ada
Wow, gcc is wild. It looks like it's converting that ctype table indexing into a kmovd through a mask register (even in 32-bit mode, woah):
    movsx eax, dl
    kmovd k0, ds:(__libc_tsd_CTYPE_B_tpoff - _GLOBAL_OFFSET_TABLE_)[ebx]
    kmovd edx, k0
    kmovd k0, dword ptr gs:[edx]
    kmovd edx, k0
    test byte ptr [edx+eax*2+1], 20h
    jz short loc_80000B0
Presumably when this happens, gcc isn't emitting the right type of relocation. It assumes it's local-exec instead of initial-exec, or something like that? Notably, that symbol is declared with `__attribute__ ((tls_model ("initial-exec")))`.
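For readers unfamiliar with the lookup being compiled here, the following is a minimal sketch of the same kind of ctype table indexing, written against glibc's public interface. Inside glibc the code reads `__libc_tsd_CTYPE_B` directly; from outside, the same per-thread table is reachable through `__ctype_b_loc()`. The function name `is_space_via_table` is hypothetical, and `_ISspace` is a glibc-specific constant, so this is assumed to be built against glibc.

```c
#include <ctype.h>
#include <stdbool.h>

/* Hypothetical helper: test the "space" class bit in the per-thread
   ctype table, the way glibc's isspace-style macros do.  The table is
   an array of unsigned short indexed by the (sign-extended) character,
   which is why the disassembly scales the index by 2. */
bool is_space_via_table(char c)
{
    /* __ctype_b_loc() returns the address of the thread-local table pointer. */
    const unsigned short *table = *__ctype_b_loc();
    return (table[(int) c] & _ISspace) != 0;
}
```

The single TLS load of the table pointer in this function corresponds to the instruction that carries the `@gotntpoff` relocation in the listing above.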
If you suspect a gcc bug, you should file it in the gcc.gnu.org bugzilla, with the preprocessed source on which you encounter it and the full command line options.
I'm really not quite sure, actually. Does it seem like a potential gcc bug to you?
Gcc seems to emit the same sort of reference. On -march=skylake that same snippet is:
    mov eax, ecx
    mov ecx, ds:(__libc_tsd_CTYPE_B_tpoff - _GLOBAL_OFFSET_TABLE_)[ebx]
    mov ecx, gs:[ecx]
    test byte ptr [ecx+eax*2+1], 20h
    jz short loc_80000B0
So I wonder if we're exposing a binutils bug here rather than a glibc or gcc one? Still not super certain what's up, though.
And in both cases, the gcc-generated .o yields a `000000f0 R_386_TLS_GOTIE __libc_tsd_CTYPE_B`. Is ld supposed to be changing this into a R_386_TLS_LE_32, but it doesn't recognize it somehow when it's used from kmov?
(In reply to Jason A. Donenfeld from comment #7)
> Wow, gcc is wild. It looks like it's converting that ctypes indexing into a
> kmovd, into the mask register (even in 32-bit mode, woah):
>
> movsx eax, dl
> kmovd k0, ds:(__libc_tsd_CTYPE_B_tpoff - _GLOBAL_OFFSET_TABLE_)[ebx]
> kmovd edx, k0
> kmovd k0, dword ptr gs:[edx]
> kmovd edx, k0
> test byte ptr [edx+eax*2+1], 20h
> jz short loc_80000B0
Could you please attach the assembler output from --save-temps to this bug (the .s file), and also the preprocessed sources (the .i file)? Thanks. This is either a GCC bug or an assembler bug.
The GCC-bug side of things seems to be a bad optimization caused by weird register allocator performance. That's being tracked here: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103252
The .o files that gcc creates for the avx512 case and the non-avx512 case have the same relocation type:
    000000f0 R_386_TLS_GOTIE __libc_tsd_CTYPE_B
That makes me suspect that the problem isn't with gcc but with something in binutils: the assembler, the linker, or something called later on. Either way, I'll attach the preprocessed source and assembly as requested.
Created attachment 13789 [details] preprocessed source
thinkpad /var/tmp/portage/sys-libs/glibc-2.34-r2/work/glibc-2.34/resolv # x86_64-pc-linux-gnu-gcc -m32 -march=native -pipe -O2 -Wl,-O1 -Wl,--as-needed inet_addr.c -c -std=gnu11 -fgnu89-inline -march=native -pipe -O2 -Wall -Wwrite-strings -Wundef -fmerge-all-constants -frounding-math -fstack-protector-strong -fno-common -Wstrict-prototypes -Wold-style-definition -fmath-errno -Wa,-mtune=i686 -ftls-model=initial-exec -U_FORTIFY_SOURCE -I../include -I/var/tmp/portage/sys-libs/glibc-2.34-r2/work/build-x86-x86_64-pc-linux-gnu-nptl/resolv -I/var/tmp/portage/sys-libs/glibc-2.34-r2/work/build-x86-x86_64-pc-linux-gnu-nptl -I../sysdeps/unix/sysv/linux/i386/i686 -I../sysdeps/i386/i686/nptl -I../sysdeps/unix/sysv/linux/i386 -I../sysdeps/unix/sysv/linux/x86/include -I../sysdeps/unix/sysv/linux/x86 -I../sysdeps/x86/nptl -I../sysdeps/i386/nptl -I../sysdeps/unix/sysv/linux/include -I../sysdeps/unix/sysv/linux -I../sysdeps/nptl -I../sysdeps/pthread -I../sysdeps/gnu -I../sysdeps/unix/inet -I../sysdeps/unix/sysv -I../sysdeps/unix/i386 -I../sysdeps/unix -I../sysdeps/posix -I../sysdeps/i386/i686/fpu/multiarch -I../sysdeps/i386/i686/fpu -I../sysdeps/i386/i686/multiarch -I../sysdeps/i386/i686 -I../sysdeps/i386/fpu -I../sysdeps/x86/fpu -I../sysdeps/i386 -I../sysdeps/x86/include -I../sysdeps/x86 -I../sysdeps/wordsize-32 -I../sysdeps/ieee754/float128 -I../sysdeps/ieee754/ldbl-96/include -I../sysdeps/ieee754/ldbl-96 -I../sysdeps/ieee754/dbl-64 -I../sysdeps/ieee754/flt-32 -I../sysdeps/ieee754 -I../sysdeps/generic -I.. -I../libio -I. -nostdinc -isystem /usr/lib/gcc/x86_64-pc-linux-gnu/11.2.0/include -isystem /usr/lib/gcc/x86_64-pc-linux-gnu/11.2.0/include-fixed -isystem /usr/include -D_LIBC_REENTRANT -include /var/tmp/portage/sys-libs/glibc-2.34-r2/work/build-x86-x86_64-pc-linux-gnu-nptl/libc-modules.h -DMODULE_NAME=libc -include ../include/libc-symbols.h -DPIC -DTOP_NAMESPACE=glibc -o /var/tmp/portage/sys-libs/glibc-2.34-r2/work/build-x86-x86_64-pc-linux-gnu-nptl/resolv/inet_addr.o -MD -MP -MF /var/tmp/portage/sys-libs/glibc-2.34-r2/work/build-x86-x86_64-pc-linux-gnu-nptl/resolv/inet_addr.o.dt -MT /var/tmp/portage/sys-libs/glibc-2.34-r2/work/build-x86-x86_64-pc-linux-gnu-nptl/resolv/inet_addr.o -E
My -march=native should expand to:
-march=tigerlake -mmmx -mpopcnt -msse -msse2 -msse3 -mssse3 -msse4.1 -msse4.2 -mavx -mavx2 -mno-sse4a -mno-fma4 -mno-xop -mfma -mavx512f -mbmi -mbmi2 -maes -mpclmul -mavx512vl -mavx512bw -mavx512dq -mavx512cd -mno-avx512er -mno-avx512pf -mavx512vbmi -mavx512ifma -mno-avx5124vnniw -mno-avx5124fmaps -mavx512vpopcntdq -mavx512vbmi2 -mgfni -mvpclmulqdq -mavx512vnni -mavx512bitalg -mno-avx512bf16 -mavx512vp2intersect -mno-3dnow -madx -mabm -mno-cldemote -mclflushopt -mclwb -mno-clzero -mcx16 -mno-enqcmd -mf16c -mfsgsbase -mfxsr -mno-hle -msahf -mno-lwp -mlzcnt -mmovbe -mmovdir64b -mmovdiri -mno-mwaitx -mno-pconfig -mpku -mno-prefetchwt1 -mprfchw -mno-ptwrite -mrdpid -mrdrnd -mrdseed -mno-rtm -mno-serialize -mno-sgx -msha -mshstk -mno-tbm -mno-tsxldtrk -mvaes -mno-waitpkg -mno-wbnoinvd -mxsave -mxsavec -mxsaveopt -mxsaves -mno-amx-tile -mno-amx-int8 -mno-amx-bf16 -mno-uintr -mno-hreset -mno-kl -mno-widekl -mno-avxvnni --param "l1-cache-size=48" --param "l1-cache-line-size=64" --param "l2-cache-size=24576" -mtune=tigerlake
Created attachment 13790 [details] assembler intermediate file
The relevant diff from a symbol point of view is just:
- movl __libc_tsd_CTYPE_B@gotntpoff(%ebx), %ecx
+ kmovd __libc_tsd_CTYPE_B@gotntpoff(%ebx), %k0
Adding Nick to the CC, as this is looking more like a binutils issue.
I might be way off here, but...
    case R_386_TLS_GOTIE:
    case R_386_TLS_IE_32:
      /* Check transition from {IE_32,GOTIE} access model:
           subl foo@{tpoff,gontoff}(%reg1), %reg2
           movl foo@{tpoff,gontoff}(%reg1), %reg2
           addl foo@{tpoff,gontoff}(%reg1), %reg2
       */
      if (offset < 2 || (offset + 4) > sec->size)
        return false;
      val = bfd_get_8 (abfd, contents + offset - 1);
      if ((val & 0xc0) != 0x80 || (val & 7) == 4)
        return false;
      type = bfd_get_8 (abfd, contents + offset - 2);
      return type == 0x8b || type == 0x2b || type == 0x03;
It seems like it's looking at subl, movl (0x8b), and addl, but bfd does not support the TLS_GOTIE relocation for kmovd.
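To illustrate why the quoted check rejects the kmovd form, here is a minimal standalone sketch of the same test in C. The function name `gotie_insn_ok` and the two byte arrays are hypothetical. The encodings are what one would expect an assembler to emit for `movl foo@gotntpoff(%ebx), %ecx` and for the VEX-encoded `kmovd foo@gotntpoff(%ebx), %k0`; they are an assumption, not bytes taken from the attached object files.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Standalone restatement of the quoted check: `offset` is the position
   of the 4-byte relocated field inside `contents`.  The linker looks at
   the two bytes just before it: the ModRM byte and the opcode byte. */
bool gotie_insn_ok(const uint8_t *contents, size_t size, size_t offset)
{
    assert(contents != NULL);
    if (offset < 2 || offset + 4 > size)
        return false;
    uint8_t modrm = contents[offset - 1];
    /* Require mod == 10 (disp32) and no SIB byte (rm != 100). */
    if ((modrm & 0xc0) != 0x80 || (modrm & 7) == 4)
        return false;
    uint8_t opcode = contents[offset - 2];
    /* movl (0x8b), subl (0x2b), addl (0x03) with a memory source. */
    return opcode == 0x8b || opcode == 0x2b || opcode == 0x03;
}

/* Assumed encoding of: movl foo@gotntpoff(%ebx), %ecx */
const uint8_t movl_insn[] = { 0x8b, 0x8b, 0, 0, 0, 0 };
/* Assumed encoding of: kmovd foo@gotntpoff(%ebx), %k0
   (3-byte VEX prefix, opcode 0x90, ModRM 0x83) */
const uint8_t kmovd_insn[] = { 0xc4, 0xe1, 0xf9, 0x90, 0x83, 0, 0, 0, 0 };
```

Under these assumptions the relocated field sits at offset 2 in the movl bytes and at offset 5 in the kmovd bytes. The ModRM test passes in both cases, but the byte before it is 0x90 for kmovd, so the opcode comparison fails and the transition is refused.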
This boils down to:
1.c:
    extern __thread int mytls __attribute__((tls_model ("initial-exec")));
    __attribute__((noipa)) void foo (void) { asm (""); }
    int main ()
    {
      foo ();
    #ifdef KMOVD
      asm volatile ("kmovd mytls@gotntpoff(%%ebx), %%k0" : : : "k0");
    #else
      volatile int a = mytls;
    #endif
      return 0;
    }
2.c:
    __thread int mytls __attribute__((tls_model ("initial-exec")));
$ gcc -O2 -m32 -fPIC /tmp/1.c /tmp/2.c -o /tmp/1 -UKMOVD -mavx512f
$ gcc -O2 -m32 -fPIC /tmp/1.c /tmp/2.c -o /tmp/1 -DKMOVD -mavx512f
/usr/bin/ld: /tmp/ccQrvWxU.o: TLS transition from R_386_TLS_GOTIE to R_386_TLS_LE_32 against `mytls' at 0x1c in section `.text.startup' failed
/usr/bin/ld: final link failed: bad value
collect2: error: ld returned 1 exit status
https://akkadia.org/drepper/tls.pdf
I don't remember anymore whether it is OK for some of the TLS relocations to also appear in instructions not exactly mentioned in the above PDF. If it is, then instead of giving errors like the above, the linker should just not optimize that particular TLS access, i.e. keep using the IE model here when it can't optimize it into the LE model (and this would be a linker bug). If it is not, the compiler needs to ensure that @gotntpoff etc. appear solely in the instructions mentioned in the PDF (here, in movl). In that case it would be a gcc bug; e.g. for the @gotntpoff MEMs it would need to disallow them by default and only accept them as a special case in a simple *movsi_internal.
Indeed it looks like with a bit of mucking around in there I can have it accept kmovd instructions and change the relative address around, sort of. Maybe Ulrich's got an opinion on what he'd prefer here. Also, I don't have access to do so, but I suppose we ought to reclassify this as a binutils bug for now, since we're on the same bugzilla as that, and then later if necessary we can open a bug on gcc's tracker. Whether it's bfd or gcc, it's certainly not glibc.
I think the relocations can't be placed into arbitrary instructions because the linker must be able to find the start of the instruction to adjust the instruction prefix. So GAS really should not assemble this code.
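As a rough illustration of the point above, here is a sketch in C of what an IE-to-LE rewrite does to a plain `movl`: it patches the opcode and ModRM bytes in front of the relocated field, turning the GOT load into a move of an immediate. The helper name `relax_gotie_movl_to_le` is hypothetical and the byte patterns are written from memory of the i386 encoding (`8b /r` becoming `c7 /0 imm32`), so treat it as an assumption rather than the linker's actual code. It handles only the movl case.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: rewrite
       movl foo@gotntpoff(%reg1), %reg2    (8b /r disp32)
   into
       movl $foo@ntpoff, %reg2             (c7 /0 imm32)
   `offset` is the position of the 4-byte relocated field and `tpoff`
   is the thread-pointer-relative offset to store there.
   Returns false, leaving the bytes untouched, if the instruction in
   front of the field is not a plain movl. */
bool relax_gotie_movl_to_le(uint8_t *contents, size_t size,
                            size_t offset, int32_t tpoff)
{
    assert(contents != NULL);
    if (offset < 2 || offset + 4 > size)
        return false;
    uint8_t modrm = contents[offset - 1];
    if (contents[offset - 2] != 0x8b
        || (modrm & 0xc0) != 0x80 || (modrm & 7) == 4)
        return false;
    /* Patch opcode and ModRM: destination register moves to the rm field. */
    contents[offset - 2] = 0xc7;
    contents[offset - 1] = (uint8_t) (0xc0 | ((modrm >> 3) & 7));
    /* Store the offset as a little-endian 32-bit immediate. */
    uint32_t imm = (uint32_t) tpoff;
    for (int i = 0; i < 4; i++)
        contents[offset + i] = (uint8_t) (imm >> (8 * i));
    return true;
}
```

KMOV has no immediate-operand form, so there is no equivalent byte patch for a kmovd that carries the relocation; the rewrite only makes sense when the linker knows it is looking at one of the integer instructions.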
GCC side bug: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103275
The master branch has been updated by H.J. Lu <hjl@sourceware.org>:
https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;h=d7e3e627027fcf37d63e284144fe27ff4eba36b5
commit d7e3e627027fcf37d63e284144fe27ff4eba36b5
Author: H.J. Lu <hjl.tools@gmail.com>
Date: Tue Nov 16 07:21:11 2021 -0800
    x86: Don't allow KMOV in TLS code sequences
    Don't allow KMOV in TLS code sequences which require integer MOV instructions.
    PR target/28595
    * config/tc-i386.c (match_template): Don't allow KMOV in TLS code sequences.
    * testsuite/gas/i386/i386.exp: Run inval-tls and x86-64-inval-tls tests.
    * testsuite/gas/i386/inval-tls.l: New file.
    * testsuite/gas/i386/inval-tls.s: Likewise.
    * testsuite/gas/i386/x86-64-inval-tls.l: Likewise.
    * testsuite/gas/i386/x86-64-inval-tls.s: Likewise.
Fixed for 2.38.
Fixed.
The master branch has been updated by H.J. Lu <hjl@sourceware.org>:
https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;h=d660d20c0ce197fc195f3f5ac1c908009b520c7e
commit d660d20c0ce197fc195f3f5ac1c908009b520c7e
Author: H.J. Lu <hjl.tools@gmail.com>
Date: Tue Aug 27 05:58:32 2024 -0700
    x86: Allow R_386_TLS_LE_32 with KMOVD
    Since there is no TLS IE transition, allow R_386_TLS_LE_32 with KMOVD.
    gas/
    PR gas/28595
    * config/tc-i386.c (i386_assemble): Remove BFD_RELOC_386_TLS_LE_32 from TLS code check.
    * testsuite/gas/i386/inval-tls.s: Remove foo@tpoff(%eax).
    * testsuite/gas/i386/inval-tls.l: Updated.
    ld/
    PR gas/28595
    * testsuite/ld-i386/i386.exp: Run tlsle1.
    * testsuite/ld-i386/tlsle1.d: New file.
    * testsuite/ld-i386/tlsle1.s: Likewise.
    Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
The master branch has been updated by H.J. Lu <hjl@sourceware.org>:
https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;h=24e3920d1d84ce13cc4c373918ecf1daeda5db66
commit 24e3920d1d84ce13cc4c373918ecf1daeda5db66
Author: H.J. Lu <hjl.tools@gmail.com>
Date: Tue Aug 27 09:48:21 2024 -0700
    x86: Report invalid TLS relocation name
    Get TLS relocation name from its lex_got entry when reporting invalid instructions with TLS relocations.
    PR gas/28595
    * config/tc-i386.c (gotrel): Moved from ...
    (lex_got): There.
    (i386_assemble): Get invalid TLS relocation name from its lex_got entry when reporting TLS relocation error.
    Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
The master branch has been updated by H.J. Lu <hjl@sourceware.org>:
https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;h=903ae636569889dbfab81356f81831041bc03a8f
commit 903ae636569889dbfab81356f81831041bc03a8f
Author: H.J. Lu <hjl.tools@gmail.com>
Date: Wed Aug 28 04:30:08 2024 -0700
    x86: Report invalid TLS operator
    Report invalid TLS operator, instead of relocation.
    PR gas/28595
    * config/tc-i386.c (gotrel): Replace int with unsigned int.
    (i386_assemble): Report invalid TLS operator.
    * testsuite/gas/i386/inval-tls.l: Updated.
    * testsuite/gas/i386/x86-64-inval-tls.l: Likewise.
    Signed-off-by: H.J. Lu <hjl.tools@gmail.com>