commit c46e9afb2df5fc9e39ff4d13777e4b4c26e04e55 Author: H.J. Lu <hjl.tools@gmail.com> Date: Fri Oct 29 12:40:20 2021 -0700 x86-64: Improve EVEX strcmp with masked load caused Fedora 36 openJDK mock build failure: $ mock -r fedora-36-x86_64 /tmp/java-1.8.0-openjdk-1.8.0.312.b07-2.fc36.src.rpm ... gmake[6]: *** No rule to make target 'SCCS/s.generation.cpp', needed by 'generation.cpp'. Stop. gmake[6]: *** Waiting for unfinished jobs....
It can be reproduced on machines with AVX512, like Ice Lake, Tiger Lake, Skylake server, ...
Do we know if the issue is strcmp, strncmp, wcscmp, or wcsncmp? Or just the commit?
(gdb) f 0 #0 __strcmp_evex ( s1=0x55f3e18d08f8 "/export/redhat/rpms/BUILD/java-1.8.0-openjdk-1.8.0.312.b07-2.fc35.x86_64/openjdk/langtools/src/share/classes/com/sun/tools/doclets/internal/toolkit/util/PathDocFileFactory.java", s2=0x55f3e18ccff3 "/export/redhat/rpms/BUILD/java-1.8.0-openjdk-1.8.0.312.b07-2.fc35.x86_64/openjdk/langtools/src/share/classes/com/sun/tools/doclets/internal/toolkit/taglets/ThrowsTaglet.java") at ../sysdeps/x86_64/multiarch/strcmp-evex-check.c:14 14 asm ("hlt"); (gdb) list 9 int exp_result = __strcmp_avx2 (s1, s2); 10 int result = __strcmp_evex_real (s1, s2); 11 if ((exp_result == 0 && result != 0) 12 || (exp_result < 0 && result >= 0) 13 || (exp_result > 0 && result <= 0)) 14 asm ("hlt"); 15 16 return result; 17 } (gdb) p exp_result $1 = 1 (gdb) p result $2 = 0 (gdb)
(In reply to H.J. Lu from comment #3) > (gdb) f 0 > #0 __strcmp_evex ( > s1=0x55f3e18d08f8 > "/export/redhat/rpms/BUILD/java-1.8.0-openjdk-1.8.0.312.b07-2.fc35.x86_64/ > openjdk/langtools/src/share/classes/com/sun/tools/doclets/internal/toolkit/ > util/PathDocFileFactory.java", > s2=0x55f3e18ccff3 > "/export/redhat/rpms/BUILD/java-1.8.0-openjdk-1.8.0.312.b07-2.fc35.x86_64/ > openjdk/langtools/src/share/classes/com/sun/tools/doclets/internal/toolkit/ > taglets/ThrowsTaglet.java") > at ../sysdeps/x86_64/multiarch/strcmp-evex-check.c:14 > 14 asm ("hlt"); > (gdb) list > 9 int exp_result = __strcmp_avx2 (s1, s2); > 10 int result = __strcmp_evex_real (s1, s2); > 11 if ((exp_result == 0 && result != 0) > 12 || (exp_result < 0 && result >= 0) > 13 || (exp_result > 0 && result <= 0)) > 14 asm ("hlt"); > 15 > 16 return result; > 17 } > (gdb) p exp_result > $1 = 1 > (gdb) p result > $2 = 0 > (gdb) Dump of s1/s2?
Created attachment 13815 [details] A patch to add a testcase On AVX512 machines, I got [hjl@gnu-tgl-3 build-x86_64-linux]$ string/test-strcmp --direct string/test-strcmp: Wrong result in function __strcmp_evex 0 1 simple_strcmp stupid_strcmp __strcmp_avx2 __strcmp_evex __strcmp_sse42 __strcmp_ssse3 __strcmp_sse2_unaligned __strcmp_sse2 [hjl@gnu-tgl-3 build-x86_64-linux]$ Noah, any ideas?
(In reply to H.J. Lu from comment #5) > Created attachment 13815 [details] > A patch to add a testcase > > On AVX512 machines, I got > > [hjl@gnu-tgl-3 build-x86_64-linux]$ string/test-strcmp --direct > string/test-strcmp: Wrong result in function __strcmp_evex 0 1 > simple_strcmp stupid_strcmp __strcmp_avx2 > __strcmp_evex __strcmp_sse42 __strcmp_ssse3 __strcmp_sse2_unaligned > __strcmp_sse2 > [hjl@gnu-tgl-3 build-x86_64-linux]$ > > Noah, any ideas? Looks to me that coming from: L(cross_page_loop) L(main_loop_header) The number of iterations for the 4x VEC_SIZE loop is being miscomputed. Still figuring out exactly whats going on though.
Created attachment 13816 [details] Patch with bugfix.
(In reply to Noah Goldstein from comment #7) > Created attachment 13816 [details] > Patch with bugfix. This is incomplete since I still got [hjl@gnu-tgl-3 build-x86_64-linux]$ wcsmbs/test-wcscmp --direct Segmentation fault (core dumped) [hjl@gnu-tgl-3 build-x86_64-linux]$
(In reply to Noah Goldstein from comment #6) > (In reply to H.J. Lu from comment #5) > > Created attachment 13815 [details] > > A patch to add a testcase > > > > On AVX512 machines, I got > > > > [hjl@gnu-tgl-3 build-x86_64-linux]$ string/test-strcmp --direct > > string/test-strcmp: Wrong result in function __strcmp_evex 0 1 > > simple_strcmp stupid_strcmp __strcmp_avx2 > > __strcmp_evex __strcmp_sse42 __strcmp_ssse3 __strcmp_sse2_unaligned > > __strcmp_sse2 > > [hjl@gnu-tgl-3 build-x86_64-linux]$ > > > > Noah, any ideas? > > Looks to me that coming from: > > L(cross_page_loop) > L(main_loop_header) > > The number of iterations for the 4x VEC_SIZE loop is being miscomputed. I'm incorrect about this. Was misunderstanding the control flow. Although this input hits an extremely unfortunate slow case. > > Still figuring out exactly whats going on though. Issue is that since we are only using part of the comparison (the upper 64 - %cl) bits, we can't rely on the fact that first zero means exit condition so using `incl` can set the wrong value. Don't see any other places in the code but still looking.
Created attachment 13817 [details] Possibly Fix Missed the other side. Try the attached.
(In reply to Noah Goldstein from comment #10) > Created attachment 13817 [details] > Possibly Fix > > Missed the other side. Try the attached. Sorry thats no good either for wcscmp. Still looking into it.
(In reply to H.J. Lu from comment #8) > (In reply to Noah Goldstein from comment #7) > > Created attachment 13816 [details] > > Patch with bugfix. > > This is incomplete since I still got Issue is strings need to be aligned % sizeof(wchar_t) for wcscmp. The check3 test has s2 misaligned. Don't think there is an issue there. > > [hjl@gnu-tgl-3 build-x86_64-linux]$ wcsmbs/test-wcscmp --direct > Segmentation fault (core dumped) > [hjl@gnu-tgl-3 build-x86_64-linux]$
Created attachment 13818 [details] An updated test patch
(In reply to Noah Goldstein from comment #12) > (In reply to H.J. Lu from comment #8) > > (In reply to Noah Goldstein from comment #7) > > > Created attachment 13816 [details] > > > Patch with bugfix. > > > > This is incomplete since I still got > > Issue is strings need to be aligned % sizeof(wchar_t) for wcscmp. > The check3 test has s2 misaligned. > > Don't think there is an issue there. > > > > [hjl@gnu-tgl-3 build-x86_64-linux]$ wcsmbs/test-wcscmp --direct > > Segmentation fault (core dumped) > > [hjl@gnu-tgl-3 build-x86_64-linux]$ You are right. Please add some comments to describe why INC can't be used.
Created attachment 13819 [details] Patch with explanation Added commit explaining the bug in commit message and comments in the code around the fix. Note the issue with wcscmp was alignment of s2, s2 was type CHAR so adding 0xcff3 was multiplied by sizeof(CHAR) and was out of range of the allocation.
Created attachment 13820 [details] A patch I am testing this.
(In reply to H.J. Lu from comment #16) > Created attachment 13820 [details] > A patch > > I am testing this. From Tigerlake: ran xcheck and looked normal. checked and all wcsmbs / string functions passed.
The master branch has been updated by H.J. Lu <hjl@sourceware.org>: https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=4df1fa6ddc8925a75f3da644d5da3bb16eb33f02 commit 4df1fa6ddc8925a75f3da644d5da3bb16eb33f02 Author: Noah Goldstein <goldstein.w.n@gmail.com> Date: Fri Dec 3 15:29:25 2021 -0800 x86-64: Use notl in EVEX strcmp [BZ #28646] Must use notl %edi here as lower bits are for CHAR comparisons potentially out of range thus can be 0 without indicating mismatch. This fixes BZ #28646. Co-Authored-By: H.J. Lu <hjl.tools@gmail.com>
Fixed.
The release/2.34/master branch has been updated by Sunil Pandey <skpgkp2@sourceware.org>: https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=4bbd0f866ad0ff197f72346f776ebee9b7e1a706 commit 4bbd0f866ad0ff197f72346f776ebee9b7e1a706 Author: Noah Goldstein <goldstein.w.n@gmail.com> Date: Fri Dec 3 15:29:25 2021 -0800 x86-64: Use notl in EVEX strcmp [BZ #28646] Must use notl %edi here as lower bits are for CHAR comparisons potentially out of range thus can be 0 without indicating mismatch. This fixes BZ #28646. Co-Authored-By: H.J. Lu <hjl.tools@gmail.com> (cherry picked from commit 4df1fa6ddc8925a75f3da644d5da3bb16eb33f02)
The release/2.33/master branch has been updated by Sunil Pandey <skpgkp2@sourceware.org>: https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=d77349767a308ac3efa5f113fb63fe39c82e5083 commit d77349767a308ac3efa5f113fb63fe39c82e5083 Author: Noah Goldstein <goldstein.w.n@gmail.com> Date: Fri Dec 3 15:29:25 2021 -0800 x86-64: Use notl in EVEX strcmp [BZ #28646] Must use notl %edi here as lower bits are for CHAR comparisons potentially out of range thus can be 0 without indicating mismatch. This fixes BZ #28646. Co-Authored-By: H.J. Lu <hjl.tools@gmail.com> (cherry picked from commit 4df1fa6ddc8925a75f3da644d5da3bb16eb33f02)