Bug 30065 - Segfault in AVX-2 strncat implementation (gnulib)
Summary: Segfault in AVX-2 strncat implementation (gnulib)
Status: RESOLVED FIXED
Alias: None
Product: glibc
Classification: Unclassified
Component: string (show other bugs)
Version: 2.37
: P2 normal
Target Milestone: 2.37
Assignee: Not yet assigned to anyone
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2023-01-31 15:55 UTC by Simon Chopin
Modified: 2023-02-01 02:52 UTC (History)
3 users (show)

See Also:
Host:
Target:
Build:
Last reconfirmed:


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Simon Chopin 2023-01-31 15:55:16 UTC
I did some tests on master (2f39e44a84) for the upcoming 2.37 release, and I found a regression in libunistring test suite on amd64 with AVX-2 instructions, more specifically on gnulib's test-strncat. It can be reproduced using these instructions:

https://sourceware.org/glibc/wiki/Testing/Gnulib


I bisected the issue to
commit 642933158e7cf072d873231b1a9bb03291f2b989
Author: Noah Goldstein <goldstein.w.n@gmail.com>
Date:   Tue Nov 8 17:38:39 2022 -0800

    x86: Optimize and shrink st{r|p}{n}{cat|cpy}-avx2 functions
    
    Optimizations are:
        1. Use more overlapping stores to avoid branches.
        2. Reduce how unrolled the aligning copies are (this is more of a
           code-size save, its a negative for some sizes in terms of
           perf).
        3. For st{r|p}n{cat|cpy} re-order the branches to minimize the
           number that are taken.

I get the following backtrace:

#0  __strncat_avx2 () at ../sysdeps/x86_64/multiarch/strncat-avx2.S:76
#1  0x00005555555555d7 in strncat (__len=0, __src=0x7ffff7de4000 "", __dest=0x55555555c2a1 "") at /tmp/glibc-dev/include/bits/string_fortified.h:138
#2  check_single (input=input@entry=0x7ffff7de4000 "", n=n@entry=0, length=90) at unistr/test-strncat.h:41
#3  0x0000555555555352 in check (input=0x555555559100 <input> "Grüß Gott. Здравствуйте! x=(-b±sqrt(b²-4ac))/(2a)  日本語,中文,한글", input_length=91)
    at unistr/test-strncat.h:86
#4  main () at test-strncat.c:58

I'm out of my depth in the assembler code.
Comment 1 Noah Goldstein 2023-01-31 19:41:55 UTC
(In reply to Simon Chopin from comment #0)
> I did some tests on master (2f39e44a84) for the upcoming 2.37 release, and I
> found a regression in libunistring test suite on amd64 with AVX-2
> instructions, more specifically on gnulib's test-strncat. It can be
> reproduced using these instructions:
> 
> https://sourceware.org/glibc/wiki/Testing/Gnulib

Those directions seem a bit outdated, any chance you can share your build steps?

> 
> 
> I bisected the issue to
> commit 642933158e7cf072d873231b1a9bb03291f2b989
> Author: Noah Goldstein <goldstein.w.n@gmail.com>
> Date:   Tue Nov 8 17:38:39 2022 -0800
> 
>     x86: Optimize and shrink st{r|p}{n}{cat|cpy}-avx2 functions
>     
>     Optimizations are:
>         1. Use more overlapping stores to avoid branches.
>         2. Reduce how unrolled the aligning copies are (this is more of a
>            code-size save, its a negative for some sizes in terms of
>            perf).
>         3. For st{r|p}n{cat|cpy} re-order the branches to minimize the
>            number that are taken.
> 
> I get the following backtrace:
> 
> #0  __strncat_avx2 () at ../sysdeps/x86_64/multiarch/strncat-avx2.S:76
> #1  0x00005555555555d7 in strncat (__len=0, __src=0x7ffff7de4000 "",
> __dest=0x55555555c2a1 "") at
> /tmp/glibc-dev/include/bits/string_fortified.h:138
> #2  check_single (input=input@entry=0x7ffff7de4000 "", n=n@entry=0,
> length=90) at unistr/test-strncat.h:41
> #3  0x0000555555555352 in check (input=0x555555559100 <input> "Grüß Gott.
> Здравствуйте! x=(-b±sqrt(b²-4ac))/(2a)  日本語,中文,한글", input_length=91)
>     at unistr/test-strncat.h:86
> #4  main () at test-strncat.c:58
> 
> I'm out of my depth in the assembler code.

So far unable to reproduce, haven't been able to build gnulib against installed GLIBC, but pulled out the u8/u32 strncat and tested them both.

As well have done exhaustive strncat/wcsncpy at the end of page (exhaustive for len {0..128} with all alignments {4096-128...4095} for s1/s2. So far unable to reproduce. Maybe the issue is uninitialized register.
Comment 2 Noah Goldstein 2023-01-31 20:09:11 UTC
(In reply to Simon Chopin from comment #0)
> I did some tests on master (2f39e44a84) for the upcoming 2.37 release, and I
> found a regression in libunistring test suite on amd64 with AVX-2
> instructions, more specifically on gnulib's test-strncat. It can be
> reproduced using these instructions:
> 
> https://sourceware.org/glibc/wiki/Testing/Gnulib
> 
> 
> I bisected the issue to
> commit 642933158e7cf072d873231b1a9bb03291f2b989
> Author: Noah Goldstein <goldstein.w.n@gmail.com>
> Date:   Tue Nov 8 17:38:39 2022 -0800
> 
>     x86: Optimize and shrink st{r|p}{n}{cat|cpy}-avx2 functions
>     
>     Optimizations are:
>         1. Use more overlapping stores to avoid branches.
>         2. Reduce how unrolled the aligning copies are (this is more of a
>            code-size save, its a negative for some sizes in terms of
>            perf).
>         3. For st{r|p}n{cat|cpy} re-order the branches to minimize the
>            number that are taken.
> 
> I get the following backtrace:
> 
> #0  __strncat_avx2 () at ../sysdeps/x86_64/multiarch/strncat-avx2.S:76

Do you know what instruction its segfaulting at?
> #1  0x00005555555555d7 in strncat (__len=0, __src=0x7ffff7de4000 "",
> __dest=0x55555555c2a1 "") at
> /tmp/glibc-dev/include/bits/string_fortified.h:138
> #2  check_single (input=input@entry=0x7ffff7de4000 "", n=n@entry=0,
> length=90) at unistr/test-strncat.h:41
> #3  0x0000555555555352 in check (input=0x555555559100 <input> "Grüß Gott.
> Здравствуйте! x=(-b±sqrt(b²-4ac))/(2a)  日本語,中文,한글", input_length=91)
>     at unistr/test-strncat.h:86
> #4  main () at test-strncat.c:58
> 
> I'm out of my depth in the assembler code.
Comment 3 Noah Goldstein 2023-01-31 20:41:45 UTC
(In reply to Noah Goldstein from comment #2)
> (In reply to Simon Chopin from comment #0)
> > I did some tests on master (2f39e44a84) for the upcoming 2.37 release, and I
> > found a regression in libunistring test suite on amd64 with AVX-2
> > instructions, more specifically on gnulib's test-strncat. It can be
> > reproduced using these instructions:
> > 
> > https://sourceware.org/glibc/wiki/Testing/Gnulib
> > 
> > 
> > I bisected the issue to
> > commit 642933158e7cf072d873231b1a9bb03291f2b989
> > Author: Noah Goldstein <goldstein.w.n@gmail.com>
> > Date:   Tue Nov 8 17:38:39 2022 -0800
> > 
> >     x86: Optimize and shrink st{r|p}{n}{cat|cpy}-avx2 functions
> >     
> >     Optimizations are:
> >         1. Use more overlapping stores to avoid branches.
> >         2. Reduce how unrolled the aligning copies are (this is more of a
> >            code-size save, its a negative for some sizes in terms of
> >            perf).
> >         3. For st{r|p}n{cat|cpy} re-order the branches to minimize the
> >            number that are taken.
> > 
> > I get the following backtrace:
> > 
> > #0  __strncat_avx2 () at ../sysdeps/x86_64/multiarch/strncat-avx2.S:76
> 
> Do you know what instruction its segfaulting at?
> > #1  0x00005555555555d7 in strncat (__len=0, __src=0x7ffff7de4000 "",
> > __dest=0x55555555c2a1 "") at
> > /tmp/glibc-dev/include/bits/string_fortified.h:138
> > #2  check_single (input=input@entry=0x7ffff7de4000 "", n=n@entry=0,
> > length=90) at unistr/test-strncat.h:41
> > #3  0x0000555555555352 in check (input=0x555555559100 <input> "Grüß Gott.
> > Здравствуйте! x=(-b±sqrt(b²-4ac))/(2a)  日本語,中文,한글", input_length=91)
> >     at unistr/test-strncat.h:86
> > #4  main () at test-strncat.c:58
> > 
> > I'm out of my depth in the assembler code.

Reproduced. Changed the `malloc` to mmap and put it at the end of the code.

Issue is:
```
	test	%rdx, %rdx
	jl	L(zero_len)
```

Needs to be `jle` (was `decq` at some point, when changed didn't update flag).
I think the test is actually UB b.c `dst` is not a valid null-terminated string (even though zero length), but will fix.
Comment 4 Noah Goldstein 2023-01-31 20:42:55 UTC
(In reply to Noah Goldstein from comment #3)
> (In reply to Noah Goldstein from comment #2)
> > (In reply to Simon Chopin from comment #0)
> > > I did some tests on master (2f39e44a84) for the upcoming 2.37 release, and I
> > > found a regression in libunistring test suite on amd64 with AVX-2
> > > instructions, more specifically on gnulib's test-strncat. It can be
> > > reproduced using these instructions:
> > > 
> > > https://sourceware.org/glibc/wiki/Testing/Gnulib
> > > 
> > > 
> > > I bisected the issue to
> > > commit 642933158e7cf072d873231b1a9bb03291f2b989
> > > Author: Noah Goldstein <goldstein.w.n@gmail.com>
> > > Date:   Tue Nov 8 17:38:39 2022 -0800
> > > 
> > >     x86: Optimize and shrink st{r|p}{n}{cat|cpy}-avx2 functions
> > >     
> > >     Optimizations are:
> > >         1. Use more overlapping stores to avoid branches.
> > >         2. Reduce how unrolled the aligning copies are (this is more of a
> > >            code-size save, its a negative for some sizes in terms of
> > >            perf).
> > >         3. For st{r|p}n{cat|cpy} re-order the branches to minimize the
> > >            number that are taken.
> > > 
> > > I get the following backtrace:
> > > 
> > > #0  __strncat_avx2 () at ../sysdeps/x86_64/multiarch/strncat-avx2.S:76
> > 
> > Do you know what instruction its segfaulting at?
> > > #1  0x00005555555555d7 in strncat (__len=0, __src=0x7ffff7de4000 "",
> > > __dest=0x55555555c2a1 "") at
> > > /tmp/glibc-dev/include/bits/string_fortified.h:138
> > > #2  check_single (input=input@entry=0x7ffff7de4000 "", n=n@entry=0,
> > > length=90) at unistr/test-strncat.h:41
> > > #3  0x0000555555555352 in check (input=0x555555559100 <input> "Grüß Gott.
> > > Здравствуйте! x=(-b±sqrt(b²-4ac))/(2a)  日本語,中文,한글", input_length=91)
> > >     at unistr/test-strncat.h:86
> > > #4  main () at test-strncat.c:58
> > > 
> > > I'm out of my depth in the assembler code.
> 
> Reproduced. Changed the `malloc` to mmap and put it at the end of the code.
> 
> Issue is:
> ```
> 	test	%rdx, %rdx
> 	jl	L(zero_len)
> ```
> 
> Needs to be `jle` (was `decq` at some point, when changed didn't update
> flag).
> I think the test is actually UB b.c `dst` is not a valid null-terminated
> string (even though zero length), but will fix.
`src` is not a valid null-terminated string*
Comment 5 Andreas Schwab 2023-01-31 21:27:28 UTC
The second argument of strncat is not required to be a string, and the function shall not access more bytes than indicated by the third argument.
Comment 6 Noah Goldstein 2023-01-31 21:29:54 UTC
(In reply to Andreas Schwab from comment #5)
> The second argument of strncat is not required to be a string, and the
> function shall not access more bytes than indicated by the third argument.

You're right, re-read man page. Have fix coming up shortly + test.
Comment 7 Noah Goldstein 2023-01-31 21:37:22 UTC
Bugfix patch posted.
Comment 9 Carlos O'Donell 2023-02-01 02:52:24 UTC
Marking this as fixed for 2.37.