Bug 29863 - Segmentation fault in memcmp-sse2.S if memory contents can concurrently change
Summary: Segmentation fault in memcmp-sse2.S if memory contents can concurrently change
Status: RESOLVED FIXED
Alias: None
Product: glibc
Classification: Unclassified
Component: libc (show other bugs)
Version: 2.36
: P2 normal
Target Milestone: 2.37
Assignee: Not yet assigned to anyone
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2022-12-07 10:49 UTC by Narayanan Iyer
Modified: 2023-05-14 21:49 UTC (History)
8 users (show)

See Also:
Host:
Target:
Build:
Last reconfirmed:


Attachments
Simple C program that demonstrates the SIG-11 in memcmp-sse2.S (410 bytes, text/x-csrc)
2022-12-07 10:49 UTC, Narayanan Iyer
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Narayanan Iyer 2022-12-07 10:49:33 UTC
Created attachment 14486 [details]
Simple C program that demonstrates the SIG-11 in memcmp-sse2.S

Hi,

We saw a occasional test failure after upgrading to Ubuntu 22.10. And after some investigation suspect the cause to be a GLIBC 2.36 regression.

On the system we run the test on, a call to memcmp() ended up invoking sysdeps/x86_64/multiarch/memcmp-sse2.S in glibc. I have attached the /proc/cpuinfo information below in case it helps determine why the sse2 version of memcmp got invoked.

Attached is a simple C program test.c. It spawns 2 threads. Thread 1 does memcmp() in a loop, 100,000 times. The memcmp() will return 0 most of the time since it is coded to compare identical buffers. Thread 2 changes just 1 byte in one of the buffers. It changes this byte to a different value and back again to the original value. Keeps doing this repeatedly. This causes the memcmp() in Thread 1 to alternately return 0 or 1 which is fine. But eventually, Thread 1 crashes with a Segmentation fault.

Below is an example output. In this example, Thread 1 crashed in the 58,234th iteration.

$ gcc test.c
$ ./a.out
.
.
i = 58232 : result = 1
i = 58233 : result = 1
i = 58234 : result = 1
Segmentation fault (core dumped)

Below is the C stack and relevant register output from the crash.

(gdb) where
#0  __memcmp_sse2 () at ../sysdeps/x86_64/multiarch/memcmp-sse2.S:312
#1  0x000055efe46032d3 in childThread ()
#2  0x00007f6f45e90402 in start_thread (arg=<optimized out>) at ./nptl/pthread_create.c:442
#3  0x00007f6f45f1f590 in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81

(gdb) info registers
rax            0xfffffff1          4294967281
rdx            0xfffffffffffffff1  -15
rsi            0x55efe4606040      94488817066048
rcx            0xffff              65535

I have pasted the memcmp-sse2.S source code with line numbers below corresponding to the flow of the assembly instructions around the time of the crash. That is, the memcmp() function call enters at line 59, executes line 71, 72 and then jumps to line 249 etc. The crash occurs in line 312 below.

sysdeps/x86_64/multiarch/memcmp-sse2.S
--------------------------------------
     59 ENTRY(MEMCMP)
     71         cmpq    $CHAR_PER_VEC, %rdx
     72         ja      L(more_1x_vec)
    249 L(more_1x_vec):
    255         movl    $0xffff, %ecx
    257         movups  (%rsi), %xmm0
    258         movups  (%rdi), %xmm1
    259         PCMPEQ  %xmm0, %xmm1
    260         pmovmskb %xmm1, %eax
    261         subl    %ecx, %eax
    262         jnz     L(ret_nonzero_vec_start_0)
    269         subq    $(CHAR_PER_VEC * 2), %rdx
    271         ja      L(more_2x_vec)
    273         movups  (VEC_SIZE * -1 + SIZE_OFFSET)(%rsi, %rdx, CHAR_SIZE), %xmm0
    274         movups  (VEC_SIZE * -1 + SIZE_OFFSET)(%rdi, %rdx, CHAR_SIZE), %xmm1
    275         PCMPEQ  %xmm0, %xmm1
    276         pmovmskb %xmm1, %eax
    277         subl    %ecx, %eax
    281         jnz     L(ret_nonzero_vec_end_0)
    299 L(ret_nonzero_vec_end_0):
    300         bsfl    %eax, %eax
    311         addl    %edx, %eax
    312         movzbl  (VEC_SIZE * -1 + SIZE_OFFSET)(%rsi, %rax), %ecx

For the above failure to happen, the length of the memcmp() needs to be between 16 and 32.

If the contents of the buffer did not change between lines 59 to 312 above, then we are guaranteed %rax holds a positive value at line 312.

But if the contents can change concurrently (as is the case in the test program), we could end up with a negative value of %rax. And that causes the pointer arithmetic in line 312 to result in a stray pointer instead of one within the memcmp() buffer bounds.

In case it helps, below is the commit that made the above changes.

$ git log --graph --oneline --pretty=format:'%h%d; %ci; %cn; %s' sysdeps/x86_64/multiarch/memcmp-sse2.S | head -1 
* ae308947ff; 2022-07-05 16:42:42 -0700; Noah Goldstein; x86: Add support for building {w}memcmp{eq} with explicit ISA level

I have also added some more information below in case it helps.

$ grep PRETTY_NAME /etc/os-release
PRETTY_NAME="Ubuntu 22.10"

$ uname -a
Linux spencer 5.19.0-26-generic #27-Ubuntu SMP PREEMPT_DYNAMIC Wed Nov 23 20:44:15 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux

$ ldd --version | head -1
ldd (Ubuntu GLIBC 2.36-0ubuntu4) 2.36

$ grep -E "model name|flags" /proc/cpuinfo | sort -u
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good acc_power nopl nonstop_tsc cpuid extd_apicid aperfmperf pni pclmulqdq monitor ssse3 fma cx16 sse4_1 sse4_2 movbe popcnt aes xsave avx f16c lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs xop skinit wdt lwp fma4 tce nodeid_msr tbm topoext perfctr_core perfctr_nb bpext ptsc mwaitx cpb hw_pstate ssbd vmmcall fsgsbase bmi1 avx2 smep bmi2 xsaveopt arat npt lbrv svm_lock nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold avic v_vmsave_vmload vgif overflow_recov
model name      : AMD A12-9800E RADEON R7, 12 COMPUTE CORES 4C+8G

Let me know if you need any more information.

Thanks,
Narayanan.
Comment 1 Andrew Pinski 2022-12-13 18:26:40 UTC
Is this defined code? Because you have a race condition here.
Comment 2 Andrew Pinski 2022-12-13 18:28:40 UTC
I think this code is undefined as there is no memory atomic load/stores. memcmp is not atomic in C11 either.
So changing the memory that memcmp is using is just undefined code.
Comment 3 Narayanan Iyer 2022-12-13 18:33:39 UTC
(In reply to Andrew Pinski from comment #1)
> Is this defined code? Because you have a race condition here.

Yes there is a race condition here. That is intentional and only to demonstrate the underlying issue in memcmp-sse2.S.

The `memcmp()` in one thread can return 1 or 0 depending on how the other thread changes the memory. I am fine with the non-deterministic return value.

But that should never result in a SIG-11 in my understanding.

More importantly this code works fine prior to glibc 2.36. It fails only with glibc 2.36.
Comment 4 Andrew Pinski 2022-12-13 18:39:16 UTC
(In reply to Narayanan Iyer from comment #3) 
> More importantly this code works fine prior to glibc 2.36. It fails only
> with glibc 2.36.

On x86_64. My bet it fails on most other targets and always had failed.
The Seg fault is just saying you are violating atomicity.
It is not only a race condition but the race condition is inside memcmp itself now since memcmp is not atomic for the whole length, it is a byte wise comparison which can be read more than once so you just happen to run into a bug in your code.
Having memcmp being atomicity and reading from the memory only once is ABI thing really and x86_64 ABI does not talk about that. It just happened to work on accident does not mean the behavior should go back to what it was.
Comment 5 Narayanan Iyer 2022-12-13 18:45:32 UTC
(In reply to Andrew Pinski from comment #4)
> (In reply to Narayanan Iyer from comment #3) 
> > More importantly this code works fine prior to glibc 2.36. It fails only
> > with glibc 2.36.
> 
> On x86_64. My bet it fails on most other targets and always had failed.
> The Seg fault is just saying you are violating atomicity.
> It is not only a race condition but the race condition is inside memcmp
> itself now since memcmp is not atomic for the whole length, it is a byte
> wise comparison which can be read more than once so you just happen to run
> into a bug in your code.
> Having memcmp being atomicity and reading from the memory only once is ABI
> thing really and x86_64 ABI does not talk about that. It just happened to
> work on accident does not mean the behavior should go back to what it was.

Seems to me you are saying that `memcmp()` can only be called on memory that is guaranteed to be never changing. And that it should never be called on a shared memory buffer whose contents could be concurrently changing as it goes into undefined behavior territory. That does not sound right to me as we have been using `memcmp()` with shared memory (where multiple processes write to that memory buffer all the time) for the past 25+ years on a variety of architectures and operating systems and have never once seen a SIG-11 in memcmp().
Comment 6 Andrew Pinski 2022-12-13 18:52:37 UTC
(In reply to Narayanan Iyer from comment #5) 
> Seems to me you are saying that `memcmp()` can only be called on memory that
> is guaranteed to be never changing. And that it should never be called on a
> shared memory buffer whose contents could be concurrently changing as it
> goes into undefined behavior territory. That does not sound right to me as
> we have been using `memcmp()` with shared memory (where multiple processes
> write to that memory buffer all the time) for the past 25+ years on a
> variety of architectures and operating systems and have never once seen a
> SIG-11 in memcmp().

That does not mean it is correctly well defined code.
memcmp cannot be used on memory which is going to be changed under its back. since it is not atomic.
Comment 7 Noah Goldstein 2022-12-13 19:13:57 UTC
The fix is basically just:
https://sourceware.org/git/?p=glibc.git;a=blob;f=sysdeps/x86_64/multiarch/memcmp-sse2.S;h=afd450d0206d6633da9fbc4607a7fa6aeb4e137c;hb=HEAD#l46
```
-#   define SIZE_OFFSET (CHAR_PER_VEC * 2)
+#   define SIZE_OFFSET 0
```

Want to add the other functions / implementations are likely
also affected by a similar bug.
Comment 8 K.S. Bhaskar 2022-12-13 19:36:03 UTC
(In reply to Andrew Pinski from comment #6)
> (In reply to Narayanan Iyer from comment #5) 
> > Seems to me you are saying that `memcmp()` can only be called on memory that
> > is guaranteed to be never changing. And that it should never be called on a
> > shared memory buffer whose contents could be concurrently changing as it
> > goes into undefined behavior territory. That does not sound right to me as
> > we have been using `memcmp()` with shared memory (where multiple processes
> > write to that memory buffer all the time) for the past 25+ years on a
> > variety of architectures and operating systems and have never once seen a
> > SIG-11 in memcmp().
> 
> That does not mean it is correctly well defined code.
> memcmp cannot be used on memory which is going to be changed under its back.
> since it is not atomic.

Since memcmp() is not atomic, it is of course appropriate for the the results of the comparison to have a race condition. However, since the addresses by which a process has that memory mapped don't change, a SIG-11 should never occur. Even if all other processes which have that memory mapped terminate while the memcmp() is running in a process, that process will not unmap the memory and it will remain valid (assuming it was valid to start with).
Comment 9 Noah Goldstein 2022-12-13 20:09:51 UTC
(In reply to K.S. Bhaskar from comment #8)
> (In reply to Andrew Pinski from comment #6)
> > (In reply to Narayanan Iyer from comment #5) 
> > > Seems to me you are saying that `memcmp()` can only be called on memory that
> > > is guaranteed to be never changing. And that it should never be called on a
> > > shared memory buffer whose contents could be concurrently changing as it
> > > goes into undefined behavior territory. That does not sound right to me as
> > > we have been using `memcmp()` with shared memory (where multiple processes
> > > write to that memory buffer all the time) for the past 25+ years on a
> > > variety of architectures and operating systems and have never once seen a
> > > SIG-11 in memcmp().
> > 
> > That does not mean it is correctly well defined code.
> > memcmp cannot be used on memory which is going to be changed under its back.
> > since it is not atomic.
> 
> Since memcmp() is not atomic, it is of course appropriate for the the
> results of the comparison to have a race condition. However, since the
> addresses by which a process has that memory mapped don't change, a SIG-11
> should never occur. Even if all other processes which have that memory
> mapped terminate while the memcmp() is running in a process, that process
> will not unmap the memory and it will remain valid (assuming it was valid to
> start with).

memcmp() is incorrect if the values change from under it. How that
incorrectness will manifest is completely undefined.
Comment 10 Noah Goldstein 2022-12-13 20:18:07 UTC
(In reply to Noah Goldstein from comment #9)
> (In reply to K.S. Bhaskar from comment #8)
> > (In reply to Andrew Pinski from comment #6)
> > > (In reply to Narayanan Iyer from comment #5) 
> > > > Seems to me you are saying that `memcmp()` can only be called on memory that
> > > > is guaranteed to be never changing. And that it should never be called on a
> > > > shared memory buffer whose contents could be concurrently changing as it
> > > > goes into undefined behavior territory. That does not sound right to me as
> > > > we have been using `memcmp()` with shared memory (where multiple processes
> > > > write to that memory buffer all the time) for the past 25+ years on a
> > > > variety of architectures and operating systems and have never once seen a
> > > > SIG-11 in memcmp().
> > > 
> > > That does not mean it is correctly well defined code.
> > > memcmp cannot be used on memory which is going to be changed under its back.
> > > since it is not atomic.
> > 
> > Since memcmp() is not atomic, it is of course appropriate for the the
> > results of the comparison to have a race condition. However, since the
> > addresses by which a process has that memory mapped don't change, a SIG-11
> > should never occur. Even if all other processes which have that memory
> > mapped terminate while the memcmp() is running in a process, that process
> > will not unmap the memory and it will remain valid (assuming it was valid to
> > start with).
> 
> memcmp() is incorrect if the values change from under it. How that
> incorrectness will manifest is completely undefined.

Out of curiosity what is the use-case that causes this to happen? Is it
idiomatic in some way?

I'm unhappy that the code causes a SIG-11 in this case, but changing
it to harden against this case is not free for correct usage performance
so I'm not really convinced it's worth supporting this buggy usage unless
this bug is common to some idiomatic schema (although even then, I'm not
sure whether we could harden the entire string/mem library to that
usage or even what changes that would imply).
Comment 11 Narayanan Iyer 2022-12-13 20:46:01 UTC
(In reply to Noah Goldstein from comment #10)

> Out of curiosity what is the use-case that causes this to happen? Is it
> idiomatic in some way?
> 
> I'm unhappy that the code causes a SIG-11 in this case, but changing
> it to harden against this case is not free for correct usage performance
> so I'm not really convinced it's worth supporting this buggy usage unless
> this bug is common to some idiomatic schema (although even then, I'm not
> sure whether we could harden the entire string/mem library to that
> usage or even what changes that would imply).

We implement a database called YottaDB. It uses optimistic concurrency control (also known as OCC) to implement transaction processing. See https://en.wikipedia.org/wiki/Optimistic_concurrency_control for more details.

Every process that is in a transaction reads database values (from shared memory) without holding a lock and tentatively prepare the needed changes in private memory. To do this, it needs to examine shared memory contents and it is here that we do the `memcmp()` calls which ended up with a SIG-11.

At the end of the transaction, we grab a lock and see if anything changed since we started the transaction and if not we proceed to commit. If we notice something changed, we restart the transaction. We minimize the use of locks using this approach to get faster transaction throughput.

As I had mentioned in a previous comment, this model has been in use by us for the past 25+ years in various architectures (x86_64, i386, RS6000, Alpha, VAX, Sparc, HPPA etc.) and operating systems (Linux, AIX, Tru64, Solaris, HPUX etc.) and has never caused a SIG-11 until now (i.e. glibc 2.36).

As you can see in the `Examples` section of the wikipedia link I mention above, there are a lot of databases that use OCC. It is likely all of them use similar techniques and are affected as well.
Comment 12 K.S. Bhaskar 2022-12-13 21:53:32 UTC
While I can't speak to the other databases, the YottaDB code base goes back much further than 25 years. Under the name GT.M, it first saw live production use in 1986. Its use of optimistic concurrency control was inspired by Kung & Robinson's seminal paper (https://www.eecs.harvard.edu/~htk/publication/1981-tods-kung-robinson.pdf). I personally know the original author of the code, as well as someone who has worked on the code base since the mid 1980s. In addition to those mentioned by Narayanan, the code has over the decades run on several other CPU architectures and operating systems.

I hope this suffices as a use case.
Comment 13 Noah Goldstein 2022-12-13 23:01:45 UTC
(In reply to K.S. Bhaskar from comment #12)
> While I can't speak to the other databases, the YottaDB code base goes back
> much further than 25 years. Under the name GT.M, it first saw live
> production use in 1986. Its use of optimistic concurrency control was
> inspired by Kung & Robinson's seminal paper
> (https://www.eecs.harvard.edu/~htk/publication/1981-tods-kung-robinson.pdf).
> I personally know the original author of the code, as well as someone who
> has worked on the code base since the mid 1980s. In addition to those
> mentioned by Narayanan, the code has over the decades run on several other
> CPU architectures and operating systems.
> 
> I hope this suffices as a use case.

Can you test the following dif:
```
@@ -308,7 +308,7 @@ L(ret_nonzero_vec_end_0):
  setg %dl
  leal -1(%rdx, %rdx), %eax
 #  else
- addl %edx, %eax
+ addq %rdx, %rax
  movzbl (VEC_SIZE * -1 + SIZE_OFFSET)(%rsi, %rax), %ecx
  movzbl (VEC_SIZE * -1 + SIZE_OFFSET)(%rdi, %rax), %eax
  subl %ecx, %eax

```

If that fixes things I'm okay making the change.

I'm opposed to explicitly supporting it, but am happy
to try and make this failure from the unsupported
usage less dramatic.
Comment 14 Narayanan Iyer 2022-12-13 23:16:01 UTC
(In reply to Noah Goldstein from comment #13)
> 
> Can you test the following dif:
> ```
> @@ -308,7 +308,7 @@ L(ret_nonzero_vec_end_0):
>   setg %dl
>   leal -1(%rdx, %rdx), %eax
>  #  else
> - addl %edx, %eax
> + addq %rdx, %rax
>   movzbl (VEC_SIZE * -1 + SIZE_OFFSET)(%rsi, %rax), %ecx
>   movzbl (VEC_SIZE * -1 + SIZE_OFFSET)(%rdi, %rax), %eax
>   subl %ecx, %eax
> 
> ```
> 
> If that fixes things I'm okay making the change.
> 
> I'm opposed to explicitly supporting it, but am happy
> to try and make this failure from the unsupported
> usage less dramatic.

This change sounds much better.

Is there a url that describes how to build glibc from source and use it in my application?
Comment 15 Noah Goldstein 2022-12-13 23:39:33 UTC
(In reply to Narayanan Iyer from comment #14)
> (In reply to Noah Goldstein from comment #13)
> > 
> > Can you test the following dif:
> > ```
> > @@ -308,7 +308,7 @@ L(ret_nonzero_vec_end_0):
> >   setg %dl
> >   leal -1(%rdx, %rdx), %eax
> >  #  else
> > - addl %edx, %eax
> > + addq %rdx, %rax
> >   movzbl (VEC_SIZE * -1 + SIZE_OFFSET)(%rsi, %rax), %ecx
> >   movzbl (VEC_SIZE * -1 + SIZE_OFFSET)(%rdi, %rax), %eax
> >   subl %ecx, %eax
> > 
> > ```
> > 
> > If that fixes things I'm okay making the change.
> > 
> > I'm opposed to explicitly supporting it, but am happy
> > to try and make this failure from the unsupported
> > usage less dramatic.
> 
> This change sounds much better.
> 
> Is there a url that describes how to build glibc from source and use it in
> my application?

I just built the current/fixed memcmp into a single library and used
LD_PRELOAD.

This fix seems to keep it from segfaulting although I wouldn't be
surprised if there are some edge cases in other impls / paths
around page boundaries where it can still segfault. 

I'll submit a patch with these changes although I'm not sure
its going to get backported.
Comment 16 Noah Goldstein 2022-12-14 00:14:57 UTC
(In reply to Narayanan Iyer from comment #14)
> (In reply to Noah Goldstein from comment #13)
> > 
> > Can you test the following dif:
> > ```
> > @@ -308,7 +308,7 @@ L(ret_nonzero_vec_end_0):
> >   setg %dl
> >   leal -1(%rdx, %rdx), %eax
> >  #  else
> > - addl %edx, %eax
> > + addq %rdx, %rax
> >   movzbl (VEC_SIZE * -1 + SIZE_OFFSET)(%rsi, %rax), %ecx
> >   movzbl (VEC_SIZE * -1 + SIZE_OFFSET)(%rdi, %rax), %eax
> >   subl %ecx, %eax
> > 
> > ```
> > 
> > If that fixes things I'm okay making the change.
> > 
> > I'm opposed to explicitly supporting it, but am happy
> > to try and make this failure from the unsupported
> > usage less dramatic.
> 
> This change sounds much better.
> 
> Is there a url that describes how to build glibc from source and use it in
> my application?

Posted a patch which should fix the issue.
Comment 17 Narayanan Iyer 2022-12-14 15:40:02 UTC
(In reply to Noah Goldstein from comment #16)
> 
> Posted a patch which should fix the issue.

Thank you very much for the prompt fix.

I will try to verify this in my system using the LD_PRELOAD trick you mentioned.
Comment 18 Narayanan Iyer 2022-12-14 18:17:21 UTC
(In reply to Narayanan Iyer from comment #17)
> 
> Thank you very much for the prompt fix.
> 
> I will try to verify this in my system using the LD_PRELOAD trick you
> mentioned.

I tried compiling just memcmp-sse2.S but encountered issues. So went ahead and rebuilt the full glibc without and with your fix. And verified that without your fix, the test program crashed reliably whereas with your fix the test program ran fine all the time.
Comment 19 Paul Pluzhnikov 2023-05-14 21:46:59 UTC
Fixed by:

Author: Noah Goldstein <goldstein.w.n@gmail.com>
Date:   Wed Dec 14 10:52:10 2022 -0800

    x86: Prevent SIGSEGV in memcmp-sse2 when data is concurrently modified [BZ #29863]
Comment 20 Paul Pluzhnikov 2023-05-14 21:48:54 UTC
(In reply to Paul Pluzhnikov from comment #19)
> Fixed by:
> 

commit b712be52645282c706a5faa038242504feb06db5

> Author: Noah Goldstein <goldstein.w.n@gmail.com>
> Date:   Wed Dec 14 10:52:10 2022 -0800
> 
>     x86: Prevent SIGSEGV in memcmp-sse2 when data is concurrently modified
> [BZ #29863]