Bug 26867 - FAIL: gdb.threads/signal-sigtrap.exp: sigtrap thread 1: signal SIGTRAP reaches handler
Status: RESOLVED FIXED
Alias: None
Product: gdb
Classification: Unclassified
Component: testsuite
Version: HEAD
Importance: P2 normal
Target Milestone: 16.1
Assignee: Not yet assigned to anyone
URL:
Keywords:
Duplicates: 28347 28468 31602
Depends on:
Blocks:
 
Reported: 2020-11-11 16:30 UTC by Tom de Vries
Modified: 2024-09-23 07:55 UTC

See Also:
Host:
Target:
Build:
Last reconfirmed:


Description Tom de Vries 2020-11-11 16:30:46 UTC
I ran into this FAIL:
...
(gdb) PASS: gdb.threads/signal-sigtrap.exp: sigtrap thread 1: switch to sigtrap thread
signal SIGTRAP^M
Continuing with signal SIGTRAP.^M
^M
Thread 1 "signal-sigtrap" received signal SIGTRAP, Trace/breakpoint trap.^M
0x00007ffff7fabc2a in pthread_create@@GLIBC_2.2.5 () from /lib64/libpthread.so.0^M
(gdb) FAIL: gdb.threads/signal-sigtrap.exp: sigtrap thread 1: signal SIGTRAP reaches handler
...

I managed to reproduce it on the command line, like once every couple of tries.

This is on openSUSE Tumbleweed, with glibc 2.32.

I tried the same on openSUSE Leap 15.2 with glibc 2.26, but it doesn't reproduce there.
Comment 1 Tom de Vries 2020-11-11 16:44:41 UTC
(In reply to Tom de Vries from comment #0)
> I managed to reproduce it on the command line, like once every couple of
> tries.
> 

To be more specific, it reproduces when run in conjunction with "stress -c 5" on a dual-core, four-thread CPU:
...
$ for n in $(seq 1 10); do echo -n "$n: "; gdb -batch ./outputs/gdb.threads/signal-sigtrap/signal-sigtrap -ex "break thread_function" -ex run -ex "info threads" -ex "break sigtrap_handler" -ex "thread 1" -ex "signal SIGTRAP" 2>&1 | egrep "hit Breakpoint 2| received signal"; done
1: Thread 1 "signal-sigtrap" received signal SIGTRAP, Trace/breakpoint trap.
2: Thread 1 "signal-sigtrap" received signal SIGTRAP, Trace/breakpoint trap.
3: Thread 1 "signal-sigtrap" hit Breakpoint 2, sigtrap_handler (sig=5) at /data/gdb_versions/devel/src/gdb/testsuite/gdb.threads/signal-sigtrap.c:26
4: Thread 1 "signal-sigtrap" hit Breakpoint 2, sigtrap_handler (sig=5) at /data/gdb_versions/devel/src/gdb/testsuite/gdb.threads/signal-sigtrap.c:26
5: Thread 1 "signal-sigtrap" hit Breakpoint 2, sigtrap_handler (sig=5) at /data/gdb_versions/devel/src/gdb/testsuite/gdb.threads/signal-sigtrap.c:26
6: Thread 1 "signal-sigtrap" received signal SIGTRAP, Trace/breakpoint trap.
7: Thread 1 "signal-sigtrap" received signal SIGTRAP, Trace/breakpoint trap.
8: Thread 1 "signal-sigtrap" hit Breakpoint 2, sigtrap_handler (sig=5) at /data/gdb_versions/devel/src/gdb/testsuite/gdb.threads/signal-sigtrap.c:26
9: Thread 1 "signal-sigtrap" hit Breakpoint 2, sigtrap_handler (sig=5) at /data/gdb_versions/devel/src/gdb/testsuite/gdb.threads/signal-sigtrap.c:26
10: Thread 1 "signal-sigtrap" received signal SIGTRAP, Trace/breakpoint trap.
...
Comment 2 Tom de Vries 2020-11-12 14:08:08 UTC
A very similar failure occurs here:
...
(gdb) PASS: gdb.threads/signal-command-handle-nopass.exp: step-over yes: thread 1 selected
signal SIGUSR1^M
Continuing with signal SIGUSR1.^M
^M
Thread 1 "signal-command-" received signal SIGUSR1, User defined signal 1.^M
0x00007ffff7fabc2a in pthread_create@@GLIBC_2.2.5 () from /lib64/libpthread.so.0^M
(gdb) FAIL: gdb.threads/signal-command-handle-nopass.exp: step-over yes: signal SIGUSR1
...
Comment 4 Tom de Vries 2024-08-29 08:38:44 UTC
*** Bug 28347 has been marked as a duplicate of this bug. ***
Comment 5 Tom de Vries 2024-08-29 13:49:58 UTC
*** Bug 28468 has been marked as a duplicate of this bug. ***
Comment 6 Tom de Vries 2024-09-06 13:35:02 UTC
*** Bug 31602 has been marked as a duplicate of this bug. ***
Comment 7 Sourceware Commits 2024-09-23 07:53:51 UTC
The master branch has been updated by Tom de Vries <vries@sourceware.org>:

https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;h=dca18cb6a100fa87d6478a465e48ddc1d9ed943a

commit dca18cb6a100fa87d6478a465e48ddc1d9ed943a
Author: Tom de Vries <tdevries@suse.de>
Date:   Mon Sep 23 09:53:54 2024 +0200

    [gdb/testsuite] Fix failure in gdb.threads/signal-sigtrap.exp
    
    The test-case gdb.threads/signal-sigtrap.exp:
    - installs a signal handler called sigtrap_handler for SIGTRAP,
    - sets a breakpoint on sigtrap_handler, and
    - expects the breakpoint to trigger after issuing "signal SIGTRAP".
    
    Usually, that is indeed what happens:
    ...
    (gdb) signal SIGTRAP^M
    Continuing with signal SIGTRAP.^M
    ^M
    Thread 1 "signal-sigtrap" hit Breakpoint 2, sigtrap_handler (sig=5)^M
    28      }^M
    (gdb) PASS: $exp: sigtrap thread 1: signal SIGTRAP reaches handler
    ...
    
    Occasionally, I run into this failure on openSUSE Tumbleweed:
    ...
    (gdb) signal SIGTRAP^M
    Continuing with signal SIGTRAP.^M
    ^M
    Thread 1 "signal-sigtrap" received signal SIGTRAP, Trace/breakpoint trap.^M
    __pthread_create_2_1 () at pthread_create.c:843^M
    (gdb) FAIL: $exp: sigtrap thread 1: signal SIGTRAP reaches handler
    ...
    
    AFAIU, the problem lies in the situation that is set up before issuing that
    command, by running to a breakpoint in thread_function:
    ...
    void *thread_function (void *arg) {
      return NULL;
    }
    int main (void) {
      pthread_t child_thread;
      signal (SIGTRAP, sigtrap_handler);
      pthread_create (&child_thread, NULL, thread_function, NULL);
      pthread_join (child_thread, NULL);
      return 0;
    }
    ...
    
    In the passing case, thread 2 is stopped in thread_function, and thread 1 is
    stopped somewhere in pthread_join:
    ...
    (gdb) info threads^M
      Id   Target Id                                          Frame ^M
      1    Thread ... (LWP ...) "signal-sigtrap" __futex_abstimed_wait_common64 ()
    * 2    Thread ... (LWP ...) "signal-sigtrap" thread_function ()
    ...
    
    In the failing case, thread 2 is stopped in thread_function, but thread 1 is
    stopped somewhere in pthread_create:
    ...
    (gdb) info threads^M
      Id   Target Id                                          Frame ^M
      1    Thread ... (LWP ...) "signal-sigtrap" __GI___clone3 ()
    * 2    Thread ... (LWP ...) "signal-sigtrap" thread_function ()
    ...
    
    What I think happens is that pthread_create blocks SIGTRAP at some point, and
    if the "signal SIGTRAP" command is issued while that is the case, the signal
    becomes pending and consequently there's no longer a guarantee that the signal
    will be delivered to the inferior.
    
    Instead the signal will be handled by gdb like this:
    ...
    (gdb) info signals SIGTRAP
    Signal        Stop      Print   Pass to program Description
    SIGTRAP       Yes       Yes     No              Trace/breakpoint trap
    ...
    
    Fix this by adding a barrier that ensures that pthread_create is done before
    we issue the "signal SIGTRAP" command.
    
    Likewise in test-case gdb.threads/signal-command-handle-nopass.exp.
    
    Using the fixed test-case, I tested my theory by explicitly blocking SIGTRAP:
    ...
    +  sigset_t old_ss, new_ss;
    +  sigemptyset (&new_ss);
    +  sigaddset (&new_ss, SIGTRAP);
    +  sigprocmask (SIG_BLOCK, &new_ss, &old_ss);
    +
       /* Make sure that pthread_create is done once the breakpoint on
          thread_function triggers.  */
       pthread_barrier_wait (&barrier);
    
       pthread_join (child_thread, NULL);
    +  sigprocmask (SIG_SETMASK, &old_ss, NULL);
    ...
    and managed to reproduce the same failure:
    ...
    (gdb) signal SIGTRAP^M
    Continuing with signal SIGTRAP.^M
    [Thread 0x7ffff7c00700 (LWP 13254) exited]^M
    ^M
    Thread 1 "signal-sigtrap" received signal SIGTRAP, Trace/breakpoint trap.^M
    0x00007ffff7c80056 in __GI___sigprocmask () sigprocmask.c:39^M
    (gdb) FAIL: $exp: sigtrap thread 1: signal SIGTRAP reaches handler
    ...
    
    Tested on x86_64-linux.
    
    PR testsuite/26867
    Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=26867
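
The theory in the commit message rests on ordinary POSIX behaviour: a signal sent while it is blocked is not delivered but stays pending until it is unblocked. The following is a minimal sketch of just that part, outside any debugger. It is not taken from the commit. The function name blocked_signal_stays_pending is hypothetical, and SIGUSR1 is used instead of SIGTRAP purely as an assumption, to keep the demo independent of breakpoint handling. It calls sigprocmask, so it is meant to be run from a single-threaded process.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stddef.h>

/* Set by the handler; sig_atomic_t is safe to write from a signal handler.  */
static volatile sig_atomic_t handler_ran = 0;

static void
usr1_handler (int sig)
{
  (void) sig;
  handler_ran = 1;
}

/* Hypothetical helper, for illustration only.
   Returns 1 if the signal stayed pending while blocked and reached the
   handler only after it was unblocked; returns 0 otherwise.  */
int
blocked_signal_stays_pending (void)
{
  sigset_t new_ss, old_ss, pending;

  handler_ran = 0;
  if (signal (SIGUSR1, usr1_handler) == SIG_ERR)
    return 0;

  /* Block the signal, as the commit message suspects pthread_create
     does temporarily.  */
  sigemptyset (&new_ss);
  sigaddset (&new_ss, SIGUSR1);
  if (sigprocmask (SIG_BLOCK, &new_ss, &old_ss) != 0)
    return 0;

  /* Send the signal to ourselves while it is blocked.  */
  if (raise (SIGUSR1) != 0)
    return 0;

  /* The signal should now be pending, and the handler should not have run.  */
  if (sigpending (&pending) != 0)
    return 0;
  int was_pending = sigismember (&pending, SIGUSR1) == 1;
  int ran_while_blocked = handler_ran;

  /* Unblock it: the pending signal is delivered before sigprocmask returns.
     Then restore the original mask.  */
  sigprocmask (SIG_UNBLOCK, &new_ss, NULL);
  sigprocmask (SIG_SETMASK, &old_ss, NULL);

  return was_pending && !ran_while_blocked && handler_ran;
}
```

Calling blocked_signal_stays_pending () from main shows only the pending-then-delivered sequence. What gdb does when the pending signal is eventually reported to it is described in the commit message, not demonstrated here.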
Comment 8 Tom de Vries 2024-09-23 07:55:04 UTC
Fixed.
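
The barrier approach described in the commit message can be sketched roughly as follows. This is not the committed test source. The main-thread side mirrors the lines quoted in the commit (pthread_barrier_wait between pthread_create and pthread_join). The child-side wait, the wrapper name thread_wrapper and the function name run_with_barrier are assumptions made for illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>
#include <stddef.h>

static pthread_barrier_t barrier;

/* In the test-case, the breakpoint is set on this function.  */
static void *
thread_function (void *arg)
{
  return arg;
}

/* Hypothetical wrapper (an assumption, not from the commit): wait at the
   barrier before entering thread_function, so that a breakpoint on
   thread_function can only trigger once the main thread has returned
   from pthread_create.  */
static void *
thread_wrapper (void *arg)
{
  pthread_barrier_wait (&barrier);
  return thread_function (arg);
}

/* Hypothetical driver standing in for main.  Returns 0 on success,
   -1 if any pthread call fails.  */
int
run_with_barrier (void)
{
  pthread_t child_thread;

  /* Two participants: the main thread and the child thread.  */
  if (pthread_barrier_init (&barrier, NULL, 2) != 0)
    return -1;

  if (pthread_create (&child_thread, NULL, thread_wrapper, NULL) != 0)
    {
      pthread_barrier_destroy (&barrier);
      return -1;
    }

  /* Make sure that pthread_create is done once the breakpoint on
     thread_function triggers.  */
  pthread_barrier_wait (&barrier);

  if (pthread_join (child_thread, NULL) != 0)
    return -1;

  pthread_barrier_destroy (&barrier);
  return 0;
}
```

With this arrangement the child cannot reach thread_function until the main thread has also arrived at the barrier, which it does only after pthread_create has returned. On older glibc versions the sketch needs to be compiled with -pthread.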