Bug 30387 - gdbserver assert error on arm platform
Summary: gdbserver assert error on arm platform
Status: RESOLVED FIXED
Alias: None
Product: gdb
Classification: Unclassified
Component: server
Version: HEAD
Importance: P2 critical
Target Milestone: 14.1
Assignee: Kevin Buettner
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2023-04-25 06:43 UTC by Yan, Zhiyong
Modified: 2023-08-12 04:02 UTC
CC List: 5 users

See Also:
Host:
Target:
Build:
Last reconfirmed: 2023-05-02 00:00:00


Attachments
This app can produce gdbserver assert error on arm platform (1.99 KB, application/x-tar)
2023-04-25 06:43 UTC, Yan, Zhiyong
gdb test script (223 bytes, text/plain)
2023-04-25 06:55 UTC, Yan, Zhiyong
patch file for gdbserver/linux-low.cc (658 bytes, application/mbox)
2023-04-25 07:01 UTC, Yan, Zhiyong
patch review mail is attached. (157.69 KB, application/x-ole-storage)
2023-05-05 02:20 UTC, Yan, Zhiyong

Description Yan, Zhiyong 2023-04-25 06:43:52 UTC
Created attachment 14848 [details]
This app can produce gdbserver assert error on arm platform

Hi,
   The attached gdbserver-test-app.tar reproduces a gdbserver assertion failure on the arm platform. The issue occurs with every gdb version up to and including the current git master branch. The version information and error message are below. I will add the reproduction steps in the next comment.

Best Regards.
Zhiyong
   

root@xilinx-zynq:~# gdb -v
GNU gdb (GDB) 14.0.50.20230421-git
Copyright (C) 2023 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
root@xilinx-zynq:~#


root@xilinx-zynq:~# gdbserver --version
GNU gdbserver (GDB) 14.0.50.20230421-git
Copyright (C) 2023 Free Software Foundation, Inc.
gdbserver is free software, covered by the GNU General Public License.
This gdbserver was configured as "arm-wrs-linux-gnueabi"


2389.242477   [threads] resume_one_lwp_throw: Resuming lwp 448 (continue, signal 0, stop not expected)
../../git/gdbserver/linux-low.cc:2448: A problem internal to GDBserver has been detected.
maybe_hw_step: Assertion `has_single_step_breakpoints (thread)' failed.
Aborted
Comment 1 Yan, Zhiyong 2023-04-25 06:54:36 UTC
Reproduction steps
(1) Run "tar xvf gdbserver-test-app.tar" on a host that can cross-compile for arm.
(2) In osm.service, adjust the ExecStart path to match your environment.
(3) make
(4) Referring to the "make install" target, install the osm systemd service on the target board.


[On target board]
systemctl daemon-reload
systemctl start osm
gdbserver --debug --debug-format=all --remote-debug --event-loop-debug --once --attach :1234 $(pgrep osm)

[On pc host]
your-arm-gdb ./osm -x ~/gdbx2    (osm is the test app built above)

gdbx2 can be found in the attachments; in gdbx2, modify the target-remote setting to point to the gdbserver on your target board.

When gdb executes gdbx2, gdbserver asserts on the target board.
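
For context, the scenario this thread converges on (see comments 6 and 25) is a program in which one thread repeatedly forks and execs while the thread being debugged is stepped with "next". The sketch below is a hypothetical minimal program of that shape, written for illustration only; it is not the attached gdbserver-test-app, and all names and timings in it are invented.

// Hypothetical illustration only: this is NOT the attached gdbserver-test-app,
// just a minimal sketch of the scenario described in this bug, where one
// thread repeatedly forks and execs while another thread is stepped with
// "next".  All names and timings here are invented.
#include <sys/wait.h>
#include <unistd.h>
#include <atomic>
#include <thread>

static std::atomic<bool> done{false};

// Second thread: fork children in a loop; each child execs a trivial command.
// The resulting fork/exec events are what interrupt the step in the main thread.
static void forker ()
{
  while (!done)
    {
      pid_t child = fork ();
      if (child == 0)
        {
          execl ("/bin/true", "true", (char *) nullptr);
          _exit (1);                     /* only reached if exec fails */
        }
      waitpid (child, nullptr, 0);
      usleep (1000);
    }
}

int main ()
{
  std::thread t (forker);

  // In the debugger, repeatedly run "next" over the two lines inside this
  // loop.  On a software single-step target such as 32-bit Arm, this is the
  // kind of stepping during which gdbserver asserted before the fix.
  volatile int counter = 0;
  for (int i = 0; i < 10000; i++)
    {
      counter = counter + 1;             /* "next" here */
      usleep (1000);                     /* and here */
    }

  done = true;
  t.join ();
  return 0;
}

The committed test case, gdb.threads/next-fork-exec-other-thread.exp (comment 25), exercises the same kind of program.
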
Comment 2 Yan, Zhiyong 2023-04-25 06:55:32 UTC
Created attachment 14849 [details]
gdb test script
Comment 3 Yan, Zhiyong 2023-04-25 07:01:38 UTC
Created attachment 14850 [details]
patch file for gdbserver/linux-low.cc
Comment 4 Yan, Zhiyong 2023-04-25 07:02:32 UTC
After applying 0001-arm-Install-single-step-software-breakpoing.patch, gdbserver no longer asserts.
Comment 5 Luis Machado 2023-05-02 07:11:35 UTC
Thanks for the report. Let me try to reproduce this one.
Comment 6 Luis Machado 2023-05-02 14:17:40 UTC
Confirmed. I managed to reproduce this. Might be related to targets with software single stepping and forks happening during stepping.

Sometimes it takes a few more steps than in the reproducer before the failure shows up; other times it happens exactly as in the reproducer.

Would you mind sending the patch upstream (gdb-patches@sourceware.org) for further discussion?
Comment 7 Yan, Zhiyong 2023-05-04 09:17:41 UTC
Hi Luis,
   I have already submitted the patch at https://savannah.gnu.org/patch/index.php?10337.
   Do you mean I must send the patch to gdb-patches@sourceware.org by mail? What information should I provide in the mail?

Best Regards
Zhiyong
Comment 8 Luis Machado 2023-05-04 10:45:50 UTC
Yes, patches should be sent by e-mail to the gdb-patches@sourceware.org mailing list, where they will go through review / discussion / approval.
Comment 9 Yan, Zhiyong 2023-05-05 02:20:31 UTC
Created attachment 14862 [details]
patch review mail is attached.

Comment 10 Yan, Zhiyong 2023-05-05 02:22:38 UTC
Hi Luis,
   I have sent the patch file and a debug log with a brief analysis to gdb-patches@sourceware.org.

   I have also attached that mail to this PR.

Best Regards.
Zhiyong
Comment 11 Yan, Zhiyong 2023-05-10 07:41:13 UTC
Hi Luis,
   I sent the patch to gdb-patches@sourceware.org five days ago, but I have not received a reply. Do you know how it is going?

Best Regards.
Zhiyong
Comment 12 Luis Machado 2023-05-10 07:52:08 UTC
Hi. There isn't an ETA for when it will get reviewed, so a little patience is required. Given it is a change to generic code, the global maintainers need to go through it to make sure it is suitable. I'm still giving this a try to understand if the patch is the right way to go.

I haven't forgotten about it.
Comment 13 Luis Machado 2023-05-10 07:52:32 UTC
I wonder if this also reproduces on x86. If so, that would make a better case for urgency of fixing this.
Comment 14 Yan, Zhiyong 2023-05-10 09:16:16 UTC
(In reply to Luis Machado from comment #13)
> I wonder if this also reproduces on x86. If so, that would make a better
> case for urgency of fixing this.

This issue can't be reproduced on x86. I think that is because x86 supports hardware breakpoints. If gdbserver were made to use software breakpoints on x86, this issue could be reproduced.

I am not in a hurry to merge the patch upstream; I am just trying to move it through the formal process, as our customer hopes the fix will be carried in an official gdb release.
Comment 15 Simon Marchi 2023-05-11 01:56:16 UTC
(In reply to Yan, Zhiyong from comment #14)
> (In reply to Luis Machado from comment #13)
> > I wonder if this also reproduces on x86. If so, that would make a better
> > case for urgency of fixing this.
> 
> This issue can't be reproduced on x86. I think that is because x86 supports
> hardware breakpoints. If gdbserver were made to use software breakpoints on
> x86, this issue could be reproduced.

I think you mean hardware single stepping, not hardware breakpoint.  GDB uses software breakpoints for x86 as well as ARM.  But GDB uses hardware single stepping for x86, and software single stepping for ARM (meaning it inserts a breakpoint at the next instruction(s) and resumes when it wants to single step).
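
To make the difference concrete, here is a toy model of the software single-step mechanism described above. It is written for this report rather than taken from GDB or gdbserver; all names are invented, and the "program" is just eight consecutive pc values. It shows the plant-a-breakpoint-at-the-next-pc-then-resume idea, and why a missing single-step breakpoint means the resume does not stop after one instruction, which is essentially what the assertion in maybe_hw_step guards against.

// Toy model of software single-stepping.  Nothing here is GDB or gdbserver
// code; all names are invented and the "program" is eight consecutive pcs.
#include <cstdio>
#include <set>

static const int program_end = 8;        // toy program: "instructions" at pc 0..7
static std::set<int> breakpoints;        // addresses with a breakpoint inserted

static void insert_single_step_breakpoint (int pc) { breakpoints.insert (pc); }
static void remove_single_step_breakpoints () { breakpoints.clear (); }

// "Resume" the toy inferior: execute instructions one after another until it
// lands on an address that has a breakpoint inserted, or runs off the end.
static int resume (int pc)
{
  while (++pc < program_end)
    if (breakpoints.count (pc))
      return pc;                         // trapped on an inserted breakpoint
  return -1;                             // inferior ran to completion
}

// Software single-step: work out where control goes next (a real debugger
// decodes the current instruction; in this toy it is simply pc + 1), plant a
// temporary breakpoint there, resume, and remove the breakpoint afterwards.
static int software_single_step (int pc)
{
  insert_single_step_breakpoint (pc + 1);
  int stop_pc = resume (pc);
  remove_single_step_breakpoints ();
  return stop_pc;
}

int main ()
{
  // With the temporary breakpoint in place, every "step" stops one pc later.
  for (int pc = 0; pc >= 0; pc = software_single_step (pc))
    printf ("stepped, stopped at pc %d\n", pc);

  // If the single-step breakpoint is missing (the failure mode behind this
  // bug), the same resume does not stop after one instruction at all.
  printf ("resume with no single-step breakpoint stops at pc %d\n", resume (0));
  return 0;
}

gdbserver's real Arm implementation decodes the current instruction to find the possible next pc(s) and may insert more than one breakpoint; the bug here is that those breakpoints were not reinserted when the step was interrupted by a signal in another thread, as the fix in comment 25 explains.
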
Comment 16 Yan, Zhiyong 2023-05-11 02:50:11 UTC
(In reply to Simon Marchi from comment #15)
> (In reply to Yan, Zhiyong from comment #14)
> > (In reply to Luis Machado from comment #13)
> > > I wonder if this also reproduces on x86. If so, that would make a better
> > > case for urgency of fixing this.
> > 
> > This issue can't be reproduced on x86. I think that is because x86 supports
> > hardware breakpoints. If gdbserver were made to use software breakpoints on
> > x86, this issue could be reproduced.
> 
> I think you mean hardware single stepping, not hardware breakpoint.  GDB
> uses software breakpoints for x86 as well as ARM.  But GDB uses hardware
> single stepping for x86, and software single stepping for ARM (meaning it
> inserts a breakpoint at the next instruction(s) and resumes when it wants to
> single step).

Yes, it is supports_hardware_single_step, not hardware breakpoints. I made a mistake.
Comment 17 Joel Brobecker 2023-07-05 14:25:13 UTC
hi guys,

What's the status of this PR?

And do you guys confirm that this PR is blocking for a GDB 14.1 release? If yes, can you explain what the rationale for it is? (impact, is it a regression, etc).

Thank you!
Comment 18 Luis Machado 2023-07-05 14:33:38 UTC
Hi,

The internal error blocks further debugging. It doesn't always happen, but happens reliably enough that we should fix it.

Yan Zhiyong sent the patch to gdb-patches and is waiting for feedback. It is a change to generic code, so it would require one of the global maintainers to OK it.
Comment 19 Joel Brobecker 2023-07-05 15:21:01 UTC
Thanks for explaining the situation, Luis. So let's indeed leave things as they are, then.
Comment 20 Kevin Buettner 2023-08-08 02:59:55 UTC
The most recent patch & gdb test case can be found here:

https://sourceware.org/pipermail/gdb-patches/2023-August/201314.html

I plan to wait several more days for comments prior to pushing it.
Comment 21 Luis Machado 2023-08-11 16:45:34 UTC
Hi,

Thanks for the patch.

A few comments.

When running this test in a 32-bit docker instance on 64-bit hardware, against the native-gdbserver board, I'm seeing FAILs:

# of expected passes            3074
# of unexpected failures        13

The FAILs are of this kind:

FAIL: gdb.threads/next-fork-exec-other-thread.exp: fork_func=fork: target-non-stop=on: non-stop=off: displaced-stepping=off: i=0: next to other line

They fail for both a patched and an unpatched gdb. But I see 12 unexpected core files for the unpatched gdb.

It passes for both patched and unpatched native gdb though. So it might not be exercising the bug there.

Also, I gave the testcase a try, and I noticed it takes a fairly long time to run, both on 32-bit and 64-bit.

Is it a timing issue of some kind, such that we need a lot of iterations for it to show up?

From the testcase description, it looks like this is mostly a software single-step issue and on gdbserver. Should we isolate the tests to gdbserver and known targets using software single-step instead of making all targets run the test only to potentially PASS every time?

So, in summary:

* For native gdb (32-bit or 64-bit), this doesn't seem to be exercising the bug.

* For native-gdbserver on 64-bit, it doesn't seem to exercise the bug.

* For native-gdbserver on 32-bit, I see unexpected core files, which indicates it does exercise the bug, but I also see unexpected failures, which may be something off with the testcase patterns.
Comment 22 Kevin Buettner 2023-08-11 17:33:27 UTC
(In reply to Luis Machado from comment #21)

Hi Luis,

Regarding the length of time that it takes to run this test case: 
I've seen some instances where it took in excess of 100 iterations to
observe the software single-step related failure.  The number of
iterations is currently set to 200, but we might drop it to 150, which
should decrease the time needed to run the test by about 25%.  We
could also eliminate one or more of the outer loops, but this would
decrease coverage of (e.g.) the target-non-stop=on case.  Doing so,
however, would likely eliminate the (other) failures that you've
observed.

Regarding whether or not to run the test for native targets: It
is true that this test was written after finding a bug in the software
single-step related code in gdbserver.  However, it seems to me that
it'd be possible to introduce code causing buggy behavior for native
gdb, whether it requires software single-step or not.  Therefore, I'd
prefer to run it for all targets.

Regarding unexpected failures for a patched gdbserver:  I see these
too, and do not know why they occur.  I've only seen them for the
"target-non-stop=on" case(s), and they do seem to be somewhat racy in
nature.  I recall one test run where there were no failures, but I
usually see 2 when testing on a Raspberry Pi.  I think it's likely
that there's another bug that needs fixing.  We could setup a KFAIL
for that particular case.  Or, if we were to eliminate testing
target-non-stop=on, those failures would also go away.  (I didn't
set up a KFAIL for it because leaving them as FAIL means that someone
is more likely to try to fix the bug...)
Comment 23 Luis Machado 2023-08-11 17:59:00 UTC
Kevin,

Thanks for the info. As long as the tests are covering/testing possible breakages, that sounds fine to me then.

My take on it is that non-stop mode causes outputs to be somewhat chaotic in their order, so it is hard to account for those in a deterministic way. Well, at least with the current infrastructure.
Comment 24 Kevin Buettner 2023-08-12 02:38:40 UTC
I did some testing and found that with the loop count (for the 'next'
commands) set to 200, it would take my Raspberry Pi, using
--target_board=native-gdbserver, nearly seven minutes to run the new
test, gdb.threads/next-fork-exec-other-thread.exp.  My macbook running
F38 would take nearly a minute and a half while an x86-64 VM also
running F38 would run the test in a little under a minute.

After a bunch of testing, I settled on changing that loop count to
30.  This would still reliably reproduce the bug that Zhiyong had
reported, but also finished considerably more quickly.  The Raspberry
Pi would finish in under a minute and a half while the macbook and
the x86-64 VM would finish in around 15 seconds. Native testing for
these targets completes in less than 10 seconds.

Therefore, in the interest of not causing overall testing to slow
down too much, I've reduced the loop count from 200 to 30.

My complete findings are below - stop here for the TLDR version!...

Command used for native-gdbserver:

time make check RUNTESTFLAGS="--target_board=native-gdbserver" TESTS=gdb.threads/next-fork-exec-other-thread.exp

Command used for native:

time make check RUNTESTFLAGS="" TESTS=gdb.threads/next-fork-exec-other-thread.exp

200 iterations:

rpi, unpatched, native-gdbserver: 2m51.976s (14 failures reported)
rpi, patched, native-gdbserver:   6m48.753s (2 failures reported)
rpi, unpatched, native:           2m11.673s 
rpi, patched, native:             2m9.436s

macbook, unpatched, native-gdbserver: 1m24.372s
macbook, patched, native-gdbserver:   1m26.793s
macbook, unpatched, native:           0m17.314s
macbook, patched, native:             0m18.017s

f38-1, unpatched, native-gdbserver: 0m55.265s
f38-1, patched, native-gdbserver:   0m52.767s
f38-1, unpatched, native:           0m23.419s
f38-1, patched, native:             0m23.119s

150 iterations:

rpi, unpatched, native-gdbserver: 2m31.826s (13 failures reported)
rpi, patched, native-gdbserver:   5m3.467s (2 failures reported)
rpi, unpatched, native:           1m43.856s
rpi, patched, native:             1m41.656s

macbook, unpatched, native-gdbserver: 1m9.472s
macbook, patched, native-gdbserver:   1m13.066s
macbook, unpatched, native:           0m13.646s
macbook, patched, native:             0m13.460s

f38-1, unpatched, native-gdbserver: 0m42.931s
f38-1, patched, native-gdbserver:   0m42.484s
f38-1, unpatched, native:           0m20.117s
f38-1, patched, native:             0m19.720s

100 iterations:

rpi, unpatched, native-gdbserver: 2m4.937s (13 failures reported)
rpi, patched, native-gdbserver:   3m56.154s (1 failure reported)
rpi, unpatched, native:           1m16.185s
rpi, patched, native:             1m14.161s

macbook, unpatched, native-gdbserver: 0m44.304s
macbook, patched, native-gdbserver:   0m41.998s
macbook, unpatched, native:           0m9.782s
macbook, patched, native:             0m10.400s

f38-1, unpatched, native-gdbserver: 0m30.188s
f38-1, patched, native-gdbserver:   0m30.122s
f38-1, unpatched, native:           0m15.375s
f38-1, patched, native:             0m15.306s

50 iterations:

rpi, unpatched, native-gdbserver: 1m22.541s (13 failures reported)
rpi, patched, native-gdbserver:   1m4.468s ( 1 failure reported)
rpi, unpatched, native:           0m48.767s
rpi, patched, native:             0m46.831s

macbook, unpatched, native-gdbserver: 0m25.266s
macbook, patched, native-gdbserver:   0m24.684s
macbook, unpatched, native:           0m6.302s
macbook, patched, native:             0m6.542s

f38-1, unpatched, native-gdbserver: 0m19.392s
f38-1, patched, native-gdbserver:   0m19.449s
f38-1, unpatched, native:           0m11.191s
f38-1, patched, native:             0m11.409s

30 iterations:

rpi, unpatched, native-gdbserver: 1m3.633s (12 failures reported)
rpi, patched, native-gdbserver:   1m27.072s (0 failures reported!)
rpi, unpatched, native:           0m37.846s
rpi, patched, native:             0m35.950s

macbook, unpatched, native-gdbserver: 0m14.870s
macbook, patched, native-gdbserver:   0m14.537s
macbook, unpatched, native:           0m4.605s
macbook, patched, native:             0m4.770s

f38-1, unpatched, native-gdbserver: 0m15.249s
f38-1, patched, native-gdbserver:   0m14.830s
f38-1, unpatched, native:           0m9.762s
f38-1, patched, native:             0m9.674s

20 iterations:

rpi, unpatched, native-gdbserver: 0m53.582s (11 failures reported)
rpi, patched, native-gdbserver:   1m5.585s (0 failures reported)
rpi, unpatched, native:           0m32.360s
rpi, patched, native:             0m30.554s

macbook, unpatched, native-gdbserver: 0m10.432s
macbook, patched, native-gdbserver:   0m10.776s
macbook, unpatched, native:           0m4.029s
macbook, patched, native:             0m4.189s

f38-1, unpatched, native-gdbserver: 0m12.492s
f38-1, patched, native-gdbserver:   0m12.477s
f38-1, unpatched, native:           0m8.801s
f38-1, patched, native:             0m8.729s

Back to 30 iterations, only rpi w/ native-gdbserver, multiple runs:

rpi, unpatched, native-gdbserver:

0m51.597s	: 13 failures, 12 core files
0m54.998s	: 13 failures, 12 core files
1m0.335s	: 12 failures, 12 core files
0m54.722s	: 12 failures, 11 core files
0m55.992s	: 12 failures, 12 core files

rpi, patched, native-gdbserver:

1m27.186s	: no failures
1m27.660s	: no failures
1m28.207s	: no failures
1m26.833s	: no failures
1m27.291s	: no failures

But note that no failures noted above doesn't mean that there isn't a
problem!  We just haven't iterated through enough GDB 'next' commands
to see it.
Comment 25 Sourceware Commits 2023-08-12 03:54:29 UTC
The master branch has been updated by Kevin Buettner <kevinb@sourceware.org>:

https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;h=b6d8d612d30dcdfc8ba8edfb15b4cd1753b0b8a2

commit b6d8d612d30dcdfc8ba8edfb15b4cd1753b0b8a2
Author: Kevin Buettner <kevinb@redhat.com>
Date:   Tue Aug 1 13:33:24 2023 -0700

    gdbserver: Reinstall software single-step breakpoints in resume_stopped_resumed_lwps
    
    At the moment, while performing a software single-step, gdbserver fails
    to reinsert software single-step breakpoints for a LWP when
    interrupted by a signal in another thread.  This commit fixes this
    problem by reinstalling software single-step breakpoints in
    linux_process_target::resume_stopped_resumed_lwps in
    gdbserver/linux-low.cc.
    
    This bug was discovered due to a failing assert in maybe_hw_step()
    in gdbserver/linux-low.cc.  Looking at the backtrace revealed
    that the caller was linux_process_target::resume_stopped_resumed_lwps.
    I was uncertain whether the assert should still be valid when called
    from that method, so I tried hoisting the assert from maybe_hw_step
    to all callers except resume_stopped_resumed_lwps.  But running the
    new test case, described below, showed that merely eliminating the
    assert for this case was NOT a good fix - a study of the log file for
    the test showed that the single-step operation failed to occur.
    Instead GDB (via gdbserver) stopped at the next breakpoint that was
    hit.
    
    Zhiyong Yan had proposed a fix which reinserted software single-step
    breakpoints, albeit at a different location in linux-low.cc.  Testing
    revealed that, while running gdb.threads/pending-fork-event-detach,
    the executable associated with that test would die due to a SIGTRAP
    after the test program was detached.  Examination of the core file(s)
    showed that a breakpoint instruction had been left in program memory.
    Test results were otherwise very good, so Zhiyong was definitely on
    the right track!
    
    This commit causes software single-step breakpoint(s) to be inserted
    before the call to maybe_hw_step in resume_stopped_resumed_lwps.  This
    will cause 'has_single_step_breakpoints (thread)' to be true, so that
    the assert in maybe_hw_step...
    
          /* GDBserver must insert single-step breakpoint for software
             single step.  */
          gdb_assert (has_single_step_breakpoints (thread));
    
    ...will no longer fail.  And better still, the single-step breakpoints
    are reinstalled, so that stepping will actually work, even when
    interrupted.
    
    The C code for the test case was loosely adapted from the reproducer
    provided in Zhiyong's bug report for this problem.  The .exp file was
    copied from next-fork-other-thread.exp and then tweaked slightly.  As
    noted in a comment in next-fork-exec-other-thread.exp, I had to remove
    "on" from the loop for non-stop as it was failing on all architectures
    (including x86-64) that I tested.  I have a feeling that it ought to
    work, but this can be investigated separately and (re)enabled once it
    works.  I also increased the number of iterations for the loop running
    the "next" commands.  I've had some test runs which don't show the bug
    until the loop counter exceeded 100 iterations.  The C file for the
    new test uses shorter delays than next-fork-other-thread.c though, so
    it doesn't take overly long (IMO) to run this new test.
    
    Running the new test on a Raspberry Pi w/ a 32-bit (Arm) kernel and
    userland using a gdbserver build without the fix in this commit shows
    the following results:
    
    FAIL: gdb.threads/next-fork-exec-other-thread.exp: fork_func=fork: target-non-stop=auto: non-stop=off: displaced-stepping=auto: i=12: next to other line
    FAIL: gdb.threads/next-fork-exec-other-thread.exp: fork_func=fork: target-non-stop=auto: non-stop=off: displaced-stepping=on: i=9: next to other line
    FAIL: gdb.threads/next-fork-exec-other-thread.exp: fork_func=fork: target-non-stop=auto: non-stop=off: displaced-stepping=off: i=18: next to other line
    FAIL: gdb.threads/next-fork-exec-other-thread.exp: fork_func=fork: target-non-stop=off: non-stop=off: displaced-stepping=auto: i=3: next to other line
    FAIL: gdb.threads/next-fork-exec-other-thread.exp: fork_func=fork: target-non-stop=off: non-stop=off: displaced-stepping=on: i=11: next to other line
    FAIL: gdb.threads/next-fork-exec-other-thread.exp: fork_func=fork: target-non-stop=off: non-stop=off: displaced-stepping=off: i=1: next to other line
    FAIL: gdb.threads/next-fork-exec-other-thread.exp: fork_func=vfork: target-non-stop=auto: non-stop=off: displaced-stepping=auto: i=1: next to break here
    FAIL: gdb.threads/next-fork-exec-other-thread.exp: fork_func=vfork: target-non-stop=auto: non-stop=off: displaced-stepping=on: i=3: next to break here
    FAIL: gdb.threads/next-fork-exec-other-thread.exp: fork_func=vfork: target-non-stop=auto: non-stop=off: displaced-stepping=off: i=1: next to break here
    FAIL: gdb.threads/next-fork-exec-other-thread.exp: fork_func=vfork: target-non-stop=on: non-stop=off: displaced-stepping=auto: i=47: next to other line
    FAIL: gdb.threads/next-fork-exec-other-thread.exp: fork_func=vfork: target-non-stop=on: non-stop=off: displaced-stepping=on: i=57: next to other line
    FAIL: gdb.threads/next-fork-exec-other-thread.exp: fork_func=vfork: target-non-stop=off: non-stop=off: displaced-stepping=auto: i=1: next to break here
    FAIL: gdb.threads/next-fork-exec-other-thread.exp: fork_func=vfork: target-non-stop=off: non-stop=off: displaced-stepping=on: i=10: next to break here
    FAIL: gdb.threads/next-fork-exec-other-thread.exp: fork_func=vfork: target-non-stop=off: non-stop=off: displaced-stepping=off: i=1: next to break here
    
                    === gdb Summary ===
    
     # of unexpected core files     12
     # of expected passes           3011
     # of unexpected failures       14
    
    Each of the 12 core files were caused by the failed assertion in
    maybe_hw_step in linux-low.c.  These correspond to 12 of the
    unexpected failures.
    
    When the tests are run using a gdbserver build which includes the fix
    in this commit, the results are significantly better, but not perfect:
    
    FAIL: gdb.threads/next-fork-exec-other-thread.exp: fork_func=vfork: target-non-stop=on: non-stop=off: displaced-stepping=auto: i=143: next to other line
    FAIL: gdb.threads/next-fork-exec-other-thread.exp: fork_func=vfork: target-non-stop=on: non-stop=off: displaced-stepping=on: i=25: next to other line
    
                    === gdb Summary ===
    
     # of expected passes           10178
     # of unexpected failures       2
    
    I think that the two remaining failures are due to some different
    problem.  They are also racy - I've seen runs with no failures or only
    one failure, but never more than two.  Also, those runs were conducted
    with the loop count in next-fork-exec-other-thread.exp set to 200.
    During his testing of this fix and the new test case, Luis Machado
    found that this test was taking a long time and asked about ways to
    speed it up.  I then conducted additional tests in which I gradually
    reduced the loop count, timing each one, also noting the number of
    failures.  With the loop count set to 30, I found that I could still
    reliably reproduce the failures that Zhiyong reported (in which, with
    the proper settings, core files are created).  But, with the loop
    count set to 30, the other failures noted above were much less likely
    to show up.  Anyone wishing to investigate those other failures should
    set the loop count back up to 200.
    
    Running the new test on x86-64 and aarch64, both native and
    native-gdbserver shows no failures.
    
    Also, I see no regressions when running the entire test suite for
    armv7l-unknown-linux-gnueabihf (i.e.  the Raspberry Pi w/ 32-bit
    kernel+userland) with --target_board=native-gdbserver.  Additionally,
    using --target_board=native-gdbserver, I also see no regressions for
    the entire test suite for x86-64 and aarch64 running Fedora 38.
    
    Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=30387
    Co-Authored-By: Zhiyong Yan <zhiyong.yan@windriver.com>
    Tested-By: Zhiyong Yan <zhiyong.yan@windriver.com>
    Tested-By: Luis Machado <luis.machado@arm.com>
Comment 26 Kevin Buettner 2023-08-12 04:02:25 UTC
It should be fixed now, so I'm closing this bug.