Bug 30547 - [gdb, s390x, ppc64] segfault in for_each_block
Status: RESOLVED FIXED
Alias: None
Product: gdb
Classification: Unclassified
Component: gdb
Version: 13.1
Importance: P2 normal
Target Milestone: 15.1
Assignee: Not yet assigned to anyone
Reported: 2023-06-13 13:16 UTC by Tom de Vries
Modified: 2023-11-28 09:54 UTC
Description Tom de Vries 2023-06-13 13:16:28 UTC
With a gdb 13.2 based package on s390x SLE-15, I run into:
...
(gdb) PASS: gdb.base/vfork-follow-parent.exp: resolution_method=schedule-multiple: print unblock_parent = 1
continue^M
Continuing.^M
Reading symbols from /home/abuild/rpmbuild/BUILD/gdb-13.2/build-s390x-suse-linux/gdb/testsuite.unix.-m64.-fno-PIE.-no-pie/outputs/gdb.base/vfork-follow-parent/vfork-follow-parent...^M
^M
^M
Fatal signal: Segmentation fault^M
----- Backtrace -----^M
0x2aa323927c1 gdb_internal_backtrace_1^M
        ../../gdb/bt-utils.c:122^M
0x2aa323927c1 _Z22gdb_internal_backtracev^M
        ../../gdb/bt-utils.c:168^M
0x2aa324cbb37 handle_fatal_signal^M
        ../../gdb/event-top.c:971^M
0x2aa324cbc13 handle_sigsegv^M
        ../../gdb/event-top.c:1044^M
0x3ffcc2fd74d ???^M
0x2aa32430ad0 for_each_block^M
        ../../gdb/dcache.c:199^M
0x2aa32430ad0 _Z17dcache_invalidateP13dcache_struct^M
        ../../gdb/dcache.c:251^M
0x2aa3256f019 _Z20fetch_inferior_eventv^M
        ../../gdb/infrun.c:4162^M
0x2aa32a23049 gdb_wait_for_event^M
        ../../gdbsupport/event-loop.cc:694^M
0x2aa32a239e1 _Z16gdb_do_one_eventi^M
        ../../gdbsupport/event-loop.cc:217^M
0x2aa325c0605 start_event_loop^M
        ../../gdb/main.c:411^M
0x2aa325c0605 captured_command_loop^M
        ../../gdb/main.c:471^M
0x2aa325c20b7 captured_main^M
        ../../gdb/main.c:1330^M
0x2aa325c20b7 _Z8gdb_mainP18captured_main_args^M
        ../../gdb/main.c:1345^M
0x2aa32293945 main^M
        ../../gdb/gdb.c:32^M
...
Comment 1 Tom de Vries 2023-10-31 08:02:13 UTC
Also ran into this with gdb-14-branch and centos-7 ppc64:
...
PASS: gdb.base/vfork-follow-parent.exp: exec_file=vfork-follow-parent-exit: target-non-stop=on: non-stop=off: resolution_method=schedule-multiple: print unblock_parent = 1
ERROR: GDB process no longer exists
UNRESOLVED: gdb.base/vfork-follow-parent.exp: exec_file=vfork-follow-parent-exit: target-non-stop=on: non-stop=off: resolution_method=schedule-multiple: continue to break_parent
  ...
PASS: gdb.base/vfork-follow-parent.exp: exec_file=vfork-follow-parent-exit: target-non-stop=off: non-stop=off: resolution_method=schedule-multiple: print unblock_parent = 1
ERROR: GDB process no longer exists
UNRESOLVED: gdb.base/vfork-follow-parent.exp: exec_file=vfork-follow-parent-exit: target-non-stop=off: non-stop=off: resolution_method=schedule-multiple: continue to break_parent
...

In more detail:
...
(gdb) PASS: gdb.base/vfork-follow-parent.exp: exec_file=vfork-follow-parent-exit: target-non-stop=on: non-stop=off: resolution_method=schedule-multiple: print unblock_parent = 1
continue
Continuing.
Reading symbols from /home/vries/gdb/build/gdb/testsuite/outputs/gdb.base/vfork-follow-parent/vfork-follow-parent-exit...


Fatal signal: Segmentation fault
----- Backtrace -----
0x1027d3e7 gdb_internal_backtrace_1
	/home/vries/gdb/src/gdb/bt-utils.c:122
0x1027d54f _Z22gdb_internal_backtracev
	/home/vries/gdb/src/gdb/bt-utils.c:168
0x1057643f handle_fatal_signal
	/home/vries/gdb/src/gdb/event-top.c:889
0x10576677 handle_sigsegv
	/home/vries/gdb/src/gdb/event-top.c:962
0x3fffad660477 ???
0x103f2144 for_each_block
	/home/vries/gdb/src/gdb/dcache.c:199
0x103f235b _Z17dcache_invalidateP13dcache_struct
	/home/vries/gdb/src/gdb/dcache.c:251
0x10bde8c7 _Z24target_dcache_invalidatev
	/home/vries/gdb/src/gdb/target-dcache.c:50
0x106a4f27 _Z20fetch_inferior_eventv
	/home/vries/gdb/src/gdb/infrun.c:4420
0x10670d63 _Z22inferior_event_handler19inferior_event_type
	/home/vries/gdb/src/gdb/inf-loop.c:42
0x1071a0c7 handle_target_event
	/home/vries/gdb/src/gdb/linux-nat.c:4243
0x1176159f handle_file_event
	/home/vries/gdb/src/gdbsupport/event-loop.cc:573
0x11761b77 gdb_wait_for_event
	/home/vries/gdb/src/gdbsupport/event-loop.cc:694
0x1176019f _Z16gdb_do_one_eventi
	/home/vries/gdb/src/gdbsupport/event-loop.cc:217
0x1078a3bf start_event_loop
	/home/vries/gdb/src/gdb/main.c:407
0x1078a647 captured_command_loop
	/home/vries/gdb/src/gdb/main.c:471
0x1078caff captured_main
	/home/vries/gdb/src/gdb/main.c:1324
0x1078cbf3 _Z8gdb_mainP18captured_main_args
	/home/vries/gdb/src/gdb/main.c:1343
0x10019b9f main
	/home/vries/gdb/src/gdb/gdb.c:39
---------------------
A fatal error internal to GDB has been detected, further
debugging is not possible.  GDB will now terminate.

This is a bug, please report it.  For instructions, see:
<https://www.gnu.org/software/gdb/bugs/>.

ERROR: GDB process no longer exists
GDB process exited with wait status 56538 exp20 0 0 CHILDKILLED SIGSEGV {segmentation violation}
UNRESOLVED: gdb.base/vfork-follow-parent.exp: exec_file=vfork-follow-parent-exit: target-non-stop=on: non-stop=off: resolution_method=schedule-multiple: continue to break_parent
...
Comment 2 Tom de Vries 2023-10-31 08:28:58 UTC
Reproduces (100% so far, but not always at the same point) for me with:
...
gdb -q -batch -x ./build/gdb/testsuite/outputs/gdb.base/vfork-follow-parent/gdb.in.2
...

Adding debug prints in target-dcache.c:
...
  fprintf (stderr, "set (%p): %p\n", current_program_space->aspace, dcache);
  ...
  fprintf (stderr, "get (%p): %p\n", current_program_space->aspace, res);
...
gives us:
...
$ gdb -q -batch -x ./build/gdb/testsuite/outputs/gdb.base/vfork-follow-parent/gdb.in.2 
No breakpoints or watchpoints.
Breakpoint 1 at 0x100007e4: file /home/vries/gdb/src/gdb/testsuite/gdb.base/vfork-follow-parent.c, line 34.
get (0x1000ae22f50): (nil)
get (0x1000ae22f50): (nil)
get (0x1000ae22f50): (nil)
get (0x1000ae22f50): (nil)
get (0x1000ae22f50): (nil)
get (0x1000ae22f50): (nil)
get (0x1000ae22f50): (nil)
get (0x1000ae22f50): (nil)
get (0x1000ae22f50): (nil)
get (0x1000ae22f50): (nil)
get (0x1000ae22f50): (nil)
get (0x1000ae22f50): (nil)
get (0x1000ae22f50): (nil)
get (0x1000ae22f50): (nil)
get (0x1000ae22f50): (nil)
get (0x1000ae22f50): (nil)
get (0x1000ae22f50): (nil)
get (0x1000ae22f50): (nil)
get (0x1000ae22f50): (nil)
get (0x1000ae22f50): (nil)
get (0x1000ae22f50): (nil)
get (0x1000ae22f50): (nil)
get (0x1000ae22f50): (nil)
get (0x1000ae22f50): (nil)
get (0x1000ae22f50): (nil)
get (0x1000ae22f50): (nil)
get (0x1000ae22f50): (nil)
get (0x1000ae22f50): (nil)
get (0x1000ae22f50): (nil)
get (0x1000ae22f50): (nil)
get (0x1000ae22f50): (nil)
get (0x1000ae22f50): (nil)

get (0x1000ae22f50): (nil)
set (0x1000ae22f50): 0x1000b15fbe0
get (0x1000ae22f50): 0x1000b15fbe0
get (0x1000ae22f50): 0x1000b15fbe0
get (0x1000ae22f50): 0x1000b15fbe0
Breakpoint 1, main (argc=1, argv=0x3ffffffff388) at /home/vries/gdb/src/gdb/testsuite/gdb.base/vfork-follow-parent.c:34
34	  alarm (30);
No breakpoints or watchpoints.
Breakpoint 2 at 0x100007ac: file /home/vries/gdb/src/gdb/testsuite/gdb.base/vfork-follow-parent.c, line 29.
get (0x1000ae22f50): 0x1000b15fbe0
get (0x1000ae22f50): 0x1000b15fbe0
get (0x1000ae22f50): 0x1000b15fbe0
get (0x1000ae22f50): 0x1000b15fbe0
get (0x1000ae22f50): 0x1000b15fbe0
get (0x1000ae22f50): 0x1000b15fbe0
get (0x1000ae22f50): 0x1000b15fbe0
get (0x1000ae22f50): 0x1000b15fbe0
get (0x1000ae22f50): 0x1000b15fbe0
get (0x1000ae22f50): 0x1000b15fbe0
get (0x1000ae22f50): 0x1000b15fbe0
get (0x1000ae22f50): 0x1000b15fbe0
Can not resume the parent process over vfork in the foreground while
holding the child stopped.  Try "set detach-on-fork" or "set schedule-multiple".
get (0x1000ae22f50): 0x1000b15fbe0
get (0x1000ae22f50): 0x1000b15fbe0
get (0x1000ae22f50): 0x1000b15fbe0
get (0x1000ae22f50): 0x1000b15fbe0
get (0x1000ae22f50): 0x1000b15fbe0
get (0x1000ae22f50): 0x1000b15fbe0
get (0x1000ae22f50): 0x1000b15fbe0
get (0x1000ae22f50): 0x1000b15fbe0
get (0x1000ae22f50): 0x1000b15fbe0
get (0x1000ae22f50): 0x1000b15fbe0
0x00003fffb7d4f778 in .__vfork () from /lib64/libc.so.6
Can not resume the parent process over vfork in the foreground while
holding the child stopped.  Try "set detach-on-fork" or "set schedule-multiple".
0x00003fffb7d4f778 in .__vfork () from /lib64/libc.so.6
[New inferior 2 (process 62858)]
get (0x1000ae22f50): 0x1000b15fbe0
get (0x1000ae22f50): 0x1000b15fbe0
get (0x1000ae22f50): 0x1000b15fbe0
get (0x1000ae22f50): 0x1000b15fbe0
get (0x1000ae22f50): 0x1000b15fbe0
get (0x1000ae22f50): 0x1000b15fbe0
get (0x1000ae22f50): 0x1000b15fbe0
get (0x1000ae22f50): 0x1000b15fbe0
get (0x1000ae22f50): 0x1000b15fbe0
get (0x1000ae22f50): 0x1000b15fbe0
get (0x1000ae22f50): 0x1000b15fbe0
get (0x1000ae22f50): 0x1000b15fbe0
get (0x1000ae22f50): 0x1000b15fbe0
get (0x1000ae22f50): 0x1000b15fbe0
[Inferior 2 (process 62858) exited normally]
[Switching to inferior 1 [process 62856] (/home/vries/gdb/build/gdb/testsuite/outputs/gdb.base/vfork-follow-parent/vfork-follow-parent-exit)]
[Switching to thread 1.1 (process 62856)]
#0  0x00003fffb7d4f778 in .__vfork () from /lib64/libc.so.6
get (0x1000ae22f50): 0x1000b15fbe0
get (0x1000ae22f50): 0x1000b15fbe0
$1 = 1
get (0x1000ae22f50): 0x1000b15fbe0
get (0x1000ae22f50): 0x1000b15fbe0
get (0x1000ae22f50): 0x1000b15fbe0
Reading symbols from /home/vries/gdb/build/gdb/testsuite/outputs/gdb.base/vfork-follow-parent/vfork-follow-parent-exit...
get (0x1000ae22f50): 0x785f77616b650000


Fatal signal: Segmentation fault
...
Comment 3 Tom de Vries 2023-10-31 15:53:18 UTC
I set a watchpoint:
...
(gdb) p reg_obj
$3 = (registry<address_space> *) 0x12ee2e80
(gdb) p *reg_obj
$4 = {m_fields = std::vector of length 1, capacity 1 = {0x0}}
(gdb) what *reg_obj
type = registry<address_space>
(gdb) p *(registry<address_space> *) 0x12ee2e80
$5 = {m_fields = std::vector of length 1, capacity 1 = {0x0}}
(gdb) watch *(registry<address_space> *) 0x12ee2e80
Watchpoint 2: *(registry<address_space> *) 0x12ee2e80
...
and ran into:
...
Watchpoint 2: *(registry<address_space> *) 0x12ee2e80

Old value = {m_fields = std::vector of length 1, capacity 1 = {0x13322180}}
New value = {m_fields = std::vector of length 39665389, capacity 39665389 = {<error reading variable>
0x00003fffb738b4f0 in .__libc_free () from /lib64/libc.so.6
(gdb) bt
#0  0x00003fffb738b4f0 in .__libc_free () from /lib64/libc.so.6
#1  0x000000001176e760 in operator delete (p=0x12ee2e80) at /home/vries/gdb/src/gdbsupport/new-op.cc:109
#2  0x00000000108fffa8 in program_space::~program_space (this=0x13312100, __in_chrg=<optimized out>)
    at /home/vries/gdb/src/gdb/progspace.c:125
#3  0x000000001068e44c in delete_inferior (inf=0x13327290) at /home/vries/gdb/src/gdb/inferior.c:290
#4  0x000000001068ef6c in prune_inferiors () at /home/vries/gdb/src/gdb/inferior.c:480
#5  0x00000000106a72d4 in fetch_inferior_event () at /home/vries/gdb/src/gdb/infrun.c:4558
#6  0x0000000010672994 in inferior_event_handler (event_type=INF_REG_EVENT) at /home/vries/gdb/src/gdb/inf-loop.c:42
#7  0x000000001071bef0 in handle_target_event (error=0, client_data=0x0) at /home/vries/gdb/src/gdb/linux-nat.c:4243
#8  0x0000000011764ec8 in handle_file_event (file_ptr=0x1311beb0, ready_mask=1)
    at /home/vries/gdb/src/gdbsupport/event-loop.cc:573
#9  0x00000000117654a0 in gdb_wait_for_event (block=0) at /home/vries/gdb/src/gdbsupport/event-loop.cc:694
#10 0x0000000011763ac8 in gdb_do_one_event (mstimeout=-1) at /home/vries/gdb/src/gdbsupport/event-loop.cc:217
#11 0x0000000010c5936c in wait_sync_command_done () at /home/vries/gdb/src/gdb/top.c:427
#12 0x0000000010c59470 in maybe_wait_sync_command_done (was_sync=0) at /home/vries/gdb/src/gdb/top.c:444
#13 0x0000000010c59c08 in execute_command (p=0x1329c830 "", from_tty=0) at /home/vries/gdb/src/gdb/top.c:577
#14 0x0000000010576c60 in command_handler (command=0x1329c828 "continue") at /home/vries/gdb/src/gdb/event-top.c:552
#15 0x0000000010c58f90 in read_command_file (stream=0x12ff05b0) at /home/vries/gdb/src/gdb/top.c:342
#16 0x0000000010323214 in script_from_file (stream=0x12ff05b0, 
    file=0x3ffffffff6a2 "./build/gdb/testsuite/outputs/gdb.base/vfork-follow-parent/gdb.in.2")
    at /home/vries/gdb/src/gdb/cli/cli-script.c:1642
#17 0x00000000102f99c8 in source_script_from_stream (stream=0x12ff05b0, 
    file=0x3ffffffff6a2 "./build/gdb/testsuite/outputs/gdb.base/vfork-follow-parent/gdb.in.2", 
    file_to_open=0x12f57e28 "./build/gdb/testsuite/outputs/gdb.base/vfork-follow-parent/gdb.in.2")
    at /home/vries/gdb/src/gdb/cli/cli-cmds.c:730
#18 0x00000000102f9b94 in source_script_with_search (
    file=0x3ffffffff6a2 "./build/gdb/testsuite/outputs/gdb.base/vfork-follow-parent/gdb.in.2", from_tty=0, 
    search_path=0) at /home/vries/gdb/src/gdb/cli/cli-cmds.c:775
#19 0x00000000102f9c7c in source_script (
    file=0x3ffffffff6a2 "./build/gdb/testsuite/outputs/gdb.base/vfork-follow-parent/gdb.in.2", from_tty=0)
    at /home/vries/gdb/src/gdb/cli/cli-cmds.c:784
#20 0x000000001078c7c4 in catch_command_errors (command=@0x12867548: 0x102f9c44 <source_script(char const*, int)>, 
    arg=0x3ffffffff6a2 "./build/gdb/testsuite/outputs/gdb.base/vfork-follow-parent/gdb.in.2", from_tty=0, 
    do_bp_actions=false) at /home/vries/gdb/src/gdb/main.c:513
#21 0x000000001078cadc in execute_cmdargs (cmdarg_vec=0x3fffffffec18, file_type=CMDARG_FILE, 
    cmd_type=CMDARG_COMMAND, ret=0x3fffffffec48) at /home/vries/gdb/src/gdb/main.c:610
#22 0x000000001078e848 in captured_main_1 (context=0x3fffffffeee0) at /home/vries/gdb/src/gdb/main.c:1293
#23 0x000000001078eb1c in captured_main (data=0x3fffffffeee0) at /home/vries/gdb/src/gdb/main.c:1314
#24 0x000000001078ec14 in gdb_main (args=0x3fffffffeee0) at /home/vries/gdb/src/gdb/main.c:1343
#25 0x000000001001a180 in main (argc=8, argv=0x3ffffffff358) at /home/vries/gdb/src/gdb/gdb.c:39
(gdb) 
...

So, AFAIU we have program_space::~program_space:
...
  if (!gdbarch_has_shared_address_space (target_gdbarch ()))
    delete this->aspace;
...
which calls the address space destructor, which deletes:
...
  /* Per aspace data-pointers required by other GDB modules.  */
  registry<address_space> registry_fields;
...
which invalidates:
...
static const registry<address_space>::key<DCACHE, dcache_deleter>
  target_dcache_aspace_key;
...
Comment 4 Tom de Vries 2023-11-01 09:42:19 UTC
Hardcoding linux_is_uclinux to false makes the test-case pass for me.  The function seems to be giving inconsistent results.

The scenario is as follows:
- a program space with an address space is created
- a second program space is about to be created. maybe_new_address_space
  is called, and because linux_is_uclinux returns true, it reuses the
  existing address space instead of creating a new one
- a second program space with the same address space is created
- a program space is deleted. Because linux_is_uclinux now returns false,
  gdbarch_has_shared_address_space (current_inferior ()->arch ()) returns
  false, and the address space is deleted
- when gdb uses the address space of the remaining program space (which is
  now deleted), it runs into use after free issues.
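
The scenario above can be replayed with a toy model. This is only a sketch: toy_aspace, toy_pspace and replay_scenario are hypothetical names, not GDB code, and the "freed" flag stands in for the real delete so that the model itself never touches freed memory. The two boolean parameters play the role of the shared-address-space answer at creation time and at deletion time.

```cpp
// Hypothetical stand-ins for GDB's address_space and program_space.
struct toy_aspace
{
  bool freed = false;   // set instead of actually deleting the object
};

struct toy_pspace
{
  toy_aspace *aspace;
};

// Replay "create first, create second, delete second" and report whether
// the surviving program space is left pointing at a freed address space.
static bool replay_scenario (bool shared_at_create, bool shared_at_delete)
{
  toy_aspace pool[2];   // backing storage outlives the whole replay

  toy_pspace first { &pool[0] };

  // Analogue of maybe_new_address_space: reuse the existing address
  // space when the target is believed to share it, else take a new one.
  toy_pspace second { shared_at_create ? first.aspace : &pool[1] };

  // Analogue of the program space destructor for SECOND: the address
  // space is only "deleted" when it is believed not to be shared.
  if (!shared_at_delete)
    second.aspace->freed = true;

  return first.aspace->freed;
}
```

Only the combination "shared at creation, not shared at deletion" leaves the remaining program space with a dangling address space, which matches the inconsistent linux_is_uclinux results described above.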
Comment 5 Tom de Vries 2023-11-01 09:56:43 UTC
(In reply to Tom de Vries from comment #4)
Related reading: here ( https://sourceware.org/pipermail/gdb-patches/2023-October/202928.html ) it's suggested to use refcounting to determine whether an address space is shared.
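
As a rough illustration of the refcounting idea, here is a minimal sketch that uses std::shared_ptr as a stand-in for whatever reference-counting mechanism GDB would use. toy_address_space, toy_program_space and shared_aspace_survives are hypothetical names introduced for this example only.

```cpp
#include <memory>
#include <utility>

// Hypothetical stand-in for GDB's address_space.
struct toy_address_space
{
  int dcache_slot = 0;
};

// Hypothetical stand-in for GDB's program_space: it holds a counted
// reference, so it never has to ask whether the address space is shared.
struct toy_program_space
{
  explicit toy_program_space (std::shared_ptr<toy_address_space> as)
    : aspace (std::move (as))
  {}

  std::shared_ptr<toy_address_space> aspace;
};

// Create two program spaces sharing one address space, delete them one
// at a time, and check when the address space actually goes away.
static bool shared_aspace_survives ()
{
  auto first = std::make_unique<toy_program_space>
    (std::make_shared<toy_address_space> ());
  auto second = std::make_unique<toy_program_space> (first->aspace);

  std::weak_ptr<toy_address_space> probe = first->aspace;

  second.reset ();                            // delete one program space
  bool alive_after_one_delete = !probe.expired ();

  first.reset ();                             // delete the last owner
  bool gone_after_last_delete = probe.expired ();

  return alive_after_one_delete && gone_after_last_delete;
}
```

With counted references, the lifetime of the address space no longer depends on a predicate that has to give the same answer at creation and at deletion time.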
Comment 6 Tom de Vries 2023-11-02 10:49:07 UTC
Tentative patch:
...
diff --git a/gdb/infrun.c b/gdb/infrun.c
index 4730d29..95f6e7d 100644
--- a/gdb/infrun.c
+++ b/gdb/infrun.c
@@ -1105,13 +1105,21 @@ static void restart_threads (struct thread_info *event_thread,
             go ahead and create a new one for this exiting
             inferior.  */
 
+         struct address_space *aspace;
+         {
+           scoped_restore save_inferior_ptid
+             = make_scoped_restore (&inferior_ptid);
+           inferior_ptid = ptid_t (vfork_parent->pid);
+           aspace = maybe_new_address_space ();
+         }
+
          /* Switch to no-thread while running clone_program_space, so
             that clone_program_space doesn't want to read the
             selected frame of a dead process.  */
          scoped_restore_current_thread restore_thread;
          switch_to_no_thread ();
 
-         inf->pspace = new program_space (maybe_new_address_space ());
+         inf->pspace = new program_space (aspace);
          inf->aspace = inf->pspace->aspace;
          set_current_program_space (inf->pspace);
          inf->removable = true;
...

It switches to the vfork parent while calling maybe_new_address_space.

Otherwise, during maybe_new_address_space, ppc_linux_nat_target::auxv_parse calls ppc_linux_target_wordsize (tid), which returns 4 instead of 8 because tid == 0:
...
int
ppc_linux_target_wordsize (int tid)
{
  int wordsize = 4;

  /* Check for 64-bit inferior process.  This is the case when the host is
     64-bit, and in addition the top bit of the MSR register is set.  */
#ifdef __powerpc64__
  long msr;

  errno = 0;
  msr = (long) ptrace (PTRACE_PEEKUSER, tid, PT_MSR * 8, 0);
  if (errno == 0 && ppc64_64bit_inferior_p (msr))
    wordsize = 8;
#endif

  return wordsize;
}
...
and the wordsize == 4 causes the auxv vector to be parsed incorrectly.
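
To see why the wrong word size matters, here is a simplified, self-contained sketch. It is not GDB's parser: auxv_has and looks_like_uclinux are hypothetical helpers that scan a raw buffer of (type, value) pairs, and the sample vector is made up. The tag values 0 (AT_NULL) and 6 (AT_PAGESZ) are the usual ELF ones. The buffer is laid out as on a big-endian 64-bit target.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

constexpr std::uint64_t AT_NULL_TAG = 0;     // end-of-vector marker
constexpr std::uint64_t AT_PAGESZ_TAG = 6;   // system page size

// Append VALUE to BUF as a big-endian 64-bit word.
static void push_be64 (std::vector<unsigned char> &buf, std::uint64_t value)
{
  for (int shift = 56; shift >= 0; shift -= 8)
    buf.push_back (static_cast<unsigned char> (value >> shift));
}

// Read a big-endian unsigned integer that is WIDTH bytes wide.
static std::uint64_t read_be (const unsigned char *p, std::size_t width)
{
  std::uint64_t v = 0;
  for (std::size_t i = 0; i < width; ++i)
    v = (v << 8) | p[i];
  return v;
}

// Scan BUF as (type, value) pairs of WORDSIZE bytes each and report
// whether any pair has type TAG.
static bool auxv_has (const std::vector<unsigned char> &buf,
                      std::size_t wordsize, std::uint64_t tag)
{
  for (std::size_t off = 0; off + 2 * wordsize <= buf.size ();
       off += 2 * wordsize)
    if (read_be (&buf[off], wordsize) == tag)
      return true;
  return false;
}

// Same shape as the linux_is_uclinux check: AT_NULL present, AT_PAGESZ absent.
static bool looks_like_uclinux (const std::vector<unsigned char> &buf,
                                std::size_t wordsize)
{
  return (auxv_has (buf, wordsize, AT_NULL_TAG)
          && !auxv_has (buf, wordsize, AT_PAGESZ_TAG));
}

// A made-up two-entry vector: AT_PAGESZ = 65536, then AT_NULL.
static std::vector<unsigned char> make_sample_auxv ()
{
  std::vector<unsigned char> buf;
  push_be64 (buf, AT_PAGESZ_TAG);
  push_be64 (buf, 65536);
  push_be64 (buf, AT_NULL_TAG);
  push_be64 (buf, 0);
  return buf;
}
```

Parsed with a word size of 8, the sample contains AT_PAGESZ and the check is false. Parsed with a word size of 4, every "type" field is the zero upper half of a 64-bit word, so only AT_NULL is ever seen and the check becomes true.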

The tid is 0 because of the switch_to_no_thread in handle_vfork_child_exec_or_exit, but using the tid of the exited vfork child gets us errno == ESRCH as well.  So we use the tid of the vfork parent instead.

Note btw that ppc_linux_target_wordsize is very casual about errno != 0: a failed ptrace call is silently treated as a 32-bit inferior.  That could be improved.
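
One possible direction, sketched under assumptions: report a failed read separately from a successful 32-bit answer. wordsize_or_error and msr_reader are hypothetical names; the reader callback stands in for the PTRACE_PEEKUSER call so the sketch stays portable, and the "top bit set" test is written as a sign check on a 64-bit long.

```cpp
#include <cerrno>
#include <functional>
#include <optional>

// Hypothetical callback standing in for the ptrace (PTRACE_PEEKUSER, ...)
// read of the MSR.  Like ptrace, it reports failure through errno.
using msr_reader = std::function<long (int tid)>;

// Return 8 or 4 when the MSR could be read, and no value when the read
// failed (for instance with ESRCH for tid 0 or an exited thread).
static std::optional<int> wordsize_or_error (int tid,
                                             const msr_reader &read_msr)
{
  errno = 0;
  long msr = read_msr (tid);
  if (errno != 0)
    return std::nullopt;

  // Assumption: 64-bit mode is indicated by the top bit of the MSR,
  // which shows up as a negative value in a 64-bit signed long.
  return msr < 0 ? 8 : 4;
}
```

A caller could then decide explicitly what to do on failure, instead of falling through to the 32-bit default.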
Comment 8 Sourceware Commits 2023-11-28 09:31:35 UTC
The master branch has been updated by Tom de Vries <vries@sourceware.org>:

https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;h=f9582a22dba747ff0905f4c1a80d84f677eeb928

commit f9582a22dba747ff0905f4c1a80d84f677eeb928
Author: Tom de Vries <tdevries@suse.de>
Date:   Tue Nov 28 10:31:25 2023 +0100

    [gdb] Fix segfault in for_each_block, part 1
    
    When running test-case gdb.base/vfork-follow-parent.exp on powerpc64 (likewise
    on s390x), I run into:
    ...
    (gdb) PASS: gdb.base/vfork-follow-parent.exp: \
      exec_file=vfork-follow-parent-exit: target-non-stop=on: non-stop=off: \
      resolution_method=schedule-multiple: print unblock_parent = 1
    continue^M
    Continuing.^M
    Reading symbols from vfork-follow-parent-exit...^M
    ^M
    ^M
    Fatal signal: Segmentation fault^M
    ----- Backtrace -----^M
    0x1027d3e7 gdb_internal_backtrace_1^M
            src/gdb/bt-utils.c:122^M
    0x1027d54f _Z22gdb_internal_backtracev^M
            src/gdb/bt-utils.c:168^M
    0x1057643f handle_fatal_signal^M
            src/gdb/event-top.c:889^M
    0x10576677 handle_sigsegv^M
            src/gdb/event-top.c:962^M
    0x3fffa7610477 ???^M
    0x103f2144 for_each_block^M
            src/gdb/dcache.c:199^M
    0x103f235b _Z17dcache_invalidateP13dcache_struct^M
            src/gdb/dcache.c:251^M
    0x10bde8c7 _Z24target_dcache_invalidatev^M
            src/gdb/target-dcache.c:50^M
    ...
    or similar.
    
    The root cause for the segmentation fault is that linux_is_uclinux gives an
    incorrect result: it should always return false, given that we're running on a
    regular linux system, but instead it returns first true, then false.
    
    In more detail, the segmentation fault happens as follows:
    - a program space with an address space is created
    - a second program space is about to be created. maybe_new_address_space
      is called, and because linux_is_uclinux returns true, maybe_new_address_space
      returns false, and no new address space is created
    - a second program space with the same address space is created
    - a program space is deleted. Because linux_is_uclinux now returns false,
      gdbarch_has_shared_address_space (current_inferior ()->arch ()) returns
      false, and the address space is deleted
    - when gdb uses the address space of the remaining program space, we run into
      the segfault, because the address space is deleted.
    
    Hardcoding linux_is_uclinux to false makes the test-case pass.
    
    We leave addressing the root cause for the following commit in this series.
    
    For now, prevent the segmentation fault by making the address space a refcounted
    object.
    
    This was already suggested here [1]:
    ...
    A better solution might be to have the address spaces be reference counted
    ...
    
    Tested on top of trunk on x86_64-linux and ppc64le-linux.
    Tested on top of gdb-14-branch on ppc64-linux.
    
    Co-Authored-By: Simon Marchi <simon.marchi@polymtl.ca>
    
    PR gdb/30547
    Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=30547
    
    [1] https://sourceware.org/pipermail/gdb-patches/2023-October/202928.html
Comment 9 Sourceware Commits 2023-11-28 09:31:41 UTC
The master branch has been updated by Tom de Vries <vries@sourceware.org>:

https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;h=14414227bfac8ef1803715b3b642f8ba0ab6fff8

commit 14414227bfac8ef1803715b3b642f8ba0ab6fff8
Author: Tom de Vries <tdevries@suse.de>
Date:   Tue Nov 28 10:31:25 2023 +0100

    [gdb] Fix segfault in for_each_block, part 2
    
    The previous commit describes PR gdb/30547, a segfault when running test-case
    gdb.base/vfork-follow-parent.exp on powerpc64 (likewise on s390x).
    
    The root cause for the segmentation fault is that linux_is_uclinux gives an
    incorrect result: it returns true instead of false.
    
    So, why does linux_is_uclinux:
    ...
    int
    linux_is_uclinux (void)
    {
      CORE_ADDR dummy;
    
      return (target_auxv_search (AT_NULL, &dummy) > 0
              && target_auxv_search (AT_PAGESZ, &dummy) == 0);
    ...
    return true?
    
    This is because ppc_linux_target_wordsize returns 4 instead of 8, causing
    ppc_linux_nat_target::auxv_parse to misinterpret the auxv vector.
    
    So, why does ppc_linux_target_wordsize:
    ...
    int
    ppc_linux_target_wordsize (int tid)
    {
      int wordsize = 4;
    
      /* Check for 64-bit inferior process.  This is the case when the host is
         64-bit, and in addition the top bit of the MSR register is set.  */
      long msr;
    
      errno = 0;
      msr = (long) ptrace (PTRACE_PEEKUSER, tid, PT_MSR * 8, 0);
      if (errno == 0 && ppc64_64bit_inferior_p (msr))
        wordsize = 8;
    
      return wordsize;
    }
    ...
    return 4?
    
    Specifically, we get this result because tid == 0, so we get
    errno == ESRCH.
    
    The tid == 0 is caused by the switch_to_no_thread in
    handle_vfork_child_exec_or_exit:
    ...
              /* Switch to no-thread while running clone_program_space, so
                 that clone_program_space doesn't want to read the
                 selected frame of a dead process.  */
              scoped_restore_current_thread restore_thread;
              switch_to_no_thread ();
    
              inf->pspace = new program_space (maybe_new_address_space ());
    ...
    but moving the maybe_new_address_space call to before that gives us the
    same result.  The tid is no longer 0, but we still get ESRCH because the
    thread has exited.
    
    Fix this in handle_vfork_child_exec_or_exit by doing the
    maybe_new_address_space call in the context of the vfork parent.
    
    Tested on top of trunk on x86_64-linux and ppc64le-linux.
    Tested on top of gdb-14-branch on ppc64-linux.
    
    Co-Authored-By: Simon Marchi <simon.marchi@polymtl.ca>
    
    PR gdb/30547
    Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=30547
Comment 10 Tom de Vries 2023-11-28 09:54:05 UTC
Fixed.