Bug 9747 - Quit and "(running)" bug
Status: ASSIGNED
Alias: None
Product: gdb
Classification: Unclassified
Component: gdb
Version: unknown
Importance: P1 critical
Target Milestone: 7.0
Assignee: Pedro Alves
Reported: 2009-01-15 16:00 UTC by Pierre Muller
Modified: 2014-03-22 00:09 UTC (History)

Attachments
remove_is_executing.diff (4.96 KB, patch)
2009-01-22 15:54 UTC, Pedro Alves

Description Pierre Muller 2009-01-15 16:00:28 UTC
I am having trouble with CVS HEAD gdb on Cygwin, related to the "(running)" state.
But I don't think this problem is Windows-specific...


  The easiest way to reproduce the problem is to run gdb on itself:
$ ./gdb ./gdb
GNU gdb (GDB) 6.8.50.20090115-cvs
Copyright (C) 2009 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "i686-pc-cygwin".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
Setting up the environment for debugging gdb.
During symbol reading, struct/union type gets multiply defined: struct type.
Breakpoint 1 at 0x40b8a3: file ../../purecvs/gdb/utils.c, line 972.
Breakpoint 2 at 0x419086: file ../../purecvs/gdb/cli/cli-cmds.c, line 199.
(top-gdb) start
Temporary breakpoint 3 at 0x40105c: file ../../purecvs/gdb/gdb.c, line 26.
Starting program: /usr/local/src/gdbcvs/build-bare/gdb/gdb.exe
[New Thread 3768.0xd98]
[New Thread 3768.0xb0]

Temporary breakpoint 3, main (argc=1, argv=0xf01f58)
    at ../../purecvs/gdb/gdb.c:26
26      {
(top-gdb) set height 1
(top-gdb) n
---Type <return> to continue, or q <return> to quit---q Quit
(top-gdb) set height 80
(top-gdb) inf thr
  2 Thread 3768.0xb0  (running)
* 1 Thread 3768.0xd98  (running)
(top-gdb) cont
Continuing.
Cannot execute this command while the selected thread is running.
(top-gdb)

  The problem is probably the set_running function introduced for non-stop mode:
quitting at the "---Type <return>" prompt bypasses the
 set_running(..,0)
call (I discovered that by adding a printout to each set_running call) and thus
leaves gdb believing that the threads are running, even though non-stop mode is not
even implemented yet on Cygwin native gdb!
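
To illustrate the mechanism described above, here is a minimal, self-contained C model. It is not GDB code: every name (model_set_running, model_pager_prompt, and so on) is hypothetical, and setjmp/longjmp merely stands in for whatever unwinding GDB performs on Quit. It only shows how a non-local exit between "mark running" and "mark stopped" leaves the flag stale.

```c
#include <setjmp.h>
#include <stdbool.h>

/* Hypothetical stand-ins; none of these are GDB's real definitions.  */
static bool thread_running;   /* what "info threads" would report */
static jmp_buf top_level;     /* where a Quit unwinds to */

static void model_set_running(bool running)
{
    thread_running = running;
}

/* Models the pager: answering "q" unwinds straight to the top level.  */
static void model_pager_prompt(bool user_quits)
{
    if (user_quits)
        longjmp(top_level, 1);
}

/* Models a "next"-style command: mark running, print, mark stopped.  */
static void model_step_command(bool user_quits)
{
    model_set_running(true);
    model_pager_prompt(user_quits);   /* may never return */
    model_set_running(false);         /* skipped when the user quits */
}

/* Runs one command from the top level and returns the flag the
   prompt is left with afterwards.  */
bool model_run_command(bool user_quits)
{
    thread_running = false;
    if (setjmp(top_level) == 0)
        model_step_command(user_quits);
    return thread_running;
}
```

In this model, model_run_command(false) leaves the flag cleared, while model_run_command(true) returns with the thread still marked as running, which mirrors the "(running)" output in the session above.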

  This needs a fix!



Pierre Muller
Pascal language support maintainer for GDB
Comment 1 Pedro Alves 2009-01-15 16:05:56 UTC
Working on it ...
Comment 2 cvs-commit@gcc.gnu.org 2009-01-18 17:42:36 UTC
Subject: Bug 9747

CVSROOT:	/cvs/src
Module name:	src
Changes by:	palves@sourceware.org	2009-01-18 17:42:17

Modified files:
	gdb            : ChangeLog gdbthread.h infcmd.c infrun.c 
	                 thread.c 

Log message:
	PR gdb/9747:
	* gdbthread.h (finish_thread_state, finish_thread_state_cleanup):
	Declare.
	* thread.c (finish_thread_state, finish_thread_state_cleanup): New.
	* infrun.c (wait_for_inferior, fetch_inferior_event): If an error
	is thrown while handling an event, finish the thread state.
	(normal_stop): Use finish_thread_state cleanup.
	* infcmd.c (run_command_1): If an error is thrown while starting
	the inferior, finish the thread state.

Patches:
http://sources.redhat.com/cgi-bin/cvsweb.cgi/src/gdb/ChangeLog.diff?cvsroot=src&r1=1.10127&r2=1.10128
http://sources.redhat.com/cgi-bin/cvsweb.cgi/src/gdb/gdbthread.h.diff?cvsroot=src&r1=1.44&r2=1.45
http://sources.redhat.com/cgi-bin/cvsweb.cgi/src/gdb/infcmd.c.diff?cvsroot=src&r1=1.228&r2=1.229
http://sources.redhat.com/cgi-bin/cvsweb.cgi/src/gdb/infrun.c.diff?cvsroot=src&r1=1.351&r2=1.352
http://sources.redhat.com/cgi-bin/cvsweb.cgi/src/gdb/thread.c.diff?cvsroot=src&r1=1.100&r2=1.101

Comment 3 Pedro Alves 2009-01-18 17:58:17 UTC
Fix checked in.
Comment 4 Luis Machado 2009-01-20 12:34:41 UTC
This is still broken for my situation.

Staticthreads.exp throws a "find_new_threads_callback: cannot get thread info:
generic error" error on ppc and after that we're stuck in the "thread running"
state.

The backtrace for when that error message is thrown follows:

#0  find_new_threads_callback (th_p=0xfffff85b4d0, data=0x0) at
/home/luis/src/gdb/HEAD/src/gdb/linux-thread-db.c:986
#1  0x00000400003f6d48 in .iterate_thread_list () from
/lib64/ppc970/libthread_db.so.1
#2  0x00000400003f6e44 in .td_ta_thr_iter () from /lib64/ppc970/libthread_db.so.1
#3  0x00000000100c834c in thread_db_find_new_threads () at
/home/luis/src/gdb/HEAD/src/gdb/linux-thread-db.c:1040
#4  0x00000000100c7224 in check_for_thread_db () at
/home/luis/src/gdb/HEAD/src/gdb/linux-thread-db.c:667
#5  0x00000000100cb450 in linux_child_post_startup_inferior (ptid={pid = 15214,
lwp = 0, tid = 0}) at /home/luis/src/gdb/HEAD/src/gdb/linux-nat.c:685
#6  0x00000000102df22c in inf_ptrace_create_inferior (ops=0x109ceca0,
exec_file=0x10a1ba40
"/home/luis/builds/gdb/HEAD-now/gdb/testsuite/gdb.threads/staticthreads",
    allargs=0x10a5d2e0 "", env=0x109ef010, from_tty=1) at
/home/luis/src/gdb/HEAD/src/gdb/inf-ptrace.c:164
#7  0x00000000100ccf38 in linux_nat_create_inferior (ops=0x109ceca0,
exec_file=0x10a1ba40
"/home/luis/builds/gdb/HEAD-now/gdb/testsuite/gdb.threads/staticthreads",
    allargs=0x10a5d2e0 "", env=0x109ef010, from_tty=1) at
/home/luis/src/gdb/HEAD/src/gdb/linux-nat.c:1378
#8  0x00000000101fde28 in find_default_create_inferior (ops=0x1092d468,
exec_file=0x10a1ba40
"/home/luis/builds/gdb/HEAD-now/gdb/testsuite/gdb.threads/staticthreads",
    allargs=0x10a5d2e0 "", env=0x109ef010, from_tty=1) at
/home/luis/src/gdb/HEAD/src/gdb/target.c:2171
#9  0x00000000101f88c8 in target_create_inferior (exec_file=0x10a1ba40
"/home/luis/builds/gdb/HEAD-now/gdb/testsuite/gdb.threads/staticthreads",
args=0x10a5d2e0 "",
    env=0x109ef010, from_tty=1) at /home/luis/src/gdb/HEAD/src/gdb/target.c:292
#10 0x00000000101aa9c4 in run_command_1 (args=0x0, from_tty=1, tbreak_at_main=1)
at /home/luis/src/gdb/HEAD/src/gdb/infcmd.c:546
#11 0x00000000101aabf8 in start_command (args=0x0, from_tty=1) at
/home/luis/src/gdb/HEAD/src/gdb/infcmd.c:600
#12 0x0000000010108acc in do_cfunc (c=0x109ee890, args=0x0, from_tty=1) at
/home/luis/src/gdb/HEAD/src/gdb/cli/cli-decode.c:67
#13 0x000000001010cc90 in cmd_func (cmd=0x109ee890, args=0x0, from_tty=1) at
/home/luis/src/gdb/HEAD/src/gdb/cli/cli-decode.c:1732
During symbol reading, DW_AT_type missing from DW_TAG_subrange_type.
#14 0x0000000010072d94 in execute_command (p=0x10931ed5 "", from_tty=1) at
/home/luis/src/gdb/HEAD/src/gdb/top.c:449
#15 0x00000000101d58e0 in command_handler (command=0x10931ed0 "") at
/home/luis/src/gdb/HEAD/src/gdb/event-top.c:514
#16 0x00000000101d62cc in command_line_handler (rl=0x109316d0 "") at
/home/luis/src/gdb/HEAD/src/gdb/event-top.c:739
#17 0x00000000106307a0 in rl_callback_read_char () at
/home/luis/src/gdb/HEAD/src/readline/callback.c:205
#18 0x00000000101d4808 in rl_callback_read_char_wrapper (client_data=0x0) at
/home/luis/src/gdb/HEAD/src/gdb/event-top.c:178
#19 0x00000000101d5664 in stdin_event_handler (error=0, client_data=0x0) at
/home/luis/src/gdb/HEAD/src/gdb/event-top.c:433
#20 0x00000000101d331c in handle_file_event (data={ptr = 0x1, integer = 0}) at
/home/luis/src/gdb/HEAD/src/gdb/event-loop.c:812
#21 0x00000000101d2340 in process_event () at
/home/luis/src/gdb/HEAD/src/gdb/event-loop.c:394
#22 0x00000000101d24c8 in gdb_do_one_event (data=0x0) at
/home/luis/src/gdb/HEAD/src/gdb/event-loop.c:459
#23 0x00000000101c9cfc in catch_errors (func=@0x108c9260: 0x101d2384
<gdb_do_one_event>, func_args=0x0, errstring=0x10778db0 "", mask=6)
    at /home/luis/src/gdb/HEAD/src/gdb/exceptions.c:516
#24 0x00000000101271cc in tui_command_loop (data=0x0) at
/home/luis/src/gdb/HEAD/src/gdb/tui/tui-interp.c:153
#25 0x00000000101ca7a8 in current_interp_command_loop () at
/home/luis/src/gdb/HEAD/src/gdb/interps.c:290
#26 0x000000001006616c in captured_command_loop (data=0x0) at
/home/luis/src/gdb/HEAD/src/gdb/main.c:99
#27 0x00000000101c9cfc in catch_errors (func=@0x108b41e8: 0x10066148
<captured_command_loop>, func_args=0x0, errstring=0x1075dc48 "", mask=6)
    at /home/luis/src/gdb/HEAD/src/gdb/exceptions.c:516
#28 0x0000000010067aa4 in captured_main (data=0xfffff85cb20) at
/home/luis/src/gdb/HEAD/src/gdb/main.c:837
#29 0x00000000101c9cfc in catch_errors (func=@0x108b4200: 0x100661d8
<captured_main>, func_args=0xfffff85cb20, errstring=0x1075dc48 "", mask=6)
    at /home/luis/src/gdb/HEAD/src/gdb/exceptions.c:516
#30 0x0000000010067b00 in gdb_main (args=0xfffff85cb20) at
/home/luis/src/gdb/HEAD/src/gdb/main.c:846
#31 0x0000000010066118 in main (argc=1, argv=0xfffff85cfd8) at
/home/luis/src/gdb/HEAD/src/gdb/gdb.c:33


Reopening..
Comment 5 Pedro Alves 2009-01-20 12:57:09 UTC
Thanks.  Hmm, I hadn't placed the cleanup around target_create_inferior
because run_command_1 doesn't know the inferior's ptid at that point.
It will be in inferior_ptid, though, so I think this particular case can be
fixed easily.

However, a thought has crossed my mind.  I'm considering removing the
is_executing property from the core, and pushing it down to the target.  Of the
non-stop and/or async aware targets, linux-nat.h/c already manages the
lwp->stopped flag; remote.c doesn't have anything of the sort yet though.  Other
targets could also default to false, I think.
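
A rough sketch of what "pushing it down to the target" could look like, purely as an illustration. The struct layouts, the thread_is_executing hook and all model_* names are assumptions made up for this example; they are not the actual GDB target vector or linux-nat structures. The point is only that the core keeps no copy of the flag and asks the target each time, with targets that track nothing defaulting to false.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical target-side record, loosely modelled on the
   lwp->stopped flag mentioned above.  */
struct model_lwp {
    int  id;
    bool stopped;
};

/* Hypothetical target vector; a target that tracks no state leaves
   the hook NULL.  */
struct model_target {
    bool (*thread_is_executing)(const struct model_target *self, int id);
    struct model_lwp *lwps;
    size_t n_lwps;
};

/* Linux-like target: derive "executing" from its own stopped flag.  */
bool model_linux_is_executing(const struct model_target *self, int id)
{
    for (size_t i = 0; i < self->n_lwps; i++)
        if (self->lwps[i].id == id)
            return !self->lwps[i].stopped;
    return false;   /* unknown thread: treat as not executing */
}

/* Core-side query: no cached copy, so nothing can get out of sync.  */
bool model_is_executing(const struct model_target *t, int id)
{
    if (t->thread_is_executing == NULL)
        return false;   /* default for targets without the hook */
    return t->thread_is_executing(t, id);
}
```

With this shape, flipping a thread's stopped flag on the target side is immediately visible through model_is_executing, at the cost of an indirect call per query.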

I'm thinking about the case where an exception is thrown from target_wait (this
is documented as invalid, but targets do it anyway).  In that case,
considering the linux target, we can end up with is_executing == true and
lwp->stopped == 1, which is out of sync.

We could also go the other way around, get rid of lwp->stopped, in
favour of is_executing.  That is, make all targets use the core facility.

This getting-stuck bug always happens when the inferior is told to resume, then
stops at some internal event, and then an exception is thrown that brings us
back to the CLI.  The core side still believes that the thread is
running.  It's a case of information being stored in several places, which can
get out of sync.

I need to:

 1) see if there's a better place to put an exception catcher that drops to the CLI
 2) see if I can do something to reduce information duplication.

The hurdle is async mode.  There is no good central place for #1 right now that
I can think of.  There aren't that many places that resume the target, though.
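
For item 1, the idea of a catcher at each place that resumes the target can be sketched as follows. This is a toy model under stated assumptions: setjmp/longjmp stands in for GDB's exception and cleanup machinery, and model_resume_and_wait, model_handle_event and core_thinks_running are hypothetical names, not functions from the tree.

```c
#include <setjmp.h>
#include <stdbool.h>

static bool core_thinks_running;   /* hypothetical core-side flag */
static jmp_buf error_target;       /* hypothetical unwind point */

/* Hypothetical error primitive: unwind to the nearest catcher.  */
static void model_error(void)
{
    longjmp(error_target, 1);
}

/* Models event handling that may fail after the thread has
   actually stopped.  */
static void model_handle_event(bool fail)
{
    if (fail)
        model_error();
}

/* Resume point with a catcher: returns true on success, false if an
   error would drop back to the CLI.  The flag is reset either way.  */
bool model_resume_and_wait(bool fail)
{
    core_thinks_running = true;
    if (setjmp(error_target) != 0) {
        core_thinks_running = false;   /* finish the state on error */
        return false;
    }
    model_handle_event(fail);
    core_thinks_running = false;       /* normal stop */
    return true;
}

bool model_core_thinks_running(void)
{
    return core_thinks_running;
}
```

The drawback noted above still applies: every resume point needs its own catcher, so this does not remove the duplication addressed by item 2.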
Comment 6 Pedro Alves 2009-01-22 15:54:19 UTC
Subject: Re:  Quit and "(running)" bug

Luis asked for this on IRC, so here it is.

This is a prototype patch for the case of removing is_executing.

It isn't complete yet, but it handles Luis' case (that's the infcmd.c change).
remote.c isn't taking care of setting the new "stopped" flag yet, and, there's
at least one reference to is_executing in linux-nat.c that will have
to be rewritten.

Comments surely welcome.

Comment 7 Pedro Alves 2009-01-22 15:54:21 UTC
Created attachment 3679 [details]
remove_is_executing.diff
Comment 8 Luis Machado 2009-01-26 13:05:37 UTC
This works for me. I can't reproduce the failure anymore, though I tried in
several different ways. It seems safer now.

Thanks Pedro.