Bug 10046 - internal-error: linux_nat_resume: Assertion `lp != NULL' failed.
Status: RESOLVED FIXED
Alias: None
Product: gdb
Classification: Unclassified
Component: threads
Version: 6.8
Importance: P2 normal
Target Milestone: 7.0
Assignee: Not yet assigned to anyone
Reported: 2009-04-08 18:49 UTC by GNUtoo
Modified: 2010-03-23 20:23 UTC

Description GNUtoo 2009-04-08 18:49:31 UTC
Hi,
I have the following problem when trying to debug wesnoth:
[New LWP 3306]
infrun: TARGET_WAITKIND_STOPPED
infrun: prepare_to_wait
infrun: infwait_normal_state
infrun: TARGET_WAITKIND_STOPPED
infrun: stop_pc = 0x4000e56c
infrun: context switch
infrun: Switching context from LWP 3305 to LWP 3306
infrun: BPSTAT_WHAT_CHECK_SHLIBS
infrun: no stepping, continue
infrun: resume (step=1, signal=0)
infrun: prepare_to_wait
infrun: infwait_normal_state
infrun: TARGET_WAITKIND_STOPPED
infrun: stop_pc = 0x40011eec
infrun: software single step trap for LWP 3306
infrun: no stepping, continue
infrun: resume (step=0, signal=0)
infrun: prepare_to_wait
[LWP 3306 exited]
[LWP 3305 exited]
infrun: infwait_normal_state
[New LWP 3297]
infrun: TARGET_WAITKIND_STOPPED
/home/embedded/oetmp_openmoko/work/armv4t-angstrom-linux-gnueabi/gdb-6.8-r3/gdb-6.8/gdb/linux-nat.c:1152:
internal-error: linux_nat_resume: Assertion `lp != NULL' failed.
A problem internal to GDB has been detected,
further debugging may prove unreliable.
Quit this debugging session? (y or n)

The strange thing is that I didn't play with gdb. I just ran wesnoth with:
gdb wesnoth
run -f -r 480x640
and waited until it crashed. I didn't add breakpoints etc., as was done in this
bug report:
http://sourceware.org/ml/gdb/2008-08/msg00163.html
Comment 1 GNUtoo 2009-04-08 18:54:17 UTC
Hmm, GNU gdb 6.7.1 works (wesnoth's tutorial runs fine under it) on my desktop computer.

Comment 2 Pedro Alves 2009-04-08 19:12:05 UTC
Subject: Re:  New: internal-error: linux_nat_resume: Assertion `lp != NULL' failed.

On Wednesday 08 April 2009 19:49:33, GNUtoo at no-log dot org wrote:
> hi,
> I've the following problem when trying to debug wesnoth:
> [New LWP 3306]

Looks like either wesnoth is using `clone' directly instead of
pthreads, or you've got a libthread_db issue.  If the latter,
fixing up your pthreads setup will fix the issue.  See the last
points in the FAQ here: http://sourceware.org/gdb/wiki/FAQ .

> infrun: TARGET_WAITKIND_STOPPED
> infrun: prepare_to_wait

This was the new_thread_event path in
handle_inferior_event immediately resuming the
inferior.  LWP 3306 had hit a breakpoint, ...

> infrun: infwait_normal_state
> infrun: TARGET_WAITKIND_STOPPED
> infrun: stop_pc = 0x4000e56c
> infrun: context switch
> infrun: Switching context from LWP 3305 to LWP 3306
> infrun: BPSTAT_WHAT_CHECK_SHLIBS

Then it hits the breakpoint again; this time, we report it.
We switched context to LWP 3306.  It was a shlib-event
breakpoint, an internal breakpoint.  It means LWP 3306 caused a
shared library load.  GDB sets a breakpoint at a magical place
to be notified of such events, so that's your breakpoint.

> infrun: no stepping, continue
> infrun: resume (step=1, signal=0)
> infrun: prepare_to_wait
> infrun: infwait_normal_state
> infrun: TARGET_WAITKIND_STOPPED
> infrun: stop_pc = 0x40011eec
> infrun: software single step trap for LWP 3306
> infrun: no stepping, continue
> infrun: resume (step=0, signal=0)
> infrun: prepare_to_wait
> [LWP 3306 exited]

Eventually, LWP 3306 exits.  

> [LWP 3305 exited]
> infrun: infwait_normal_state
> [New LWP 3297]
> infrun: TARGET_WAITKIND_STOPPED

Another LWP reports a breakpoint hit.  Again, we enter the new_thread_event path
in handle_inferior_event, which does this:

  if (ecs->new_thread_event)
    {
      (...)
      target_resume (RESUME_ALL, 0, TARGET_SIGNAL_0);
      prepare_to_wait (ecs);
      return;
    }

Remember that inferior_ptid is still pointing at LWP 3306, an LWP
that has exited already.  RESUME_ALL is minus_one_ptid.

> /home/embedded/oetmp_openmoko/work/armv4t-angstrom-linux-gnueabi/gdb-6.8-r3/gdb-6.8/gdb/linux-nat.c:1152:
> internal-error: linux_nat_resume: Assertion `lp != NULL' failed.

So, linux_nat_resume asserts, because it does:

static void
linux_nat_resume (ptid_t ptid, int step, enum target_signal signo)
{
  /* If PID is -1, it's the current inferior that should be
     handled specially.  */
  if (PIDGET (ptid) == -1)
    ptid = inferior_ptid;     /* <-- here, ptid is LWP 3306.  */

  lp = find_lwp_pid (ptid);
  gdb_assert (lp != NULL);    /* <-- right, LWP 3306 is gone by now...  */
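
To make the failure mode concrete, here is a minimal standalone model of that lookup. It is only a sketch: `struct lwp`, `find_lwp`, `lookup_for_resume`, `resume_model` and `RESUME_ALL_PID` are hypothetical stand-ins invented for illustration, not GDB's real types or functions. It assumes, as described above, that a "resume all" request is redirected to the current thread before the LWP list is searched.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for GDB's LWP bookkeeping; not the real type.  */
struct lwp { int pid; int alive; };

#define RESUME_ALL_PID (-1)   /* models minus_one_ptid (assumption) */

/* Models find_lwp_pid: return the live entry for PID, or NULL.  */
static struct lwp *
find_lwp (struct lwp *list, size_t n, int pid)
{
  for (size_t i = 0; i < n; i++)
    if (list[i].alive && list[i].pid == pid)
      return &list[i];
  return NULL;
}

/* Models the lookup at the top of linux_nat_resume: a request for
   "all threads" is redirected to the current thread first.  */
static struct lwp *
lookup_for_resume (struct lwp *list, size_t n, int pid, int current_pid)
{
  if (pid == RESUME_ALL_PID)
    pid = current_pid;
  return find_lwp (list, n, pid);
}

/* Same lookup, followed by the kind of check that fails in the report.  */
static void
resume_model (struct lwp *list, size_t n, int pid, int current_pid)
{
  struct lwp *lp = lookup_for_resume (list, n, pid, current_pid);
  assert (lp != NULL);   /* aborts if current_pid names an exited LWP */
  (void) lp;
}
```

With a list in which 3305 and 3306 have exited and 3297 is alive, `lookup_for_resume (list, 3, RESUME_ALL_PID, 3306)` finds nothing, so `resume_model` with the same arguments would trip its assertion, while passing 3297 as the current thread succeeds.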

> A problem internal to GDB has been detected,
> further debugging may prove unreliable.
> Quit this debugging session? (y or n)
> 
> the strange thing is that I didnt play with gdb...I just ran wesnoth with:
> gdb wesnoth
> run -f -r 480x640
> and I've waited until it crashes...I didn't add breakpoints etc...like in this
> bugreport:
> http://sourceware.org/ml/gdb/2008-08/msg00163.html

Right, GDB added them for you :-)

Probably the fix is to make new_thread_event context-switch to the
new thread before resuming.  It also beats me why new_thread_event
needs to resume the thread, thus making the inferior hit the same
breakpoint (or any other signal) twice.
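
The suggested fix can be sketched in the same spirit. This is not the GDB patch, only a toy model under stated assumptions: `struct debugger`, `find_lwp` and `resume_after_new_thread` are hypothetical names, `current_pid` plays the role of inferior_ptid, and the LWP that reported the event is assumed to already be in the list (as the "[New LWP 3297]" line suggests).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; none of these are GDB's real types.  */
struct lwp { int pid; int alive; };

struct debugger
{
  struct lwp *lwps;    /* known LWPs */
  size_t count;
  int current_pid;     /* models inferior_ptid */
};

/* Return the live entry for PID, or NULL if it has exited.  */
static struct lwp *
find_lwp (const struct debugger *d, int pid)
{
  for (size_t i = 0; i < d->count; i++)
    if (d->lwps[i].alive && d->lwps[i].pid == pid)
      return &d->lwps[i];
  return NULL;
}

/* Model of the new-thread path with the proposed fix: switch the
   current thread to the LWP that reported the event, then do the
   "resume all" lookup through the current thread.  */
static struct lwp *
resume_after_new_thread (struct debugger *d, int new_pid)
{
  d->current_pid = new_pid;                       /* context switch first */
  struct lwp *lp = find_lwp (d, d->current_pid);  /* lookup now hits a live LWP */
  assert (lp != NULL);
  return lp;
}
```

Starting from a state where `current_pid` is the exited LWP 3306, calling `resume_after_new_thread (&d, 3297)` leaves `current_pid` at 3297 and the lookup succeeds, because it no longer goes through the stale thread.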

Comment 3 Pedro Alves 2009-04-08 19:16:12 UTC
Subject: Re:  internal-error: linux_nat_resume: Assertion `lp != NULL' failed.

On Wednesday 08 April 2009 19:54:18, GNUtoo at no-log dot org wrote:

> ------- Additional Comments From GNUtoo at no-log dot org  2009-04-08 18:54 -------
> mmm...GNU gdb 6.7.1 works(pass wesnoth's tutorial fine) on my desktop computer...

From your paste on IRC:

$ gdb wesnoth
GNU gdb 6.7.1
Copyright (C) 2007 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "i686-pc-linux-gnu"...
(no debugging symbols found)
Using host libthread_db library "/lib/libthread_db.so.1".
(gdb) run -w 
Starting program: /usr/games/bin/wesnoth -w
(no debugging symbols found)
(no debugging symbols found)
(no debugging symbols found)
[Thread debugging using libthread_db enabled]
[New Thread 0xb74606e0 (LWP 27905)]
[New Thread 0xb745fb90 (LWP 27908)]
Battle for Wesnoth v1.6
Started on Wed Apr  8 20:52:21 2009

So on your i686 machine, thread debugging was enabled, while with
the native ARM GDB it wasn't.  If you fix that, you won't trip on
the GDB bug on ARM.

Comment 4 Min Chen 2010-03-23 07:15:49 UTC
I think you used a stripped "libpthread" library.

I ran into the same problem on a MIPS-based embedded system.
After switching to an unstripped "libpthread" library, the problem went away.

Comment 5 Pedro Alves 2010-03-23 20:23:06 UTC
On Tuesday 23 March 2010 07:15:50, chenmin83 at msn dot com wrote:

> I think you used a stripped "libpthread" library.

Yes, it must have been something like that, as discussed in comment #2.

In any case, the crash in question was fixed in 7.0, the way suggested
in comment #2:

2009-06-29  Pedro Alves  <pedro@codesourcery.com>

        * infrun.c (handle_inferior_event): Context switch to the new
        thread when resuming for a new_thread_event.