Bug 17970

Summary:	cannot call any function or method with "print" or "call" after command "handle SIGSEGV noprint pass"
Product:	gdb	Reporter:	Claude
Component:	c++	Assignee:	Not yet assigned to anyone <unassigned>
Status:	RESOLVED DUPLICATE
Severity:	normal	CC:	arigo, Claude, dje, keiths, pedro
Priority:	P2
Version:	7.8
Target Milestone:	---
Host:		Target:
Build:		Last reconfirmed:
Attachments:	gdb session log

Description Claude 2015-02-13 10:30:06 UTC

Steps to reproduce

foo.cpp contains:

#include <iostream>

int main()
{
	const char foo[] = "foo";
	std::cout << "foo=" << foo << std::endl;
}


Then:

$ g++ -D_DEBUG -g -o foo foo.cpp
$ gdb --args ./foo
(gdb) b foo.cpp:6
(gdb) r
Starting program: /tmp/foo 

Breakpoint 1, main () at foo.cpp:6
6		std::cout << "foo=" << foo << std::endl;
(gdb) p strlen(foo)

Program terminated with signal SIGSEGV, Segmentation fault.
The program no longer exists.
The program being debugged exited while in a function called from GDB.
Evaluation of the expression containing the function
(strlen) will be abandoned.

This occurs for *any* function or method call.

$ gdb --version
GNU gdb (Ubuntu 7.8-1ubuntu4) 7.8.0.20141001-cvs


$ uname -a
Linux pegasus 3.16.0-30-generic #40-Ubuntu SMP Mon Jan 12 22:06:37 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux

Comment 1 Keith Seitz 2015-02-13 17:32:34 UTC

I cannot reproduce this on Fedora 21 with HEAD, the official FSF 7.8 release, or the official FSF 7.8.2 release.

So I installed an Ubuntu VM on my machine, running Ubuntu 14.04 LTS x86_64 (3.13-0-generic #57 Ubuntu SMP) with the default gdb install, GNU gdb (Ubuntu 7.7-0ubuntu-3.1) 7.7, and that also works.

Is there an updated package available to you? Since you've marked this bug "critical," I would recommend you download, build, and install an official FSF release. If that exhibits the problem, then at least we have a starting point.

Comment 2 Claude 2015-02-13 18:09:04 UTC

Thanks for the effort.

The package comes from the latest ubuntu release, 14.10 (14.04 is only the latest long term support release).

It's the second system on which I encounter this bug. Nothing exotic in their configuration.

I'm going to run some tests in a VM as well, such as progressively installing the same set of packages, to try to reproduce it. If that fails, I'll try with an official FSF release.

Comment 3 Keith Seitz 2015-02-13 18:59:49 UTC

(In reply to Claude from comment #2)
> The package comes from the latest ubuntu release, 14.10 (14.04 is only the
> latest long term support release).

I installed a 14.10 image in a virtual machine. The gdb that came with that (as a default -- I installed nothing other than the image from Ubuntu) is 7.8-1ubuntu4 7.8.0-20141001. That also did not crash on me:

$ g++ -D_DEBUG -g -o 17970 17970.cc
$ gdb -nx -q 17970
Reading symbols from 17970...done.
(gdb) start
Temporary breakpoint 1 at 0x4008ce: file 17970.cc, line 5.
Starting program: /home/ubuntu/17970

Temporary breakpoint 1, main () at 17970.cc:5
5     {
(gdb) n
6       char foo[] = "hello";
(gdb) n
8       std::cout << foo << std::endl;
(gdb) p strlen (foo)
$1 = 5
(gdb)

It must be something peculiar to your environment. A stack backtrace might offer some insight into where things are going wrong.

Comment 4 Claude 2015-02-13 19:15:18 UTC

I can print variables, and look at the backtrace.

(gdb) p foo
$1 = "foo"
(gdb) bt
#0  main () at foo.cpp:6


The problem only arises (but each time) when there is a function or method call in the gdb p or c command.

Yes, there must be something special with my environment, I'm investigating.

Is there any other diagnostic command I could launch in gdb?

Comment 5 Keith Seitz 2015-02-13 19:30:57 UTC

On 02/13/2015 11:15 AM, Claude at renegat dot net wrote:
> The problem only arises (but each time) when there is a function or method call
> in the gdb p or c command.

Sorry -- that was my bad. I was retyping the session. I did use "p 
strlen (foo)", and gdb returned "$1 = 5".

> Is there any other diagnostic command I could launch in gdb?

As long as I've worked on/with gdb, even I had to ask!

Try:
(gdb) set debug infrun 1
(gdb) set debug lin-lwp 1

Another co-worker asked to double-check your system security settings. 
He seemed to recall that Debian had "some SELinux-like security 
features" which could prevent function calls from happening. [I don't 
know anything about Ubuntu/Debian -- I use them only for testing.]

In any case, hopefully the debug statements will help shed some light on 
the situation.

Comment 6 Claude 2015-02-14 10:09:11 UTC

I found the cause. Still a bug, though. But I lowered the importance, and changed the subject.

It happened because my ~/.gdbinit file contained:

handle SIGSEGV noprint pass

And it also happens when the command is entered manually.

FYI:

handle SIGSEGV noprint stop => OK
handle SIGSEGV print pass   => OK
handle SIGSEGV print stop   => OK
handle SIGSEGV noprint pass => BUG

Comment 7 Armin Rigo 2015-06-12 08:30:57 UTC

I can confirm this.  I'm trying to debug a program which has its own segfault handler.  The following workaround works for me:

    handle SIGSEGV nostop pass    (i.e. without "noprint")
    set pagination off      (to prevent pauses at every page)
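
For anyone who wants to try the workaround without such a program at hand, here is a minimal sketch of an inferior that installs its own SIGSEGV handler. The names `install_segv_handler` and `raise_segv_once` are hypothetical, and the signal is sent with `raise` rather than by a real invalid memory access, so that returning from the handler is well defined.

```cpp
#include <signal.h>   // sigaction, sigemptyset, raise (POSIX)

// Counter touched only by the handler and read afterwards.
static volatile sig_atomic_t segv_count = 0;

// Handler kept async-signal-safe: it only bumps the counter.
static void on_segv(int) { segv_count = segv_count + 1; }

// Install the program's own SIGSEGV handler; returns true on success.
bool install_segv_handler() {
    struct sigaction sa = {};
    sa.sa_handler = on_segv;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    return sigaction(SIGSEGV, &sa, nullptr) == 0;
}

// Send SIGSEGV to this process and report how often the handler has run.
int raise_segv_once() {
    raise(SIGSEGV);
    return segv_count;
}
```

Calling `install_segv_handler()` and then `raise_segv_once()` from `main` gives a program whose SIGSEGV is expected and handled, which is the situation where passing the signal through without stopping is useful.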

Comment 8 Pedro Alves 2015-06-12 10:18:59 UTC

That's how it's documented, actually.

The manual says (though that can always be improved):

~~~
 stop
 GDB should stop your program when this signal happens.  This implies
 print keyword as well.

 noprint
 GDB should not mention the occurrence of the signal at all.  This
 implies the nostop keyword as well.
~~~

And you can see that in the "handle" command's output:

 (gdb) handle SIGSEGV
 Signal        Stop      Print   Pass to program Description
 SIGSEGV       Yes       Yes     Yes             Segmentation fault
               ^^^^
 (gdb) handle SIGSEGV noprint
 Signal        Stop      Print   Pass to program Description
 SIGSEGV       No        No      Yes             Segmentation fault
               ^^^^

In here we see that "stop" overrides the "noprint":

 (gdb) handle SIGSEGV noprint stop
 Signal        Stop      Print   Pass to program Description
 SIGSEGV       Yes       Yes     Yes             Segmentation fault
               ^^^^      ^^^^

So AFAIK, there's no way to end up with:

 Signal        Stop      Print   Pass to program Description
 SIGSEGV       Yes       No      Yes             Segmentation fault

I think the intention here is that having GDB silently stop for a signal
would lead to a good deal of head scratching.  More so even if the signal is set to pass.
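
The keyword implications above can be sketched as a small model. This is a toy illustration, not GDB's code: the type `SignalActions` and the function `apply_handle` are hypothetical, and it assumes keywords are applied left to right, which matches the "noprint stop" output shown above.

```cpp
#include <cassert>
#include <initializer_list>
#include <string>

// Toy model of one row of the "handle" table (hypothetical type).
struct SignalActions {
    bool stop = true;    // defaults as shown for SIGSEGV above
    bool print = true;
    bool pass = true;
};

// Apply keywords in order (assumed), using the implications quoted from
// the manual: "stop" implies "print", "noprint" implies "nostop".
inline SignalActions apply_handle(SignalActions a,
                                  std::initializer_list<std::string> keywords) {
    for (const std::string& k : keywords) {
        if (k == "stop")         { a.stop = true;  a.print = true; }
        else if (k == "nostop")  { a.stop = false; }
        else if (k == "print")   { a.print = true; }
        else if (k == "noprint") { a.print = false; a.stop = false; }
        else if (k == "pass")    { a.pass = true; }
        else if (k == "nopass")  { a.pass = false; }
        else assert(false && "keyword not covered by this toy model");
    }
    return a;
}
```

In this model, `apply_handle(SignalActions{}, {"noprint", "stop"})` ends with Stop and Print both set, and no sequence of keywords leaves Stop set with Print cleared, because the only keyword that sets Stop also sets Print.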

There's another way to get that behavior though, using "catch signal" along with "silent" instead:

 (gdb) catch signal SIGSEGV
 Catchpoint 1 (signal SIGSEGV)
 (gdb) commands 
 >silent
 >end
 (gdb) r
 Starting program: ...
 (gdb) 

(The last prompt appeared because of a SIGSEGV: the program stopped silently)

Though I wonder if you _really_ wanted that behaviour in the first place...

Comment 9 Armin Rigo 2015-06-12 19:37:37 UTC

Just in case it is not clear, this bug should be about:

"handle SIGSEGV nostop noprint pass" causes any call to any function to crash.

At any point, even after you got such a crash, you can solve the problem (and retry the same call) after typing another "handle SIGSEGV" command that includes "print".

Comment 10 Pedro Alves 2015-06-13 10:33:10 UTC

What do you mean by "causes any call to any function to crash." ?  If you _don't_ do the "handle SIGSEGV nostop", then I expect that the call crashes too, but with the difference that GDB stops before the inferior has a chance to handle the SIGSEGV.  I'm not seeing a bug there.

Comment 11 Pedro Alves 2015-06-13 11:00:49 UTC

If, when you _don't_ do any "handle" at all, the function call succeeds without a SIGSEGV, then please show "set debug infrun 1 + set debug lin-lwp 1" logs, in both cases of "handle" and no "handle" commands issued.  Nowadays GDB puts the dummy breakpoint on the stack, and given that stack memory is not supposed to have executable permissions, it results in a SIGSEGV that GDB internally translates to a SIGSEGV.  E.g., here is what I see on F20:

 (gdb) p malloc (0)
 ...
 LLW: waitpid 29483 received Segmentation fault (stopped)
 ...
 infrun: target_wait (-1, status) =
 infrun:   29483 [Thread 0x7ffff7fc2740 (LWP 29483)],
 infrun:   status->kind = stopped, signal = GDB_SIGNAL_SEGV
 infrun: Treating signal as SIGTRAP
 ...
 $1 = (void *) 0x602010
 (gdb) si
 ...
 LLR: PTRACE_SINGLESTEP process 29483, 0 (resume event thread)
 ...
 37          args[i] = 1; /* Init value.  */
 (gdb) 

That 0 in the PTRACE_SINGLESTEP line means that GDB suppressed the signal.

Comment 12 Pedro Alves 2015-06-13 11:04:34 UTC

I mean, it results in a SIGSEGV that GDB internally translates to a SIGTRAP.

Comment 13 Armin Rigo 2015-06-14 12:41:50 UTC

Created attachment 8363
gdb session log

Running ``gdb any-program``:

(gdb) start
Temporary breakpoint 1 at 0x400505: file test432.c, line 9.
Starting program: /home/arigo/c/a.out 

Temporary breakpoint 1, main () at test432.c:9
9           (void)f();
(gdb) p malloc(0)
$1 = (void *) 0x602010
(gdb) handle SIGSEGV noprint
Signal        Stop      Print   Pass to program Description
SIGSEGV       No        No      Yes             Segmentation fault
(gdb) p malloc(0)

Program terminated with signal SIGSEGV, Segmentation fault.
The program no longer exists.
The program being debugged exited while in a function called from GDB.
Evaluation of the expression containing the function
(malloc) will be abandoned.
(gdb) 

A complete session log of doing the same thing with two additional commands before "start", namely "set debug infrun 1" and "set debug lin-lwp 1", is attached.

Comment 14 Pedro Alves 2015-06-15 08:56:05 UTC

Thanks.

It's clear from the logs now:

 LLW: waitpid 19430 received Segmentation fault (stopped)
 LLW: PTRACE_CONT process 19430, Segmentation fault (preempt 'handle')
 LNW: waitpid(-1, ...) returned 0, No child processes

This has already been fixed in master (soon to be 7.10), with:

~~~
commit c9587f88230e9df836f17c195181aaf50c3a1117
Author: Antoine Tremblay <antoine.tremblay@ericsson.com>
Date:   Thu Feb 12 14:55:08 2015 -0500

    Fix non executable stack handling when calling functions in the inferior.
    
    When gdb creates a dummy frame to execute a function in the inferior,
    the process may generate a SIGSEGV, SIGTRAP or SIGILL because the stack
    is non executable. If the signal handler set in gdb has option print
    or stop enabled for these signals gdb handles this correctly.
    
    However, in the case of noprint and nostop the signal is short-circuited
    and the inferior process is sent the signal directly. This causes the
    inferior to crash because of gdb.

(...)
~~~
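
The pre-fix behaviour that the commit message describes can be sketched as a toy decision function. This is only a model of the text above, not GDB's implementation: `HandleRow`, `Route` and both function names are hypothetical, and requiring `pass` for the short-circuit is an assumption (every case reported in this bug has it set).

```cpp
// One row of the "handle" table (hypothetical type for this sketch).
struct HandleRow { bool stop; bool print; bool pass; };

enum class Route { ReportedToGdbCore, SentStraightToInferior };

// Pre-fix routing as described in the commit message: with noprint and
// nostop, the signal is short-circuited and delivered to the inferior.
// The extra `pass` condition is an assumption of this sketch.
inline Route route_signal_pre_fix(const HandleRow& row) {
    if (!row.stop && !row.print && row.pass)
        return Route::SentStraightToInferior;
    return Route::ReportedToGdbCore;
}

// The SIGSEGV caused by the dummy breakpoint on a non-executable stack is
// harmless only if GDB's core gets to see it and treat it as a SIGTRAP.
inline bool inferior_call_survives_pre_fix(const HandleRow& row) {
    return route_signal_pre_fix(row) == Route::ReportedToGdbCore;
}
```

Under this model, only the row with Stop and Print both cleared loses the inferior, which agrees with the combinations reported in comment 6 and with the "nostop pass" workaround in comment 7.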

*** This bug has been marked as a duplicate of bug 16812 ***

Comment 15 dje 2015-06-15 21:30:27 UTC

Filing for reference sake.
Re comment #8: There is some non-orthogonality to the options that has tripped me up from time to time. I'd have to research whether the current example is what I have tripped on, but IMNSHO gdb should be less "clever" and instead should give the user the expressiveness that is possible (IOW don't assume a particular combination of option choices would never be useful).  gdb can always give a warning if a combination might be confusing to newbies.

Comment 16 Pedro Alves 2015-06-16 08:58:47 UTC

I agree.