This is the mail archive of the gdb@sources.redhat.com mailing list for the GDB project.



Re: alloca is bad?


On Sun, Nov 12, 2000 at 07:16:58AM -0500, Eli Zaretskii wrote:
>> Date: Sun, 12 Nov 2000 08:06:27 +0000
>> From: Fernando Nasser <fnasser@cygnus.com>
>> 
>> The problem is that with a corrupted SP and FP you have no idea of where
>> it happened.  Doesn't matter if the crash was immediately after the fact,
>> all evidence of when it happened is wiped away.
>
>??? The core file will usually tell you the function in which it
>crashed, and sometimes the one which called it (if you are lucky).
>GDB doesn't need the stack information to tell you what function
>crashed, the value of $pc should be enough.  At least usually.
>
>Or am I missing something?

I've seen cases where the $pc was 0 and the frame pointer was screwed
up, making a back trace impossible.  That requires rerunning the program
and judicious use of display and step.  Whenever I have a stack problem,
I "display" the memory location that is being corrupted, rerun the
program, and "next" around until I see it fail.  When I see the failure,
I step into the offending function.

I've debugged many cases of stack corruption over my disgustingly long
career as a programmer.  I can't remember a specific case that troubled
me for very long.  I can remember at least two cases, however, where I
spent days trying to track down heap corruption problems.

Wasn't it mentioned on this thread that heap corruption was "easier" to
track down because there were tools for doing so?  Doesn't the fact
that there have to be special tools for debugging heap problems provide
enough evidence that heap corruption problems are not trivially
debugged?

>> On the other hand, if you can get your execution path to repeat in a 
>> deterministic way so heap pieces are allocated in the same locations,
>> you can use hardware watchpoints (as it is memory, not registers).
>
>This is usually almost impossible without having sources to malloc.
>Without the sources, you won't know where to put the watchpoints: all
>you have is the garbled pointer in the register which caused the
>crash, and the crash is usually not related to what your program did
>at the point of the crash, so your own variables don't help.

Right.  Also, the "if you can get your execution path to repeat in a
deterministic way" argument applies equally well to the stack.  If a
variable is getting clobbered, you find out where it has been stored
on the stack and monitor that location until you see it change.  You
may step past a function that causes the corruption, but then you rerun
the program and try again.

In the case of stack problems you don't have to wonder if a function
call made thousands of instructions earlier could have been the culprit
that is screwing you up now.  If you see stack corruption it will be in
one of the functions on the stack.

I know that it is likely that no one here has much experience with
threaded programming, but I would much rather debug stack corruption in
a threaded environment than I would heap corruption.  If you have a
situation where more than one thread can be doing a malloc, as is the
case with cygwin, then you've thrown the whole "deterministic" thing out
of the window as far as the heap is concerned.  In threaded
applications, you *try* to use the stack for local storage as much as
possible so that each thread can have its own relatively isolated data
storage.

Again, I have to point out that this whole "stack problems are wicked
hard to debug" argument is specious, anyway.  All of the problems that
you have with an alloca'ed array apply to local arrays and pointers to
local variables.  Fernando has implied that using addresses to auto
variables is poor programming.  IMO, that statement is provably
incorrect since many UNIX system functions require pointers in their
parameter lists.  I've never heard anyone claim that the UNIX API
was broken.

Also, as far as debuggability and maintainability are concerned, I have
a hard time believing that anyone could assert that this:

    struct cleanup *chain;
    char *foo;

    foo = malloc (size);
    chain = make_cleanup (free, (void *) foo);
    .
    .
    . many lines of code
    .
    .
    do_cleanups (chain);

was more apt to be error-free and understandable than this:

    char *foo;
    foo = alloca (size);

As an exercise, imagine what would happen if I had gone with my
first, incorrect, inclination and done this:

    chain = make_cleanup (free, (void *) &foo);

Nick Duffek has requested that the procedures for determining
programming guidelines in GDB be clarified.  The feeling that I get from
Andrew and Fernando is that this alloca discussion has raged on before
and that the issue had been decided some time ago.  I've checked the
mailing
list archives and I don't see any definitive statements about this
subject, however.  If there is a document that says "Don't Use alloca
and Here's Why" no one has pointed it out yet.

Since at least four GDB maintainers have expressed their desire to
use alloca, is that an adequate quorum?  What's the procedure here?  Are
we eventually going to come to a decision and document it?  Or, are we
going to keep wrangling until someone gets tired?

cgf
