This is the mail archive of the libc-help@sourceware.org mailing list for the glibc project.



Wishlist: declarations suitable for post mortem debugging


We have attributes like

/* This prints an "Assertion failed" message and aborts.  */
extern void __assert_fail (__const char *__assertion, __const char *__file,
                           unsigned int __line, __const char *__function)
     __THROW __attribute__ ((__noreturn__));

in assert.h and

extern void abort (void) __THROW __attribute__ ((__noreturn__));

in stdlib.h.  Unlike exit, these functions dump core as a regular part
of their execution.  The purpose is to enable post-mortem debugging.

The attribute __noreturn__ directly conflicts with that purpose since it
tells the compiler it may trash the stack when calling the function,
without retaining any useful information there.  In particular, those
functions may be _jumped_ to instead of called, or an existing call to
one of them in an unrelated part of the source may get recycled by
jumping to it.

As a result, backtraces from the core dump are quite unreliable.  I
have, on several occasions, spent days of futile debugging on backtraces
that did not correspond with reality.
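
To make the effect concrete, here is a minimal sketch.  The function
name and the two checks are made up for illustration; only abort() and
the GCC option -fno-crossjumping are real.  With optimization enabled,
GCC may merge the two abort() calls into a single call instruction, and
a backtrace from the resulting core can then no longer tell site A from
site B.  Whether that actually happens depends on compiler version and
flags; comparing the output of "gcc -O2 -S" with and without
-fno-crossjumping should show it.

```c
#include <stdlib.h>

/* Hypothetical example: two independent sanity checks, each with its
   own abort() call in the source.  Since abort() is declared
   __noreturn__, an optimizing compiler is free to emit one shared
   call and branch to it from both checks.  */
int checked_divide(int numerator, int denominator)
{
    if (denominator == 0)
        abort();            /* site A */
    if (numerator < 0)
        abort();            /* site B */
    return numerator / denominator;
}
```

Both failure paths are distinct lines in the source, but nothing obliges
the generated code to keep them distinct.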

So I would strongly suggest that functions that are _explicitly_
intended to dump core don't get marked as "__noreturn__".  This seems
like a rather straightforward way to stop the compiler from making those
core dumps much less useful than they should be.  While it might be
conceivable to invent a special __coredump__ flag to make sure that the
generated code around such a call (including local variables) fully and
uniquely corresponds with the available debug information, that seems
like a considerably more complex endeavor.
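
As a rough sketch of what a caller would see if such a function were
not marked __noreturn__, the following uses a project-local helper.  The
names dump_core_here, CHECK and scale are hypothetical and appear in
neither glibc nor Emacs, and the __noinline__ attribute is GCC/Clang
specific.  Because the declaration promises nothing about returning, the
compiler has to treat every failing check as an ordinary call with its
own return address.  In a real project the definition should probably
live in a separate translation unit, since a compiler that sees the body
may still work out for itself that it never returns.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper, deliberately NOT declared __noreturn__.  */
__attribute__((__noinline__))
void dump_core_here(const char *expr, const char *file, unsigned int line);

/* Hypothetical assertion macro built on the helper above.  */
#define CHECK(expr) \
    ((expr) ? (void) 0 : dump_core_here(#expr, __FILE__, __LINE__))

int scale(int value, int factor)
{
    CHECK(value >= 0);      /* first call site */
    CHECK(factor > 0);      /* second call site */
    return value * factor;
}

/* Reports the failed check and then dumps core via abort().  */
void dump_core_here(const char *expr, const char *file, unsigned int line)
{
    fprintf(stderr, "%s:%u: check failed: %s\n", file, line, expr);
    abort();
}
```

This is only a workaround on the caller's side; it does not change how
abort or __assert_fail themselves are declared.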

I did suggest removing __noreturn__ attributes on core dumping functions
some years ago, but was chased away in no uncertain terms since my
request was considered incompatible with the holy grail of optimization,
and talking about debugging should only be allowed on entirely
unoptimized code.  The kind of arguments and name-calling used for
putting a stop to my request were nothing I would have considered
compelling on technical grounds, so I am reraising that request in light
of assurances of a changed overall climate in glibc development.  I am
quoting a passage from the Emacs "DEBUG" file that resulted from the
non-acceptance of my proposal.  I may add that _several_ Emacs
developers were afflicted by this problem and wasted several days each
on different bugs, so it is not academic.

    ** When you are trying to analyze failed assertions, it will be
    essential to compile Emacs either completely without optimizations or
    at least (when using GCC) with the -fno-crossjumping option.  Failure
    to do so may make the compiler recycle the same abort call for all
    assertions in a given function, rendering the stack backtrace useless
    for identifying the specific failed assertion.

Of course, it is not just glibc that is concerned here since GCC itself
has built-in definitions for some of those functions.  But those are
intended to follow glibc in spirit, I should think, so this seems the
right place to start with this issue.

Thanks for caring

-- 
David Kastrup

