This is the mail archive of the mailing list for the glibc project.

Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH] Avoid deadlock in malloc on backtrace

I am not convinced about the utility of backtrace in the heap
corruption cases, so I'm going to try and argue in favour of its
removal.  From the review, Carlos seems OK with the patch but not the
test case, so I'll work on the latter.

TL;DR; there are two distinct issues to think about and we should
tackle them separately:

1. How useful is the backtrace in a memory corruption case

2. How do we ensure that backtrace is robust when called from weird
   contexts, especially from within a signal handler.

On Thu, Feb 26, 2015 at 05:18:04PM -0500, Carlos O'Donell wrote:
> There have been various comments about this patch and I would like to
> list some supporting rationale for this patch:
> (1) Delaying the abort is bad for security.
> Problem: The library should abort() immediately from a security perspective.
> Solution: Don't delay, call abort() immediately.
> We all agree that delaying the abort is bad from a security perspective.
> However, the present solution is a trade-off between providing useful
> diagnostics and *then* aborting.
> I'm against removing the corruption detection messages because they
> are useful for *me*, and when abrt doesn't work or doesn't work
> correctly (see 3).

I don't think the argument is for removing corruption detection
messages.  It is for removing the backtrace and map info from it since
the former touches the heap and the latter is mostly useless without
the former.  The corruption detection message remains and IMO, it
gives sufficient information about what kind of corruption caused the
program exit.

Removing the backtrace removes the information about where in the
program the error occured, but that is mostly useless because the
point at which the corruption is detected is almost never the point
where the corruption happened.

> The best argument is that the default should be to call abort()
> immediately, and I don't disagree with that, but I still find the
> backtrace useful, and supportable, and would argue to keep it in
> place as a choice to developers and system integrators.
> At present if you want to log something during an abort() sequence
> you register a SIGABRT handler, longjmp to somewhere safe, try to
> log something, and then _exit(). You may object to this, but it's
> a valid sequence of events that we support today, and the only way to
> robustly support this without a custom malloc() is to make malloc()
> more robust against corruption *or* provide a way for the user to
> indicate that robust malloc is now desired (after the return from
> the longjmp). Either way the infrastructure added here is needed.
> In summary: Be conservative, keep the backtrace, consider making
> it non-default, followed by potential switch to prevent abuse of
> robust malloc.

I don't disagree with the idea of robustifying backtrace since that is
useful for other applications.  Howeber, to make it truly robust, a
backtrace call should never result in a malloc since it is often
called in a signal context.

We need to ensure that either the dynamic linker does not use malloc
at all in that context, or the dynamic linker does not come into the
picture at all when we're calling backtrace.  Robustifying malloc to
try and robustify backtrace seems like too tall an order.  We'll
continually run into different kinds of issues till we make malloc
completely async-cancel-safe.

> (3) Use a coredump handler.
> - Like the Fedora abrtd.
> Doesn't support 3rd party ISV applications that aren't packed
> with the OS.
> Fedora defaults /etc/abrt/abrt-action-save-package-data.conf to:
> ProcessUnpackaged = no
> Won't change any time soon and therefore means the in-process
> backtrace continues to be useful.

A coredump handler is just gravy on top of the actual feature,
i.e. dumping core, which has been there since before I was born.  A
core dump provides way more information than the backtrace can ever
hope to provide and the fact that we abort on exit, guarantees that we
get a core dump if it is enabled.  If it helps, we can add a message
telling the user to enable core dumps if they haven't already.

> (5) Dynamically link with _Unwind_Backtrace to avoid dlopen's malloc.
> - Link dynamically with in order to have immediate access
>   to the unwinder.
> Complicates the bootstrap process. Loading early also penalizes all
> applications when only those that want backtrace should have a need
> to load

A new function backtrace_init() could help break this deadlock.  Or
something similar that loads so that backtrace doesn't
result in dynamic linker invocation during the backtrace call.

> > +#include <stdlib.h>
> > +
> > +#define SIZE 4096
> > +
> > +/* Wrap free with a function to prevent gcc from optimizing it out.  */
> > +static void
> > +__attribute__((noinline))
> > +call_free (void *ptr)
> > +{
> > +  free (ptr);
> > +  *(size_t *)(ptr - 8) = 1;
> Nit: Document magic constant. Why did you choose it?
> Wouldn't a more compiler-safe test use memset to destroy
> the inter-block canaries? Knoing ptr1 and ptr2's location
> you could more easily zero out the space inbetween without
> fear of a SIGSEGV, knowing the pages would be mapped on the
> primary heap. This would corrupt the block and that corruption
> would be detected at free.
> This would not be easily detected or removed by the compiler?

OK, I'll see if I can change this.


Attachment: pgpgyYwRRf2mi.pgp
Description: PGP signature

Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]