This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.



Re: Removing longjmp error handling from the dynamic loader


* Rich Felker:

> I don't have a strong opinion (and maybe not enough information to
> have any opinion) on whether you keep longjmp or go with something
> else, but I don't think there's any fundamental reason you need to
> change this to fix current bugs. In musl, longjmp is used partly by
> historical accident (I don't recall fully, but I think it was a matter
> of adding dlopen to code that was originally written just for initial
> dynamic linking at program entry), and doesn't have problems like the
> ones you describe.

Interesting.

In glibc, generic code makes many callouts into architecture-specific
routines.  Some of these routines throw exceptions, and it is not always
clear which ones do.

For example, if there is a temporary memory allocation which persists
across such a callout, do we have to install a local exception handler
to clean up that allocation in case the helper routine throws?
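
To make the concern concrete, here is a minimal sketch in C.  It is not
glibc code: current_handler, arch_helper and lookup_with_cleanup are
hypothetical names, and the setjmp/longjmp pair only stands in for the
loader's real error mechanism.  It shows the local handler that generic
code would need around a callout if the helper might throw.

```c
#include <setjmp.h>
#include <stdlib.h>

/* Hypothetical stand-in for the loader's innermost active handler. */
static jmp_buf *current_handler;

/* Hypothetical architecture-specific helper.  Nothing in its signature
   says that it may leave via longjmp instead of returning. */
static void arch_helper(int fail)
{
    if (fail)
        longjmp(*current_handler, 1);
}

/* Generic code holding a temporary allocation across the callout.
   Returns 0 on success, -1 if the helper threw. */
static int lookup_with_cleanup(int fail)
{
    char *tmp = malloc(64);
    if (tmp == NULL)
        return -1;

    jmp_buf local;
    jmp_buf *outer = current_handler;
    current_handler = &local;

    if (setjmp(local) != 0) {
        /* The helper threw.  Without this local handler the jump
           would land in an outer frame and tmp would leak. */
        current_handler = outer;
        free(tmp);
        return -1;
    }

    arch_helper(fail);

    current_handler = outer;
    free(tmp);
    return 0;
}
```

Calling lookup_with_cleanup(1) returns -1 with the buffer freed; the
cost is that every such call site has to know the helper can throw.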

If the error handling is expressed in the function signature (using that
exception pointer parameter), the behavior is much more explicit and we
can avoid these issues more easily.
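
As a rough illustration of the explicit style, the sketch below passes
an error record through the function signature.  The type struct
loader_error and the functions arch_relocate and relocate_object are
made up for this example and are not glibc's actual interface; the point
is only that control flow always returns to the caller, so cleanup needs
no handler.

```c
#include <stdlib.h>

/* Hypothetical error record, filled in by whoever detects the error. */
struct loader_error {
    const char *message;   /* NULL while no error has been recorded */
};

/* Hypothetical architecture-specific helper.  The err parameter makes
   it visible in the signature that this routine can fail. */
static int arch_relocate(int fail, struct loader_error *err)
{
    if (fail) {
        err->message = "relocation failed";
        return -1;
    }
    return 0;
}

/* Generic code: the helper always returns, so the temporary buffer is
   released on both the success and the failure path. */
static int relocate_object(int fail, struct loader_error *err)
{
    char *tmp = malloc(64);
    if (tmp == NULL) {
        err->message = "out of memory";
        return -1;
    }

    int rc = arch_relocate(fail, err);

    free(tmp);
    return rc;
}
```

A caller initializes the record to { NULL }, calls relocate_object, and
inspects the message only when the return value is -1.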

> For the init/fini "soft errors" problem, it sounds to me like the code
> that runs the ctors should just be outside of the scope of the
> _dl_catch_error. If you've started running ctors, you're past the
> point where the operation can be backed out in any meaningful sense.

That's certainly true.  This one should be rather easy to fix.  It also
affects only dlopen/dlclose, at which point we can assume that we have
our full exception handling implementation.
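
The shape of that fix might look like the following sketch, under the
assumption that errors are still delivered by longjmp.  All names here
(catch_point, signal_error, map_and_relocate, run_constructors,
open_object) are hypothetical and do not reflect the real
_dl_catch_error interface.  The handler is removed before constructors
run, so an error after the point of no return can no longer be caught
and backed out.

```c
#include <setjmp.h>
#include <stdlib.h>

static jmp_buf *catch_point;   /* hypothetical: active handler, or NULL */
static int ctors_run;          /* counts constructor runs, for the demo */

/* Report an error: recoverable only while a handler is installed. */
static void signal_error(void)
{
    if (catch_point != NULL)
        longjmp(*catch_point, 1);
    abort();                   /* no handler installed: hard error */
}

/* Phase 1 stand-in: mapping and relocation, which can be backed out. */
static void map_and_relocate(int fail)
{
    if (fail)
        signal_error();
}

/* Phase 2 stand-in: constructors, which cannot be undone. */
static void run_constructors(void)
{
    ctors_run++;
}

/* Returns 0 on success, -1 if phase 1 failed and was backed out. */
static int open_object(int fail)
{
    jmp_buf env;
    catch_point = &env;
    if (setjmp(env) != 0) {
        catch_point = NULL;
        return -1;             /* undo the mapping here */
    }

    map_and_relocate(fail);

    catch_point = NULL;        /* point of no return */
    run_constructors();        /* runs outside the catch scope */
    return 0;
}
```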

> I wonder if it's worse with ifunc, but I think not -- without
> bind_now, the ifunc resolvers don't even need to run before you pass
> the point of no return, and with bind_now, you'd be executing them in
> a context where resolver errors are still "hard" and can/should cause
> dlopen to fail.

The question is whether a failure from the run-time trampoline should
ever be a soft error (that can be caught by dlopen).  I think we use the
soft/hard distinction differently, but you seem to suggest that a lazy
binding error during a call to an IFUNC resolver should not cause
process termination.  I think the behavior is undefined, like any other
trampoline failure, so we should abort.  We really don't know what the IFUNC
resolver was supposed to be doing and which of its side effects
happened.  The situation really is unrecoverable.

If we want to give users more precise control over binding errors, I
don't think anything based on SJLJ-style exception handling is the
answer.  From a technical perspective, it should be feasible to throw
something that can actually be caught from C++ code, but given the
current pervasive climate against table-driven stack unwinding for error
reporting, I'm not sure this would be a good use of anyone's time.

Thanks,
Florian

