This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.
Re: Invalid program counters and unwinding
- From: Jeff Law <law at redhat dot com>
- To: Florian Weimer <fweimer at redhat dot com>, GCC <gcc at gcc dot gnu dot org>, GNU C Library <libc-alpha at sourceware dot org>, Binutils <binutils at sourceware dot org>, gnu-gabi at sourceware dot org
- Date: Wed, 27 Jun 2018 20:16:05 -0600
- References: <ae764484-5bd4-5e40-ed50-81209eb54360@redhat.com>
On 06/26/2018 03:26 AM, Florian Weimer wrote:
> I'm looking at ways to speed up _Unwind_Find_FDE when libgcc is running
> on top of glibc. I have something (at the design level, with some of
> the code written) which allows me to get a pointer to the
> PT_GNU_EH_FRAME segment in memory in a lock-free fashion (so it would
> also be async-signal safe).
>
> This part works also when the program counter used in the search is
> invalid and does not point to within a loaded object, even in the case
> of concurrent dlopen/dlclose.
>
> However, it's still necessary to read the PT_GNU_EH_FRAME data itself,
> and if the PC passed to _Unwind_Find_FDE is not a valid program counter
> found on the stack (within a caller, where unmapping it with dlclose
> would be invalid), it could happen that it is a random address in
> *another*, unrelated object, which then gets dlclose'd (which is valid).
>
> The current glibc-based implementation in libgcc calls dl_iterate_phdr,
> which acquires a lock blocking dlclose for the entire duration of the
> iteration. But I think this still doesn't support arbitrary, random PC
> values because in the worst case, the PC value looks valid, we find some
> unrelated FDE data with an associated personality routine, and end up
> calling that, with disastrous consequences.
>
> So it looks to me that the caller of _Unwind_Find_FDE needs to ensure
> that the PC is a valid element of the call stack. Is this a correct
> assumption?
>
> I have some ideas how to make reading the PT_GNU_EH_FRAME data safe, but
> the question is whether we actually need that.
>
> Previous discussions:
>
> https://gcc.gnu.org/ml/gcc/2013-05/msg00253.html
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71744
> https://sourceware.org/ml/libc-alpha/2016-07/msg00613.html
> (patch with a spread lock, still not async-signal-safe)
You might also want to look at RH BZ 1293594 which I think has pointers
back to an issue from 2008 :(
Jeff
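
For readers unfamiliar with the dl_iterate_phdr-based lookup mentioned in the quoted message, the following is a minimal sketch of the general idea, not the actual libgcc code. It walks the loaded objects, checks whether a PC falls inside one of an object's PT_LOAD segments, and if so reports the address of that object's PT_GNU_EH_FRAME segment. The names eh_lookup, eh_lookup_cb and find_eh_frame_hdr are hypothetical and exist only for this illustration. Note that the returned pointer is only known to be valid while the callback runs; once dl_iterate_phdr returns, a concurrent dlclose may unmap the object, which is the hazard discussed in the thread.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <link.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical query/result record; not part of libgcc or glibc. */
struct eh_lookup {
    uintptr_t pc;             /* in:  program counter to look up */
    const void *eh_frame_hdr; /* out: PT_GNU_EH_FRAME address, or NULL */
    int pc_in_object;         /* out: nonzero if a PT_LOAD covers pc */
};

/* Called by dl_iterate_phdr once per loaded object. */
static int eh_lookup_cb(struct dl_phdr_info *info, size_t size, void *data)
{
    struct eh_lookup *q = data;
    const ElfW(Phdr) *eh = NULL;
    int covers = 0;

    (void)size;
    assert(q != NULL);

    for (int i = 0; i < info->dlpi_phnum; i++) {
        const ElfW(Phdr) *ph = &info->dlpi_phdr[i];
        if (ph->p_type == PT_LOAD) {
            uintptr_t start = (uintptr_t)(info->dlpi_addr + ph->p_vaddr);
            if (q->pc >= start && q->pc - start < ph->p_memsz)
                covers = 1;
        } else if (ph->p_type == PT_GNU_EH_FRAME) {
            eh = ph;
        }
    }

    if (!covers)
        return 0; /* not this object; continue with the next one */

    q->pc_in_object = 1;
    if (eh != NULL)
        q->eh_frame_hdr =
            (const void *)(uintptr_t)(info->dlpi_addr + eh->p_vaddr);
    return 1; /* nonzero return stops the iteration */
}

/* Returns the PT_GNU_EH_FRAME address of the object containing pc, or
   NULL.  *pc_in_object (if non-NULL) says whether any object matched. */
static const void *find_eh_frame_hdr(const void *pc, int *pc_in_object)
{
    struct eh_lookup q = { (uintptr_t)pc, NULL, 0 };

    dl_iterate_phdr(eh_lookup_cb, &q);
    if (pc_in_object != NULL)
        *pc_in_object = q.pc_in_object;
    return q.eh_frame_hdr;
}
```

Calling find_eh_frame_hdr with the address of a function in the running program should report a match, while an address outside every loaded object should not. As the quoted message points out, a match only means the PC lies inside some mapped object; it does not show that the PC is a genuine return address on the call stack.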