Fwd: [PATCH v5 00/22] Some rtld-audit fixes
Adhemerval Zanella
adhemerval.zanella@linaro.org
Fri Nov 19 19:56:25 GMT 2021
On 19/11/2021 16:18, Florian Weimer via Libc-alpha wrote:
> * Jonathon Anderson:
>
>>>> Right now, we
>>>> only require the program headers which we can obtain from
>>>> getauxval(AT_PHDR), however this technique has questionable
>>>> portability and robustness (getauxval returns an unsigned long, not a
>>>> pointer).
>
>>> A glibc port to an architecture where a long value cannot hold all
>>> pointer values will have to provide an alternative interface similar to
>>> getauxval, but that returns pointer values.
>
>> I would go one step farther and say getauxval is already broken for
>> any 64-bit architecture, unsigned long is only required to support 32
>> bits as per the C standards. One of my greater fears is that some
>> exotic compiler will cleverly allocate only 4 bytes of stack space for
>> the return value, and we wouldn't know except for a subtle bug
>> (dependent on optimization flags!) that crashes our entire tool with
>> SEGVs in the auditor (where GDB doesn't give properly unwound call
>> stacks).
>
> If ported to such an architecture, glibc would need several changes to
> accommodate this. Newer architectures take this into account and do not
> do funny things. But Morello, as a capabilities-based architecture,
> does not have this luxury, so they have to do something about this
> interface. But that is (still) an outlier.
>
> I think the important point is that glibc interfaces do not need to be
> fully API-compatible with future architecture requirements. We can
> change APIs for future ports.
>
>>> Of course that's not the
>>> only interface with this problem (ElfW(Addr) is an integer as well).
>
>> AFAICT ElfW(Addr) is fine, it should always be an integer large enough
>> to store a pointer on the host architecture (i.e. a uintptr_t). Unless
>> I missed some specific arch where this doesn't work out to be the
>> case?
>
> Morello and other capabilities-based architectures. Pointers need to
> be pointers there. Weird architectures do not have uintptr_t.
>
>>> makes the Morello glibc port quite interesting. So I think *something*
>>> like getauxval (AT_PHDR) will always be available, with pretty much
>>> identical semantics.
>>>
>>>> From an outside perspective the current l_addr semantic is fairly
>>>> undocumented, the dladdr and dlinfo man pages define it vaguely as
>>>> the "difference between the address in the ELF file and the address
>>>> in memory." That sounds (to me at least) like l_addr should point to
>>>> byte 0 in the file (the ELF header), and that seems to be correct in
>>>> all but the non-PIE case.
>>> I have struggled with this in the past. I agree that it is confusing.
>>> l_addr is the offset between virtual addresses in the program header of
>>> the ELF object and the actual addresses in the process image. This
>>> offset happens to be 0 for ET_EXEC objects, and only there.
>
>> This is a much clearer description of the semantic, it would be very
>> helpful if the man pages used that sentence (or one like it) wherever the
>> l_addr value is exposed in the API (link_map->l_addr and
>> dl_phdr_info->dlpi_addr). It would also be very helpful if
>> Dl_info.dli_fbase was clearly documented as *not* l_addr but instead
>> byte 0/ELF header in the process image.
>
> I've made a note to update the manual pages.
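For reference, the l_addr/dlpi_addr semantics described above (a load bias added to the p_vaddr values in the program headers) can be sketched with dl_iterate_phdr. This is only an illustration: struct lookup, match_object and locate_address are hypothetical names, and since dl_iterate_phdr operates on the caller's namespace, it does not see objects loaded in other dlmopen namespaces.

```c
#define _GNU_SOURCE    /* for dl_iterate_phdr() */
#include <assert.h>
#include <link.h>      /* struct dl_phdr_info, ElfW(), PT_LOAD */
#include <stddef.h>
#include <stdint.h>

/* Hypothetical query/result record for the lookup below. */
struct lookup {
    uintptr_t addr;       /* in:  address to locate */
    uintptr_t bias;       /* out: dlpi_addr (load bias) of the object */
    uintptr_t seg_start;  /* out: runtime start of the matching PT_LOAD */
    int found;            /* out: 1 if some object contains addr */
};

/* Callback: runtime segment address = dlpi_addr + p_vaddr. */
static int match_object(struct dl_phdr_info *info, size_t size, void *data)
{
    struct lookup *q = data;
    (void) size;
    for (size_t i = 0; i < info->dlpi_phnum; i++) {
        const ElfW(Phdr) *ph = &info->dlpi_phdr[i];
        if (ph->p_type != PT_LOAD)
            continue;
        uintptr_t start = (uintptr_t) info->dlpi_addr + ph->p_vaddr;
        if (q->addr >= start && q->addr - start < ph->p_memsz) {
            q->bias = (uintptr_t) info->dlpi_addr;
            q->seg_start = start;
            q->found = 1;
            return 1;     /* non-zero return stops the iteration */
        }
    }
    return 0;
}

/* Find the loaded object (in the caller's namespace) containing addr. */
int locate_address(const void *addr, struct lookup *q)
{
    assert(q != NULL);
    q->addr = (uintptr_t) addr;
    q->bias = 0;
    q->seg_start = 0;
    q->found = 0;
    dl_iterate_phdr(match_object, q);
    return q->found;
}
```

For an ET_EXEC main program the reported bias would be 0, so seg_start is simply the p_vaddr from the file; for PIE and shared objects the bias is non-zero.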
>
>> That does sound like the "correct" way out, but dl_iterate_phdr
>> operates on the caller's namespace so one would need to inject a shim
>> library to do the actual call.
>
> Ugh, you are right. It means we currently can't unwind across dlmopen
> boundaries.
>
> So please use getauxval (AT_PHDR) for now. It is fully portable across
> all present glibc targets.
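A minimal sketch of that approach, assuming a target where unsigned long is at least as wide as a pointer (checked at compile time below, which also addresses the truncation worry raised earlier). The helper names main_program_phdrs and count_load_segments are hypothetical, not glibc interfaces.

```c
#include <assert.h>
#include <link.h>      /* ElfW(), PT_LOAD */
#include <stddef.h>
#include <stdint.h>
#include <sys/auxv.h>  /* getauxval(), AT_PHDR, AT_PHNUM */

/* Refuse to build where an unsigned long could truncate a pointer. */
_Static_assert(sizeof(unsigned long) >= sizeof(void *),
               "getauxval() result cannot hold a pointer on this target");

/* Return the main program's program headers and store their count in
   *count. Returns NULL (and a count of 0) if AT_PHDR is not present. */
const ElfW(Phdr) *main_program_phdrs(size_t *count)
{
    assert(count != NULL);
    unsigned long phdr = getauxval(AT_PHDR);
    unsigned long phnum = getauxval(AT_PHNUM);
    *count = (phdr != 0) ? (size_t) phnum : 0;
    /* Convert the integer back to a pointer via uintptr_t. */
    return (const ElfW(Phdr) *) (uintptr_t) phdr;
}

/* Count the PT_LOAD entries in a program header table. */
size_t count_load_segments(const ElfW(Phdr) *phdr, size_t count)
{
    size_t loads = 0;
    for (size_t i = 0; i < count; i++)
        if (phdr[i].p_type == PT_LOAD)
            loads++;
    return loads;
}
```

Note that AT_PHDR describes the main program only; the headers of other loaded objects still have to come from elsewhere.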
>
>>>> dladdr gets its value from link_map->l_map_start instead of l_addr,
>>>> so the semantic we want is already present in a private field. It
>>>> seems to me these two fields could be swapped with little issue, if
>>>> altering the public semantic is not acceptable we could also be sated
>>>> if l_map_start was made public.
>>> Applications which know about the current semantics of l_addr will
>>> break, though. l_addr is also exposed to debuggers via the _r_debug
>>> interface. I really do not think we can make changes to l_addr.
>>> We have a similar issue around l_name being "" for the main program, and
>>> unfortunately I will have to argue quite strongly against changing that.
>
>> Is adding new public fields completely off the table?
>
> To struct link_map? We could probably pull it off, but it would be
> years until such a change will be in the hands of the users. There is
> an internal structure that overlaps with the public struct link_map,
> and some applications poke at the private bits at fixed offsets.
>
> We've started not to strip ld.so downstream, so that these applications
> can switch to DWARF data to avoid dependencies on fixed offsets, but
> that has been a very recent change.
>
>> If I can humor the impossible for a few moments longer, I personally
>> have a difficult time believing that anyone actually uses
>> link_map->l_addr or link_map->l_name in a way that would break by
>> changing their semantics for the main executable:
>
>> - The documentation hasn't improved for years so there can't be many
>> users that care about (or even noticed) this case in particular.
>
> "" for the main executable is widely known. Usually code uses it to
> implement a fallback on argv[0] or /proc/self/exe, though.
There is still the issue that the audit interface does not have direct
access to argv[0] from the audited process, and '/proc' might also not
be accessible. I am still not convinced that providing argv[0] as the
l_name of the main executable is worse than "", especially because the
fallback might not work.
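
To make the concern concrete, here is a rough sketch of the kind of fallback code in question. It is an assumption about how callers typically handle the "" convention, not code taken from any particular project, and object_name is a hypothetical helper.

```c
#define _POSIX_C_SOURCE 200809L  /* for readlink() */
#include <assert.h>
#include <stddef.h>
#include <unistd.h>

/* Resolve a printable name for a loaded object. l_name is the value
   from struct link_map; argv0 is the program's argv[0] if the caller
   has it, or NULL (as in an auditor that never sees argv). */
const char *object_name(const char *l_name, const char *argv0,
                        char *buf, size_t buflen)
{
    assert(l_name != NULL && buf != NULL && buflen > 0);

    /* Any object other than the main program has a non-empty name. */
    if (l_name[0] != '\0')
        return l_name;

    /* Main program: try /proc first. This fails if /proc is not
       mounted or not accessible; long paths are silently truncated. */
    ssize_t n = readlink("/proc/self/exe", buf, buflen - 1);
    if (n > 0) {
        buf[n] = '\0';
        return buf;
    }

    /* Last resort: argv[0], which may be NULL or misleading. */
    return argv0;
}
```

When argv0 is NULL and /proc is unavailable, this returns NULL, which is exactly the case an auditor cannot recover from.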
>
> Changing l_addr will break the libgcc unwinder. It uses l_addr to
> relocate the program header (see the code I quoted previously). Not
> everyone uses the platform unwinder, and the libgcc unwinder is
> sometimes linked statically. This is different from the l_name change:
> The l_addr would definitely cause widespread breakage.
>
>> - Every use case I can think of for obtaining a link_map from the dl*
>> functions (dlinfo and dladdr1) will either already have the special
>> handling, or won't operate on the main executable, or likely won't opt
>> to use l_addr (vs. dlsym or dli_fbase) or l_name (vs. dli_fname).
>
> Some special-case the main executable based on l_name, I expect, which
> is why I'm so reluctant to change l_name. The GDB comment is actually
> hinting strongly towards a "" convention (that Solaris broke).
So I take it that Solaris does provide the application name in l_name? And
what kind of breakage did that cause in GDB?