This is the mail archive of the libc-alpha@sources.redhat.com mailing list for the glibc project.



Re: Fw: Questions about VDSO


> > There is no configuration that requires a vdso.
> > That's what DL_SYSINFO_DEFAULT is for.
> 
> I did a test patch a while ago that did sort-of that. Defined
> NEED_DL_SYSINFO and defined the other macros as doing nothing.

That is not the clean way to go about it.  That defines unnecessary struct
fields and so forth.

> Hrm... You mean the kernel shouldn't pass AT_SYSINFO?

Correct.  If there is no syscall entry point, then there is no meaningful
value to give in AT_SYSINFO. 
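
As a rough illustration of that contract from the consumer side, here is a
minimal sketch. It assumes getauxval() (glibc 2.16 and later), which returns
0 when the kernel passed no such auxv entry. The helper name
sysinfo_entry_or is hypothetical, and this is not how libc itself reads the
auxiliary vector at startup.

```c
#include <elf.h>
#include <sys/auxv.h>

/* Hypothetical helper: return the kernel-supplied AT_SYSINFO entry point,
   or `fallback` when the kernel passed no AT_SYSINFO entry at all. */
unsigned long sysinfo_entry_or(unsigned long fallback)
{
    /* getauxval returns 0 when the requested entry is absent. */
    unsigned long entry = getauxval(AT_SYSINFO);
    return entry != 0 ? entry : fallback;
}
```

A caller would pass the address of its default syscall stub as the fallback,
which is roughly the role DL_SYSINFO_DEFAULT plays.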

> I'm not sure anymore now, but I think I saw a few assumptions in Glibc

We can fix that.

> Yes. That is one of the main reasons. I have more: the vDSO provides a
> fully userland & lock-less implementation of gettimeofday for example. 

That doesn't mean much to libc internals until you propose a specific way
for libc to use that entry point.

But if you do already need trampoline unwind info, then that is something
we can support independent of future things, just by recognizing the vdso.
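
To make "recognizing the vdso" concrete, the sketch below locates the vDSO
image through AT_SYSINFO_EHDR and checks whether it carries a
PT_GNU_EH_FRAME program header, which is where an unwinder would start
looking for unwind info. The function name is hypothetical, and it assumes
getauxval() and the ElfW() macro from <link.h>. It only detects the
header and does not parse the unwind tables.

```c
#include <elf.h>
#include <link.h>
#include <string.h>
#include <sys/auxv.h>

/* Hypothetical helper.  Returns 1 if the vDSO has a PT_GNU_EH_FRAME
   segment, 0 if it has none, and -1 if no usable vDSO image was found. */
int vdso_has_eh_frame_hdr(void)
{
    unsigned long base = getauxval(AT_SYSINFO_EHDR);
    if (base == 0)
        return -1;                      /* kernel mapped no vDSO */

    const ElfW(Ehdr) *ehdr = (const ElfW(Ehdr) *) base;
    if (memcmp(ehdr->e_ident, ELFMAG, SELFMAG) != 0)
        return -1;                      /* not an ELF image */

    /* The program header table sits e_phoff bytes into the image. */
    const ElfW(Phdr) *phdr =
        (const ElfW(Phdr) *) ((const char *) ehdr + ehdr->e_phoff);
    for (unsigned int i = 0; i < ehdr->e_phnum; i++)
        if (phdr[i].p_type == PT_GNU_EH_FRAME)
            return 1;
    return 0;
}
```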

> The main issue I had last time I tried to push the subject of
> interfacing the PPC vDSO with glibc was the actual calling convention
> from userland to the vDSO. Because our ABI has OPD descriptors for
> function calls and those contain absolute addresses (at least on ppc64),
> that means that if we use normal linking against the vDSO (which
> actually works), the vDSO has to be mapped in userland at the same
> virtual address it was "linked" for. That excludes randomization. 

There is little enough relocation to be done, you can do it automagically
in the kernel if you want to.
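
For a sense of how little work that is, here is a minimal sketch of
rebasing function descriptors. It assumes the usual three-doubleword ppc64
descriptor layout (entry address, TOC base, environment pointer) and that
the vDSO was linked at address 0, so each address field is an offset. The
struct and function names are hypothetical, and this is not the kernel's
code.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed ppc64 function descriptor layout. */
struct opd {
    uint64_t entry;   /* address of the function's first instruction */
    uint64_t toc;     /* TOC base the function expects in r2 */
    uint64_t env;     /* environment pointer, unused by C code */
};

/* Rebase `count` descriptors of an image linked at 0 to `load_base`. */
void relocate_opds(struct opd *opd, size_t count, uint64_t load_base)
{
    for (size_t i = 0; i < count; i++) {
        opd[i].entry += load_base;
        opd[i].toc += load_base;
        /* env is not an address inside the image, so it stays as-is. */
    }
}
```

Done once per mapping before the pages are exposed to userland, this leaves
the image read-only from the process's point of view.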

> For some reason, we wanted to have it in low addresses, which means we
> also need a way (via a phdr maybe) to disable instantiating the vDSO
> for a given process or to map it elsewhere for the few things that need
> precise control over their virtual address space.

Things that need precise control should use PT_LOAD segments to reserve space.
Your vdso mapping should not clobber any such reservations.
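
The check implied here can be sketched as a plain interval test against the
executable's program headers. The function name is hypothetical, the types
assume a 64-bit ELF, and page rounding of segment bounds is ignored for
brevity.

```c
#include <elf.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper.  Returns 1 if [start, start + len) overlaps the
   address range of any PT_LOAD segment, and 0 otherwise. */
int range_clobbers_pt_load(const Elf64_Phdr *phdr, size_t phnum,
                           uint64_t start, uint64_t len)
{
    for (size_t i = 0; i < phnum; i++) {
        if (phdr[i].p_type != PT_LOAD)
            continue;                   /* only PT_LOAD reserves space */
        uint64_t seg_start = phdr[i].p_vaddr;
        uint64_t seg_end = seg_start + phdr[i].p_memsz;
        /* Two half-open ranges overlap when each starts before the
           other one ends. */
        if (start < seg_end && seg_start < start + len)
            return 1;
    }
    return 0;
}
```

A program that needs a region kept free would declare a PT_LOAD segment
covering it, and the vdso placement code would pick another address
whenever this test returns 1.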

> It's possible to hack one line in ld.so to have it do relocation. In
> this case, it will mprotect(), which is supported by my implementation,
> and will trigger COW.

It doesn't make sense to automagically map a page read-only if it's only
useful when modified.  If you want to do that, then provide it read/write
in the first place.  Even then, another syscall is required for relro, which is
desirable for safety.  It seems preferable to just have the kernel provide a
read-only vdso that works.

> Another option is to have a non-normal calling convention to the vDSO.
> This can be either the vDSO exporting "offsets" or  just 0-based OPDs
> and having glibc/ld.so use special branch trampolines to call into it.
> The disadvantage is that it will add overhead to calls that are
> optimisations for perf critical things in the first place (like
> gettimeofday, cache flush, memset/memcpy, etc...)


> Ideally, it would be nice if we could define some kind of relocation &
> appropriate symbol versioning (so glibc can still override the vDSO if
> necessary) so that ld.so directly links the app calls to those routines
> to the vDSO when it's available.

What exactly do you mean here?  The vDSO defines symbols and symbol
versions just like any other DSO.  

> ld.so itself needs a non-link way to call into it as well (symbol lookup
> + call via function pointers ?) for things like the cache flushing.

This doesn't really need to be something other than normal linking.

> So as you can see, there are plenty of issues to solve at this point.

Let's do what's clear first, and work incrementally.  If you now have a
vdso with useful unwind info for signal trampolines, let's concentrate on
making that work right.

