On 06/18/2016 04:01 AM, Suprateeka R Hegde wrote:
On 18-Jun-2016 11:02 AM, Carlos O'Donell wrote:
On 06/18/2016 12:11 AM, Suprateeka R Hegde wrote:
All I am saying is, dlopen(3) with RTLD_GLOBAL also should bring in
foo at runtime to be compliant with POSIX.
I disagree. Nothing in POSIX says that needs to be done. The
key failure in your reasoning is that you have assumed lazy
symbol resolution must happen at the point of the first function
call.
ld(1) on a GNU/Linux machine says:
---
-z lazy
When generating an executable or shared library, mark it to tell the
dynamic linker to defer function call resolution to the point when
the function is called (lazy binding)
---
Note that those man pages are part of the Linux man-pages project and
are not canonical documentation for the glibc project. Often the man
page documentation goes too far in describing the implementation,
and beyond what is guaranteed. We can work with Michael Kerrisk to
get this changed quickly to read "defer function call resolution
to an implementation-defined point in the future, possibly as late
as the point when the function is called (lazy binding)."
This made me think that the GNU implementation also matches other
implementations -- that is, lazy resolution happens at the time of the
first call.
That is not an assumption that developers should be making.
You have read "shall be made available for relocation" and
then used implementation knowledge to decide that _today_ those
relocations have a happens-after relationship with dlopen in your
program. But because lazy symbol resolution is not an observable
event for a well-defined program,
Yes. I agree very much. But making some massive legacy enterprise
application "well-defined" now is beyond what toolchain writers
can do.
I agree that inevitably applications of a certain size end up having
dependencies on implementation details that in turn make them costly
to port to other operating systems.
I care a lot about our users, and I don't want to see implementations
constrained by standards text that might limit benefits to them in
the future. So any suggestions you have I'm going to weigh against
what I think a sensible user might expect, not a singular enterprise
application.
if the application uses say "execve" and decide if access control,
in a policy-less environment, needs to be disabled (execve disabled
unless the application needs it).
You argue that we should standardize on "bind now" which happens
immediately at startup, and "lazy binding" which always happens
at the time the function is called, ignoring any opportunistic
binding that might happen if the dynamic loader happens to prove
it knows what the binding result will be.
No, if anything, I think we should be less prescriptive about
lazy binding.
However, because POSIX says nothing about when the lazy symbol
resolution happens, or anything at all about it,
It indeed says something:
Only for dlopen...
---
RTLD_LAZY
Relocations shall be performed at an implementation-defined time,
ranging from the time of the dlopen() call until the first reference
to a given symbol occurs
---
... and it says nothing really, like it should, leaving the choice
up to the implementation. This text is specifically geared towards
shared objects loaded via dlopen, not the symbols in the binary, for
which the standard says nothing.
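
To make the dlopen side of this concrete, here is a minimal sketch (not
taken from glibc or from POSIX) of a helper that loads an object with
either RTLD_LAZY or RTLD_NOW. The name try_open is hypothetical. With
RTLD_NOW an unresolvable symbol is reported by dlopen() itself; with
RTLD_LAZY the standard only promises resolution somewhere between the
dlopen() call and the first reference, so the caller should not assume
when a failure would surface.

```c
#include <dlfcn.h>
#include <stdio.h>

/*
 * Hypothetical helper: try to load `path` with the given binding mode
 * (RTLD_LAZY or RTLD_NOW). A NULL path refers to the main program.
 * Returns 0 on success, -1 if dlopen() reports an error.
 */
int try_open(const char *path, int mode)
{
    void *handle = dlopen(path, mode | RTLD_LOCAL);
    if (handle == NULL) {
        /* dlerror() describes the most recent dl* failure. */
        fprintf(stderr, "dlopen failed: %s\n", dlerror());
        return -1;
    }
    /* Only probing loadability here, so release the handle again. */
    dlclose(handle);
    return 0;
}
```

A caller that needs all relocation errors up front would pass RTLD_NOW;
a caller passing RTLD_LAZY accepts whatever point the implementation picks.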
And then based on the ld(1) manpage, I thought GNU/Linux
implementation uses the time of first call.
It does, but it doesn't use symbols brought into the global scope
by dlopen for this resolution.
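
For a program that wants foo from a dlopen'ed object without depending
on when the loader resolves the binary's own lazy references, one option
is to look the symbol up explicitly. The following is only a sketch under
assumptions: call_by_name and the int(int) signature are made up for
illustration, and the caller is assumed to know the real type of the symbol.

```c
#include <dlfcn.h>
#include <stddef.h>

/* Assumed signature of the target function; purely illustrative. */
typedef int (*int_unary_fn)(int);

/*
 * Hypothetical helper: load `path` (NULL means the main program and its
 * global scope), look up `name` with dlsym(), call it with `arg` and
 * store the return value in *result.
 * Returns 0 on success, -1 if the object or the symbol is not found.
 */
int call_by_name(const char *path, const char *name, int arg, int *result)
{
    void *handle = dlopen(path, RTLD_LAZY | RTLD_GLOBAL);
    if (handle == NULL)
        return -1;

    dlerror();                       /* clear any stale error state */
    void *addr = dlsym(handle, name);
    if (dlerror() != NULL || addr == NULL) {
        dlclose(handle);
        return -1;
    }

    /* POSIX permits converting the dlsym() result to a function pointer. */
    int_unary_fn fn = (int_unary_fn)addr;
    *result = fn(arg);

    dlclose(handle);
    return 0;
}
```

The lookup here goes through dlsym(), so it does not matter whether the
loader would have used the dlopen'ed object when binding the binary's own
PLT entries.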
What is the harm if we go by the existing documentation and under the
option -z lazy or RTLD_LAZY, make lazy resolution happen at the point
of function call?
You forbid a mixed binding environment, you forbid opportunistic binding,
and force the binding to be truly as late as possible.
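
As a small illustration of the two modes a user can actually request
today: ld.so(8) describes LD_BIND_NOW as forcing resolution at startup
when it is set to a non-empty string. The sketch below only reports
whether that request was made through the environment; the function names
are hypothetical, and it says nothing about when a lazily bound symbol
ends up being resolved, which is the part left to the implementation.

```c
#include <stdlib.h>

/*
 * Hypothetical helper: returns 1 if `value` is a non-empty string,
 * which is the condition ld.so(8) gives for LD_BIND_NOW taking effect.
 */
int is_nonempty_setting(const char *value)
{
    return value != NULL && value[0] != '\0';
}

/*
 * Returns 1 if immediate binding was requested via the environment,
 * 0 otherwise. This does not inspect linker flags such as -z now.
 */
int bind_now_requested_by_env(void)
{
    return is_nonempty_setting(getenv("LD_BIND_NOW"));
}
```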