This is the mail archive of the mailing list for the glibc project.

Re: [PATCHv3] Protect _dl_profile_fixup data-dependency order [BZ #23690]

On 10/18/18 3:40 PM, Adhemerval Zanella wrote:
> I disagree; each user option we support incurs extra maintenance
> cost, and in this case the possible combinations of current
> trampoline types and arch-specific code increase even more the burden
> of not only providing the option, but also ensuring correctness and testability.

I agree with you on this.

>> Thus enabling auditing should have as little impact on the underlying
>> application deployment as possible.
>> Forcing immediate binding for LD_AUDIT has an impact we cannot measure,
>> because we aren't the user with the application.
> I agree, but I constantly hear that lazy-binding might show performance
> advantages without much data to actually back this up. Do we have actual
> benchmarks and data that show it is still a relevant feature?

There are two issues at hand.

(1) Lazy-binding provides a hook for developer tooling.

(2) Lazy-binding speeds up application startup.

We have concrete evidence for (1): LD_AUDIT, latrace/ltrace, and
a bunch of other smaller developer tools.

There are even production systems using it, like Spindle:

Spindle has immediate examples of where all aspects of the dynamic loading
process are slowed down by large scientific workloads.
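
To make points (1) and (2) concrete, here is a toy model of lazy binding,
not glibc's implementation. All names (ToyLinker, audit_hook, call) are
hypothetical. Each slot starts as a stub that resolves the symbol on first
call and then patches itself, which is where a tool can hook in; with
bind_now=True everything is resolved up front instead. The hook is only
loosely analogous to the bind-time callback an auditor would receive.

```python
from typing import Callable, Dict, List, Optional


class ToyLinker:
    """Toy PLT: each slot holds either a resolver stub or the bound target."""

    def __init__(self,
                 symbols: Dict[str, Callable],
                 bind_now: bool = False,
                 audit_hook: Optional[Callable[[str], None]] = None) -> None:
        self.symbols = symbols
        self.audit_hook = audit_hook
        self.slots: Dict[str, Callable] = {}
        self.resolved: List[str] = []  # order in which symbols were bound
        for name in symbols:
            if bind_now:
                # Immediate binding: pay the full resolution cost at "startup".
                self._bind(name)
            else:
                # Lazy binding: install a stub and defer the work.
                self.slots[name] = self._make_stub(name)

    def _bind(self, name: str) -> Callable:
        # Bind-time hook: the point where tooling can observe the binding.
        if self.audit_hook is not None:
            self.audit_hook(name)
        self.resolved.append(name)
        self.slots[name] = self.symbols[name]  # patch the slot
        return self.slots[name]

    def _make_stub(self, name: str) -> Callable:
        def stub(*args):
            # First call: resolve, patch the slot, then forward the call.
            target = self._bind(name)
            return target(*args)
        return stub

    def call(self, name: str, *args):
        return self.slots[name](*args)
```

With the default lazy mode, only symbols that are actually called end up in
`resolved`; with bind_now=True all of them are resolved at construction,
whether or not they are ever called.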

However, we don't have any good microbenchmarks to show the difference
between lazy and non-lazy. I should write some so we can have a concrete
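
As a starting point for such a measurement, here is a rough sketch, not a
proper microbenchmark. It launches a command several times with LD_BIND_NOW
unset and then set, and reports the best wall-clock time for each. The
function names (time_startup, compare) are hypothetical. It assumes a Linux
system using glibc's dynamic loader, and it measures whole-process time, so
the loader's share may be lost in the noise for small programs. A binary
already linked with -z now is bound immediately in both runs.

```python
import os
import subprocess
import sys
import time
from typing import Dict, List


def time_startup(cmd: List[str], bind_now: bool, runs: int = 5) -> float:
    """Return the best wall-clock time in seconds over `runs` launches."""
    env = dict(os.environ)
    env.pop("LD_BIND_NOW", None)      # start from a known state
    if bind_now:
        env["LD_BIND_NOW"] = "1"      # any non-empty value forces immediate binding
    best = float("inf")
    for _ in range(runs):
        start = time.perf_counter()
        subprocess.run(cmd, env=env, check=True,
                       stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
        best = min(best, time.perf_counter() - start)
    return best


def compare(cmd: List[str], runs: int = 5) -> Dict[str, float]:
    """Time the same command with lazy and with immediate binding."""
    return {
        "lazy": time_startup(cmd, bind_now=False, runs=runs),
        "bind_now": time_startup(cmd, bind_now=True, runs=runs),
    }
```

For example, compare([sys.executable, "-c", "pass"]) times an empty Python
process both ways; a real study would want many more runs, a large
application with many shared objects, and CPU time rather than wall time.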

I see rented cloud environments as places where lazy-binding would help
reduce CPU usage costs.

I see distribution usage of BIND_NOW as a security measure that, while
important, is not always relevant to users running services inside their
own networks. Why pay the performance cost of security-relevant features
if you don't need them?

>> The point of these features is to allow for users to customize their choices
>> to meet their application needs. It is not a one-size-fits-all.
>>> More and more distributions are setting bind-now as the default build
>>> option, and auditing already implies some performance overhead (not to
>>> mention that the lazy-resolution performance gain might not hold in
>>> real-world cases).
>> Distribution choices are different from user application choices.
>> Sometimes we make unilateral choices, but only if it's a clear win.
>> The most recent case was AArch64 TLSDESC, where Arm decided that TLSDESC
>> would always be resolved non-lazily (Szabolcs will correct me if I'm wrong).
>> This was a case where the synchronization required to update the TLSDESC
>> was so costly on a per-function-call basis that it was clearly always a
>> win to force TLSDESC to always be immediately bound, and drop the required
>> synchronization (a cost you always had to pay).
>> Here the situation is less clear, and we have less data with which to make
>> the choice. Selection of lazy vs. non-lazy is still a choice we give users
>> and it is independent of auditing.
>> In summary:
>> - Selection of lazy vs non-lazy binding is presently an orthogonal user
>>   choice from auditing.
>> - Distribution choices are about general solutions that work best for a
>>   large number of users.
>> - Lastly, a one-size-fits-all solution doesn't work best for all users.
>> Unless there is a very strong and compelling reason to force non-lazy-binding
>> for LD_AUDIT, I would not recommend we do it. It's just a question of user
>> choice.
> My point is that since we have limited resources, especially for synchronization
> issues, which require an extra level of care, we should prioritize
> better and reevaluate some past decisions. Some decisions were made to handle a
> very specific issue in the past which might not be relevant for current use cases,
> where the trade-off of performance/usability/maintainability might have changed.

Agreed. I think we need some benchmarks here to have a real discussion.

> We already had some lazy-bind issues in the past (BZ#19129, BZ#18034, BZ#726),
> still have some (BZ#23296, BZ#23240, BZ#21349, BZ#20107), and might still have
> some not recorded in bugzilla for less widely used options (ld audit,
> ifunc, tlsdesc, etc.). These are just the ones I got from a very basic bugzilla
> search; we might have more.

I agree, it is complicated by the fact that multiple threads can resolve
symbols at the same time.

> This leads me to ask whether lazy-bind is still worth all the required internal
> complexity, and which real-world gains we are trying to obtain besides just the
> option itself. I do agree that giving users more choices is a good thing, but we
> need to balance usefulness, usability, and maintenance.

I don't disagree, *but* if we are going to get rid of lazy-binding, something
we have supported for a long time, it's going to have to be with good evidence
to show our users that it really doesn't matter anymore.

I hope that makes my position clearer.

In summary:

- If we are going to make a change to remove lazy-binding it has to be in an
  informed manner with results from benchmarking that allow us to give
  evidence to our users.

