This is the mail archive of the
mailing list for the glibc project.
Re: [PATCHv3] Protect _dl_profile_fixup data-dependency order [BZ #23690]
On 18/10/2018 15:43, Carlos O'Donell wrote:
> On 10/18/18 2:21 PM, Adhemerval Zanella wrote:
>> On 18/10/2018 10:39, Carlos O'Donell wrote:
>>> On 10/18/18 3:24 AM, Florian Weimer wrote:
>>>> * Carlos O'Donell:
>>>>> The use of the guard+fence-to-fence sync is, from a C11 perspective,
>>>> I really don't think this is true:
>>>> | Two expression evaluations conflict if one of them modifies a memory
>>>> | location and the other one reads or modifies the same memory location.
>>>> (C11 184.108.40.206p4)
>>>> | The execution of a program contains a data race if it contains two
>>>> | conflicting actions in different threads, at least one of which is not
>>>> | atomic, and neither happens before the other. Any such data race
>>>> | results in undefined behavior.
>>>> (C11 51.2.4p25)
>>>> We still have unordered conflicting non-atomic writes after Tulio's
>>>> patch. I don't think they matter to us. But this is *not* correct for
>>> I agree completely. My point is that the change, the specific lines Tulio
>>> is touching, and the changes made, are correct, a fence-to-fence sync
>>> requires an atomic guard access. I agree it doesn't fix the actual problem
>>> of multiple threads doing the same updates to the reloc_result.
>>> glibc is *full* of data races, and that doesn't mean we will just give up
>>> on using C11 semantics until we can fix them all. Any changes we do, we
>>> should do them so they are correct.
>>> It really feels like we agree, but we're talking past eachother. Did my
>>> previous email clarify our positions and which one I choose and why?
>>> If I didn't understand your position correctly, please correct what I
>>> wrote so I can understand your suggestion.
>> Wouldn't just disable lazy-resolution for LD_AUDIT be a simpler solution?
> This is not the question I would ask myself in this case.
> Consider that auditing is independent of the manner in which the application
> is deployed by the user (built with or without lazy binding).
I disagree, each possible user option we support incurs in extra
maintainability and in this case the possible combination of current
trampoline types and arch-specific code increases even more the burden
of not only provide, but to ensure correctness and testability.
> Thus enabling auditing should have as little impact on the underlying
> application deployment as possible.
> Forcing immediate binding for LD_AUDIT has an impact we cannot measure,
> because we aren't the user with the application.
I agree, but I constantly I hear that lazy-binding might show performance
advantages without much data to actually to back this up. Do we have actual
benchmarks and data that show it still a relevant feature?
> The point of these features is to allow for users to customize their choices
> to meet their application needs. It is not a one-siz-fits-all.
>> More and more distributions are set bind-now as default build option and
>> audition already implies some performance overhead (not considering the
>> lazy-resolution performance gain might also not represent true in real
>> world cases).
> Distribution choices are different from user application choices.
> Sometimes we make unilateral choices, but only if it's a clear win.
> The most recent case was AArch64 TLSDESC, where Arm decided that TLSDESC
> would always be resolved non-lazily (Szabolcs will correct me if I'm wrong).
> This was a case where the synchronization required to update the TLSDESC
> was so costly on a per-function-call basis that it was clearly always a
> win to force TLSDESC to always be immediately bound, and drop the required
> synchronization (a cost you always had to pay).
> Here the situation is less clear, and we have less data with which to make
> the choice. Selection of lazy vs. non-lazy is still a choice we give users
> and it is independent of auditing.
> In summary:
> - Selection of lazy vs non-lazy binding is presently an orthogonal user
> choice from auditing.
> - Distribution choices are about general solutions that work best for a
> large number of users.
> - Lastly, a one-size-fits-all solution doesn't work best for all users.
> Unless there is a very strong and compelling reason to force non-lazy-binding
> for LD_AUDIT, I would not recommend we do it. It's just a question of user
My point is since we have limited resources, specially for synchronization
issues which required an extra level of carefulness; I see we should prioritize
better and revaluate some taken decisions. Some decisions were made to handle a
very specific issue in the past which might not be relevant for current usercases,
where the trade-off of performance/usability/maintainability might have changed.
We already had some lazy-bind issues in the past (BZ#19129, BZ#18034, BZ#726),
still have some (BZ#23296, BZ#23240, BZ#21349, BZ#20107), and might still contain
some not accounted for in bugzilla for not so widespread used options (ld audit,
ifunc, tlsdesc, etc.). These are just the one I got from a very basic bugzilla
search, we might have more.
This lead to ask me if lazy-bind still worth all the required internal complexity
and which real world gains we are trying to obtain besides just the option for
itself. I do agree that giving more user choices are a better thing, but we
need to balance usefulness, usability, and maintenance.
> I also think that the new reloc_result.init field can now be used to
> implement a lockless algorithm to update the relocs without data races,
> but it would be "part 2" of fixing P&C for LD_AUDIT.