This is the mail archive of the
mailing list for the glibc project.
Re: [PATCH] Protect _dl_profile_fixup data-dependency order [BZ #23690]
- From: Carlos O'Donell <carlos at redhat dot com>
- To: John David Anglin <dave dot anglin at bell dot net>, Tulio Magno Quites Machado Filho <tuliom at linux dot ibm dot com>, libc-alpha at sourceware dot org, Adhemerval Zanella <adhemerval dot zanella at linaro dot org>
- Date: Thu, 11 Oct 2018 17:15:42 -0400
- Subject: Re: [PATCH] Protect _dl_profile_fixup data-dependency order [BZ #23690]
- References: <firstname.lastname@example.org> <email@example.com> <firstname.lastname@example.org> <email@example.com> <firstname.lastname@example.org>
On 9/20/18 9:34 AM, John David Anglin wrote:
> On 2018-09-19 10:00 PM, Carlos O'Donell wrote:
>> On 09/19/2018 09:59 PM, John David Anglin wrote:
>>> On 2018-09-19 8:16 PM, Carlos O'Donell wrote:
>>>>> DL_FIXUP_VALUE_TYPE *resultp = &reloc_result->addr;
>>>>> + DL_FIXUP_VALUE_TYPE value = atomic_load_acquire(resultp);
>>>> You are potentially requiring an atomic load of a structure whose size can be
>>>> an arbitrary size depending on machine and ABI design (64-bit fore 32-bit hppa,
>>>> and 128-bit for 64-bit ia64). These architectures might not have such wide
>>>> atomic operations, and ...
>>> We have implemented 64-bit atomic loads and stores for 32-bit hppa. They are not well tested but
>>> they might work. They use floating point loads and stores, and kernel helper. The code is pretty horrific :-(
>> We only need to use the fdesc->ip as the guard, so we don't really need the 64-bit
>> atomic, but other algorithms like the new pthread condvars can use them effectively
>> to accelerate and avoid 2 lws kernel helper calls and instead use 1 lws kernel helper
>> 64-bit atomic.
> Regarding using fdesc->ip as the guard, The gp is loaded both before and after the ip on hppa.
> For example, $$dyncall loads gp before the branch. This could be changed at the cost of one
> instruction. Stubs load gp after ip. I don't think this is easy to change.
This is fine. The point is that ip is the guard, and so all other elements of the fdesc (only
gp in this case) need to be written to before the guard is updated to indicate that the
relocation of the fdesc is complete.
This is now fixed with Tulios changes which just use an atomic_thread_acquire and
atomic_thread_release in the _dl_profile_fixup.